262,009 results for "Computer vision"
Search Results
2. The Role of Computing in the Study of Latin American Cultural Heritage.
- Author
- Sipiran, Ivan
- Subjects
- DEEP learning, HISTORIC preservation, HISTORICAL archaeology, ARCHAEOLOGICAL research, CULTURAL property, COMPUTER vision
- Abstract
This article discusses the importance of computational technology in preserving and conserving Latin American cultural heritage. Deep Learning and data-driven methods have recently been used to assist in digitizing pre-Columbian objects and computer-vision methods could potentially be applied to the discovery of new geoglyphs in Peru. The authors, in collaboration with the Larco Museum in Lima, have begun to develop computational methods to support this and other important archaeological and conservation efforts.
- Published
- 2024
- Full Text
- View/download PDF
3. Integrating Network Clustering Analysis and Computational Methods to Understand Communication With and About Brands: Opportunities and Challenges.
- Author
- Himelboim, Itai, Maslowska, Ewa, and Araujo, Theo
- Subjects
- BRANDING (Marketing), CLUSTER analysis (Statistics), IMAGE analysis, COMPUTER vision, BRAND communities, SENTIMENT analysis, SOCIAL media
- Abstract
Brand-related content cocreated by consumers can play a crucial role in brand–consumer interactions and provide brands with valuable insights hidden in vast seas of unstructured data. We propose and evaluate a framework integrating a social network approach and scalable automated content analysis of texts and visuals for studying brand-related communication on social media. To illustrate the proposed approach, we use Twitter content related to two brands: Barclays and Sierra Club. By applying network clustering algorithms we identify different types of organically emerging communities around brands. Cluster-specific diffusion leaders are identified using their in-degree centrality values. To examine the unique characteristics of brand-related content within each cluster, we apply and assess the accuracy of popular off-the-shelf solutions for text and image analysis, also known as application programming interfaces (APIs). Of six sentiment analysis solutions, only one shows acceptable reliability levels. For computer vision APIs, we first identify labels that have unclear or imprecise meaning and calculate accuracy levels, resulting in acceptable accuracy levels for four of the five APIs. We discuss conceptual and practical implications of this integrative approach and of the technological hurdles that these popular automated content analysis applications pose. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
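As background for the record above: identifying each cluster's diffusion leader by in-degree centrality takes only a few lines of Python with networkx. This is an illustration, not the authors' code; the retweet edge list, the node names, and the use of greedy modularity as the clustering step are all assumptions.

```python
import networkx as nx

# Hypothetical "who retweeted whom" edges (retweeter -> retweeted account).
edges = [("alice", "brand"), ("bob", "brand"), ("carol", "alice"),
         ("dave", "alice"), ("erin", "bob")]
G = nx.DiGraph(edges)

# Stand-in for the paper's network clustering step: greedy modularity
# communities on the undirected projection of the graph.
communities = nx.community.greedy_modularity_communities(G.to_undirected())

# Cluster-specific diffusion leaders = highest in-degree centrality per cluster.
in_deg = nx.in_degree_centrality(G)
for i, community in enumerate(communities):
    leader = max(community, key=lambda n: in_deg[n])
    print(f"cluster {i}: leader = {leader} (in-degree centrality {in_deg[leader]:.2f})")
```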
4. Computer Vision, ML, and AI in the Study of Fine Art.
- Author
- Stork, David G.
- Subjects
- ARTS, PAINTING, ART history, COMPUTER vision, MACHINE learning, ARTIFICIAL intelligence, ARTIFICIAL neural networks, IMAGE analysis
- Abstract
This article reviews how computer vision, machine learning, and artificial intelligence (AI) have already contributed to the study of fine art, as well as the problems that remain open in analyzing art with these techniques. The article begins with the contributions, including the resolution of art-history debates. It then delves into the aspects of art that make AI research difficult, including the development of artistic works in multiple layers, small datasets, and the lack of a uniform definition of style in fine art.
- Published
- 2024
- Full Text
- View/download PDF
5. Bee Species Identification: Improving Population Monitoring Techniques
- Author
- Rink, Jennifer, Kavuluru, Rohit, Golich, Dannah, Moon, Patrick, Rajagopal, Navneet, Huang, Jiashu, Seltmann, Katja, Ostwald, Madeleine, and Baracaldo Lancheros, Laura
- Subjects
- classification, neural network, perspective correction, species identification, computer vision, wing morphology, Python, ArUco markers, preprocessing pipeline, VGG-16, web application, linear discriminant analysis, spectral embedding, landmarks, k-nearest neighbors, unsupervised learning, data science
- Abstract
This project aims to mitigate the critical decline in bee populations, essential for crop pollination and food security. With a shortage of taxonomic specialists to identify the vast array of bee species, the project's goal is to enhance the monitoring of population changes through an automated classification system. Utilizing a dataset of bee wing images, the project aims to develop a computational pipeline to identify species based on their unique wing vein patterns. This approach not only supports bee conservation efforts but also expands our understanding of complex geometric variations in nature, offering wider applications in biological research. This poster was presented at the UCSB Data Science Capstone showcase in 2024.
- Published
- 2024
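The record above lists ArUco markers and perspective correction among its methods. A minimal sketch of that preprocessing step, assuming OpenCV 4.7+ (for the `cv2.aruco.ArucoDetector` API), four markers with ids 0-3 at the corners of the imaging stage, and a hypothetical file name:

```python
import cv2
import numpy as np

img = cv2.imread("wing.jpg")  # hypothetical wing photo
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect the four stage-corner ArUco markers.
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
corners, ids, _ = detector.detectMarkers(gray)

# Take the first corner point of each marker, keyed by marker id (assumed
# 0..3 in the order top-left, top-right, bottom-right, bottom-left).
pts = {int(i): c[0][0] for i, c in zip(ids.flatten(), corners)}
src = np.float32([pts[0], pts[1], pts[2], pts[3]])
dst = np.float32([[0, 0], [800, 0], [800, 600], [0, 600]])

# Warp the image so the stage becomes a fronto-parallel 800x600 view.
H = cv2.getPerspectiveTransform(src, dst)
rectified = cv2.warpPerspective(img, H, (800, 600))
cv2.imwrite("wing_rectified.jpg", rectified)
```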
6. Deep match: A zero-shot framework for improved fiducial-free respiratory motion tracking
- Author
- Xu, Di, Descovich, Martina, Liu, Hengjie, Lao, Yi, Gottschalk, Alexander R, and Sheng, Ke
- Subjects
- Medical and Biological Physics, Biomedical and Clinical Sciences, Clinical Sciences, Physical Sciences, Oncology and Carcinogenesis, Cancer, Lung, Bioengineering, Lung Cancer, SBRT, CyberKnife, Xsight Lung Tracking, Template matching, Deep learning, Computer vision, Other Physical Sciences, Oncology & Carcinogenesis, Clinical sciences, Oncology and carcinogenesis, Medical and biological physics
- Abstract
Background and purpose: Motion management is essential to reduce normal tissue exposure and maintain adequate tumor dose in lung stereotactic body radiation therapy (SBRT). Lung SBRT using an articulated robotic arm allows dynamic tracking during radiation dose delivery. Two stereoscopic X-ray tracking modes are available: fiducial-based and fiducial-free tracking. Although X-ray detection of implanted fiducials is robust, the implantation procedure is invasive and inapplicable to some patients and tumor locations. Fiducial-free tracking relies on tumor contrast, which challenges the existing tracking algorithms for small tumor sizes (e.g., 15 mm) and certain tumor locations (with/without thoracic anatomy overlapping). Results: On X-ray views where conventional methods failed to track the lung tumor, Deep Match achieved robust performance, as evidenced by >80% 3 mm-Hit (detection within a 3 mm superior/inferior margin from ground truth) for 70% of patients […]
- Published
- 2024
7. Pattern Recognition for Curb Usage
- Author
- Arcak, Murat, PhD and Kurzhanskiy, Alexander A., PhD
- Subjects
- Curb side parking, computer vision, visual texture recognition, data collection, cameras, GPS, demonstration project
- Abstract
The increasing use of transportation network companies and delivery services has transformed the utilization of curb space, resulting in a lack of parking and contributing to congestion. No systematic method exists for identifying curb usage patterns, but emerging machine learning technologies and low-tech data sources, such as dashboard cameras mounted on vehicles that routinely travel the area, have the potential to monitor curb usage. To demonstrate how video data can be used to recognize usage patterns, we conducted a case study on Bancroft Way in Berkeley, CA. The project collected video footage with GPS data from a dashboard camera installed on a shuttle bus that circles the area. We trained a machine learning model to recognize different types of delivery vehicles in the data images, and then used the model to visualize curbside usage trends. The findings include identifying hot spots, analyzing arrival patterns by delivery vehicle type, detecting bus lane blockage, and assessing the impact of parking on traffic flow. The proof-of-concept study demonstrated that machine learning techniques, when coupled with affordable hardware like a dashboard camera, can reveal curb usage patterns. The data can be used to efficiently manage curb space, facilitate goods movement, improve traffic flow, and enhance safety.
- Published
- 2024
8. Use of laser technology for the postural classification of bedridden people.
- Author
- Canzobre, David S., Torrado, Pablo Pardiñas, Vigo, Javier Lamas, and Rego, Alberto Ramil
- Subjects
- BEDRIDDEN persons, POSTURE, COMPUTER vision, MEDICAL personnel, SKIN ulcers
- Abstract
This work presents an innovative method for the automated classification of postures of bedridden people using laser technology. With the aim of improving the quality of medical care and facilitating the continuous monitoring of patients in clinical and home environments, a system is proposed that uses laser sensors to capture the three-dimensional geometry of body postures. Through advanced computer vision processing techniques, a robust classification algorithm is developed that is capable of identifying and categorizing lateral decubitus postures, also known as the lateral safety position, which are commonly used to lay down patients who are permanently bedridden. Experimental results obtained so far show sufficient accuracy in posture classification, suggesting the potential of this technology to improve the monitoring and care of bedridden patients, while providing periodic warnings to medical staff when a patient has exceeded the recommended time in the same posture, to avoid the appearance of skin ulcers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Strategies to alleviate flickering: Bayesian and smoothing methods for deep learning classification in video
- Author
- Miller, Noah, Drumm, Glen Ryan, Champagne, Lance, Cox, Bruce, and Bihl, Trevor
- Published
- 2024
- Full Text
- View/download PDF
10. Designing an artificial intelligence-powered video assistant referee system for team sports using computer vision.
- Author
- Zhekambayeva, Maigul, Yerekesheva, Meruert, Ramashov, Nurmambek, Seidakhmetov, Yermek, and Kulambayev, Bakhytzhan
- Subjects
- SPORTS & technology, SPORTS officiating, COHEN'S kappa coefficient (Statistics), TEAM sports, COMPUTER vision
- Abstract
Copyright of Retos: Nuevas Perspectivas de Educación Física, Deporte y Recreación is the property of Federacion Espanola de Asociaciones de Docentes de Educacion Fisica and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
11. Image processing framework for in-process shaft diameter measurement on legacy manual machines.
- Author
- Choudhari, Sahil J., Singh, Swarit Anand, Kumar, Aitha Sudheer, and Desai, Kaushal A.
- Subjects
- OBJECT recognition (Computer vision), COMPUTER vision, IMAGE processing, LATHES, MACHINING, DEEP learning
- Abstract
In-process dimension measurement is critical to achieving higher productivity and realizing smart manufacturing goals during machining operations. Vision-based systems have significant potential to serve in-process dimension measurement, reduce human intervention, and achieve manufacturing-inspection integration. This paper presents early research on developing a vision-based system for in-process dimension measurement of machined cylindrical components utilizing image-processing techniques. The challenges of in-process dimension measurement are addressed by combining a deep learning-based object detection model, You Only Look Once version 2 (YOLOv2), with image processing algorithms for object localization, segmentation, and spatial pixel estimation. An automated image pixel calibration approach is incorporated to improve algorithm robustness. The image acquisition hardware and the real-time image processing framework are integrated to demonstrate the working of the proposed system through a case study of in-process stepped shaft diameter measurement. Implementation on a manual lathe demonstrated the system's robustness: it eliminated the need for intermittent manual measurements, digitized in-process component dimensions, and improved machining productivity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
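The automated pixel-calibration idea in the record above reduces to a scale factor recovered from a feature of known size, after which the measured object's pixel extent converts to millimetres. A minimal sketch, with hypothetical bounding boxes standing in for the YOLOv2 detections:

```python
def mm_per_pixel(ref_box_px, ref_width_mm):
    """Scale factor from a reference feature of known physical width."""
    x1, _, x2, _ = ref_box_px
    return ref_width_mm / (x2 - x1)

def shaft_diameter_mm(shaft_box_px, scale):
    """In a side view of a turned shaft, the box height is the diameter."""
    _, y1, _, y2 = shaft_box_px
    return (y2 - y1) * scale

# Boxes are (x1, y1, x2, y2) in pixels; all values are illustrative only.
scale = mm_per_pixel((50, 80, 250, 120), ref_width_mm=40.0)
print(f"diameter ~ {shaft_diameter_mm((300, 90, 700, 290), scale):.2f} mm")
```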
12. A prototype smartphone jaw tracking application to quantitatively model tooth contact.
- Author
- Armstrong, Kieran, Kincade, Carolyn, Osswald, Martin, Rieger, Jana, and Aalto, Daniel
- Subjects
- JAWS, PROTOTYPES, TEETH, COMPUTER vision, DENTAL occlusion
- Abstract
This study utilised a prototype system consisting of a person-specific 3D-printed jaw tracking harness, interfacing with the maxillary and mandibular teeth, and custom jaw tracking software implemented on a smartphone. The prototype achieved acceptable results, demonstrating a static position accuracy of less than 1 mm and 5°. It successfully tracked 30 cycles of a protrusive excursion, a left lateral excursion, and 40 mm of jaw opening on a semi-adjustable articulator. The standard error of the tracking accuracy was 0.1377 mm, 0.0449 mm, and 0.9196 mm, with corresponding $r^2$ values of 0.98, 1.00, and 1.00, respectively. Finally, occlusal contacts of left, right, and protrusive excursions were tracked with the prototype system, and their trajectories were used to demonstrate kinematic modelling (no occlusal forces) with a biomechanical simulation tool. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Tool Wear Classification Based on Support Vector Machine and Deep Learning Models.
- Author
- Yung-Hsiang Hung, Mei-Ling Huang, Wen-Pai Wang, and Hsiao-Dan Hsieh
- Subjects
- MACHINE learning, SUPPORT vector machines, CONVOLUTIONAL neural networks, DEEP learning, COMPUTER vision, MICROSCOPES, IMAGE recognition (Computer vision)
- Abstract
Tool status is crucial for maintaining workpiece quality during machine processing. Tool wear, an inevitable occurrence, can degrade the workpiece surface and even cause damage if it becomes severe. In extreme cases, it can also shorten the machine tool's service life. Therefore, accurately assessing tool wear to avoid unnecessary production costs is essential. We present a wear classification model using machine vision to analyze tool images. The model categorizes wear images on the basis of predefined wear levels to assess tool life. The research involves capturing images of the tool from three angles using a digital microscope, followed by image preprocessing. Wear measurement is performed using three methods: gray-scale value, gray-level co-occurrence matrix, and area detection. The K-means clustering technique is then applied to group the wear data from these images, and the final wear classification is determined by analyzing the results of the three methods. Additionally, we compare the recognition accuracies of two models: support vector machine (SVM) and convolutional neural network (CNN). The experimental results indicate that, within the same tool image sample space, the CNN model achieves an accuracy of more than 93% in all three directions, whereas the accuracy of the SVM model, affected by the number of samples, reaches a maximum of only 89.8%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
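Two of the wear measures named in the record above, mean grey value and a grey-level co-occurrence matrix (GLCM) statistic, followed by K-means grouping, can be sketched with scikit-image and scikit-learn. The file names, the choice of GLCM contrast as the texture property, and the three-cluster setting are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
from skimage.io import imread
from skimage.feature import graycomatrix, graycoprops
from sklearn.cluster import KMeans

def wear_features(path):
    img8 = (imread(path, as_gray=True) * 255).astype(np.uint8)
    glcm = graycomatrix(img8, distances=[1], angles=[0], levels=256, normed=True)
    # Feature vector: mean grey value + GLCM contrast.
    return [img8.mean(), graycoprops(glcm, "contrast")[0, 0]]

paths = ["tool_01.png", "tool_02.png", "tool_03.png"]  # hypothetical tool images
X = np.array([wear_features(p) for p in paths])
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)  # e.g. light/medium/heavy
print(labels)
```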
14. Study on visual localization and evaluation of automatic freshwater fish cutting system based on deep learning framework.
- Author
- Peng, Xianhui, Chen, Yan, Fu, Dandan, Jiang, Yajun, and Hu, Zhigang
- Subjects
- ARTIFICIAL neural networks, COMPUTER vision, CASCADE connections, FRESHWATER fishes, SIZE of fishes
- Abstract
Pre-treatment processing technology plays a crucial role in the overall freshwater fish processing procedure, and automatic head and tail cutting stands out as a significant pre-treatment technique within the industry. The system for removing the head and tail of freshwater fish comprised a Cartesian coordinate manipulator, a fish transfer device, a control system, and an image acquisition device. In the vision system, five image segmentation methods were utilized for fish head and tail image segmentation comparison tests: U-Net (U-shaped Deep Neural Network), DeeplabV3, PSPNet (Pyramid Scene Parsing Network), FastSCNN (Fast Semantic Segmentation Network), and ICNet (Image Cascade Network), all of which were evaluated for their performance. Among the tested segmentation methods, ICNet demonstrated the best segmentation capability. The experimental results indicated a segmentation accuracy of 99.01%, a mean intersection over union (MIoU) of 82.50%, and an image processing time of 15.25 ms. The results showed that, using this model for recognition, the fish head and tail were successfully cut off with a circular knife. Consequently, the segmentation model employed in the machine vision system in this study has demonstrated successful applicability in automatically cutting the heads and tails of freshwater fish of various sizes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
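For reference, the mean intersection over union (MIoU) figure reported in the record above is computed per class and averaged; a plain NumPy version for integer-labelled masks, with toy inputs:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:              # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 1], [1, 2]])    # toy 2x2 predicted mask
target = np.array([[0, 1], [2, 2]])  # toy 2x2 ground-truth mask
print(f"MIoU = {mean_iou(pred, target, num_classes=3):.3f}")  # 0.667
```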
15. Generalized Out-of-Distribution Detection: A Survey.
- Author
- Yang, Jingkang, Zhou, Kaiyang, Li, Yixuan, and Liu, Ziwei
- Subjects
- COMPUTER vision, OUTLIER detection, SCIENTIFIC community, MACHINE learning, TRUST
- Abstract
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over control to humans when it detects unusual scenes or objects that it has never seen during training time and cannot make a safe decision about. The term OOD detection first emerged in 2017 and has since received increasing attention from the research community, leading to a plethora of methods, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD), are closely related to OOD detection in terms of motivation and methodology. Despite common goals, these topics have developed in isolation, and their subtle differences in definition and problem setting often confuse readers and practitioners. In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. Despite comprehensive surveys of related fields, the summarization of OOD detection methods remains incomplete and requires further advancement. This paper specifically addresses the gap in recent technical developments in the field of OOD detection. It also provides a comprehensive discussion of representative methods from other sub-tasks and how they relate to and inspire the development of OOD detection methods. The survey concludes by identifying open challenges and potential research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
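Among the classification-based detectors this survey covers, the simplest baseline scores a sample by its maximum softmax probability (MSP) and flags low scores as OOD. A sketch with a placeholder model and an arbitrary threshold, both assumptions for illustration:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def msp_score(model, x):
    """Maximum softmax probability: higher = more in-distribution."""
    return F.softmax(model(x), dim=1).max(dim=1).values

# Placeholder classifier; any trained network would be used in practice.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.randn(4, 3, 32, 32)        # stand-in batch of images
is_ood = msp_score(model, x) < 0.5   # threshold is illustrative
print(is_ood)
```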
16. Inter-feature Relationship Certifies Robust Generalization of Adversarial Training.
- Author
- Zhang, Shufei, Qian, Zhuang, Huang, Kaizhu, Wang, Qiu-Feng, Gu, Bin, Xiong, Huan, and Yi, Xinping
- Subjects
- COMPUTER vision, MACHINE learning, COMPUTER simulation, GENERALIZATION, WISDOM
- Abstract
Whilst adversarial training has been shown to be a promising way to promote model robustness in computer vision and machine learning, adversarially trained models often suffer from poor robust generalization on unseen adversarial examples: there remains a big gap between performance on training and test adversarial examples. In this paper, we propose to tackle this issue from the new perspective of the inter-feature relationship. Specifically, we aim to generate adversarial examples that maximize the loss function while maintaining the inter-feature relationship of natural data, as well as penalizing the correlation distance between natural features and their adversarial counterparts. As a key contribution, we prove theoretically that training with such examples while penalizing the distance between correlations helps promote generalization on both natural and adversarial examples. We empirically validate our method through extensive experiments over different vision datasets (CIFAR-10, CIFAR-100, and SVHN), against several competitive methods. Our method substantially outperforms baseline adversarial training by a large margin, especially for PGD20 on CIFAR-10, CIFAR-100, and SVHN, with around 20%, 15%, and 29% improvements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
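For context on the PGD20 numbers above: PGD-k is the standard projected-gradient-descent attack run for k steps. Below is a generic PyTorch sketch of that attack itself, not the authors' inter-feature objective; the epsilon/step-size defaults are the usual CIFAR-style budget, assumed here for illustration.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=20):
    """L-infinity PGD: ascend the loss, projecting back into the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()       # gradient ascent step
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
        x_adv = x_adv.clamp(0, 1)                 # keep a valid image
    return x_adv.detach()
```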
17. Proposal of simultaneous localization and mapping for mobile robots indoor environments using Petri nets and computer vision.
- Author
- Mota, Francisco A. X., Batista, Josias G., and Alexandria, Auzuir R.
- Subjects
- PETRI nets, ROBOT vision, COMPUTER vision, ROBOTICS, SUPPLY & demand, MOBILE robots
- Abstract
Studies in the area of mobile robotics have advanced in recent years, mainly due to the evolution of technology and the growing need for automated and dynamic solutions in sectors such as industry, transport, and agriculture. These devices are complex, and the ideal method for localizing, mapping, and navigating autonomous mobile robots changes depending on the application. Thus, the general objective of this work is to propose a simultaneous localization and mapping (SLAM) method for autonomous mobile robots in indoor environments, using computer vision (CV) and Petri nets (PN). A landmark was placed next to each door in the analyzed region, and images were acquired as the rooms in the environment were explored. The algorithm processes the images to count and identify the doors. A transition is created in the PN for each door found, and the rooms connected by these doors are represented by the places of the PN. Then one of the doors is crossed, new images are obtained, and the process is repeated until all rooms are explored. The algorithm generates a PN, which can be represented by an image file (.png) and a file with the .pnml extension. The results compare the layouts of four environments with the respective generated PNs. Furthermore, six evaluation criteria are proposed for validating Petri nets as topological maps of environments. It is concluded that using PNs for this purpose offers originality and potential innovation as a SLAM technique for indoor environments that demands low computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
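The mapping described in the record above amounts to: rooms become Petri-net places, detected doors become transitions connecting them. A toy stdlib-only sketch with an invented floor plan (a real implementation would emit .pnml, which is not reproduced here):

```python
# Doors discovered while exploring; each connects two rooms (invented data).
doors_found = [("corridor", "room_A"), ("corridor", "room_B"),
               ("room_A", "room_B")]

# Places = rooms; transitions = doors between them.
places = sorted({room for door in doors_found for room in door})
transitions = {f"door_{i}": door for i, door in enumerate(doors_found)}

print("places (rooms):", places)
for name, (a, b) in transitions.items():
    print(f"transition {name}: {a} <-> {b}")
```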
18. Machine vision system for automatic defect detection of ultrasound probes.
- Author
- Profili, Andrea, Magherini, Roberto, Servi, Michaela, Spezia, Fabrizio, Gemmiti, Daniele, and Volpe, Yary
- Subjects
- COMPUTER vision, ARTIFICIAL neural networks, IMAGE processing, ARTIFICIAL intelligence, VIRTUAL prototypes
- Abstract
Industry 4.0 conceptualizes the automation of processes through the introduction of technologies such as artificial intelligence and advanced robotics, resulting in significant production improvements. Detecting defects in the production process, predicting mechanical malfunctions in the assembly line, and identifying defects in the final product are just a few examples of applications of these technologies. In this context, this work focuses on the detection of surface defects on ultrasound probes, with a focus on Esaote S.p.A.'s production-line probes. To date, this control has been performed manually and is therefore biased by many factors, such as surface morphology, color, size of the defect, and lighting conditions (which can cause reflections that prevent detection). To overcome these shortfalls, this work proposes a fully automatic machine vision system for surface acquisition of ultrasound probes, coupled with an automated defect detection system that leverages artificial intelligence. The paper addresses two crucial steps: (i) the development of the acquisition system (i.e., selection of the acquisition device, analysis of the illumination system, and design of the camera handling system); (ii) the analysis of neural network models for defect detection and classification, comparing three possible solutions (MMSD-Net, ResNet, EfficientNet). The results suggest that the developed system has the potential to be used as a defect detection tool in the production line (a full image acquisition cycle takes ~200 s), with the best detection accuracy of 98.63% and a classification accuracy of 81.90% obtained with the EfficientNet model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. A Hybrid Convolutional and Graph Neural Network for Human Action Detection in Static Images.
- Author
- Lu, Xinbiao and Xing, Hao
- Subjects
- CONVOLUTIONAL neural networks, GRAPH neural networks, HUMAN behavior, COMPUTER vision, CLASS actions
- Abstract
Human action detection in static images is a hot and challenging field within computer vision. Given the limited features of a single image, achieving precise detection results requires full utilization of the image's intrinsic features, as well as the integration of methods from other fields to generate additional features. In this paper, we propose a novel dual-pathway model for action detection. The main pathway employs a convolutional neural network to extract image features and predict the probability of the image belonging to each action. Meanwhile, the auxiliary pathway uses a pose estimation algorithm to obtain human key points and connection information, constructing a graphical human model for each image. These graphical models are then transformed into graph data and input into a graph neural network for feature extraction and probability prediction. Finally, a fully connected neural network that we propose fuses the probability vectors generated by the two pathways, learning the weight of each action class in each vector to enable their fusion. Transfer learning is also used in our model to improve its training speed and detection accuracy. Experimental results on three challenging datasets, Stanford40, PPMI, and MPII, illustrate the superiority of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. A Texture Removal Method for Surface Defect Detection in Machining.
- Author
- Yu, Xiaofeng, Li, Zhengminqing, Li, Letian, and Sheng, Wei
- Subjects
- SURFACE defects, SURFACE texture, COMPUTER vision, SPECTRUM analysis, SURFACE cracks, PIXELS
- Abstract
Surface defect detection in mechanical processing mainly relies on manual inspection, which has certain issues, including strong dependence on operator experience, low efficiency, and difficulty with online detection. To address these issues with machining surface defects, a surface-texture elimination method based on improved frequency-domain filtering combined with morphological sub-pixel edge detection is put forward. First, the method ascertains whether textures exist in the image and determines their feature values using the gray-level co-occurrence matrix. The main energy direction of the textured surface in the frequency domain is then obtained by applying the Fourier transform to the machined surface. An elliptical narrow stopband is designed to reduce the energy in the band region corresponding to the surface texture and thus eliminate it. Finally, improved morphology and sub-pixel edge fusion extract the surface-defect image. Cracks and scratches have a detectable width of 0.01 mm, with a detection accuracy of 97.667% and a detection time of 0.02 s. The combination of machine vision and texture-removal technology thus achieves the detection of surface scratches and cracks in machining, providing a theoretical basis for defect detection in workpiece processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
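The frequency-domain step described above can be sketched with NumPy: transform the image, attenuate a symmetric elliptical stopband around the texture's dominant spectral direction, and invert. The band centre, radii, and attenuation below are illustrative values, not the paper's:

```python
import numpy as np

def elliptical_stopband(img, cx_off, cy_off, rx, ry, atten=0.05):
    """Damp an elliptical band (and its mirror) in the centred spectrum."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.ones((h, w))
    for sign in (+1, -1):  # the spectrum of a real image is symmetric about DC
        cx, cy = w // 2 + sign * cx_off, h // 2 + sign * cy_off
        inside = ((xx - cx) / rx) ** 2 + ((yy - cy) / ry) ** 2 <= 1.0
        mask[inside] = atten
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec * mask)))

surface = np.random.rand(256, 256)  # stand-in for a machined-surface image
detextured = elliptical_stopband(surface, cx_off=30, cy_off=0, rx=12, ry=6)
```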
21. COMPUTER VISION BASED EARLY FIRE-DETECTION AND FIREFIGHTING MOBILE ROBOTS ORIENTED FOR ONSITE CONSTRUCTION.
- Author
- Liulin KONG, Jichao LI, Shengyu GUO, Xiaojie ZHOU, and Di WU
- Subjects
- COMPUTER vision, FIRE detectors, MOBILE robots, BUILDING sites, OCCUPATIONAL mortality
- Abstract
Fires are among the most dangerous hazards and the leading cause of death on construction sites. This paper proposes a video-based firefighting mobile robot (FFMR), designed to patrol the desired territory and constantly watch for fire-related events while keeping the camera view free of occlusions. Once a fire is detected, the early-warning system instantly sends sound and light signals, and the FFMR moves to the right place to fight the fire source using the extinguisher. To improve the accuracy and speed of fire detection, an improved YOLOv3-Tiny model (named YOLOv3-Tiny-S) is proposed by optimizing its network structure, introducing a Spatial Pyramid Pooling (SPP) module, and refining the multi-scale anchor mechanism. The experiments show that the proposed YOLOv3-Tiny-S-based FFMR can detect a small fire target with relatively higher accuracy and faster speed under occlusions in outdoor environments. The proposed FFMR can be helpful to disaster management systems, avoiding huge ecological and economic losses, as well as saving many human lives. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Perspectives on benchmarking foundation models for network biology.
- Author
- Theodoris, Christina V.
- Subjects
- COMPUTER vision, NATURAL languages, PROGRAMMING languages, BIOLOGY, ADOPTION
- Abstract
Transfer learning has revolutionized fields including natural language understanding and computer vision by leveraging large‐scale general datasets to pretrain models with foundational knowledge that can then be transferred to improve predictions in a vast range of downstream tasks. More recently, there has been a growth in the adoption of transfer learning approaches in biological fields, where models have been pretrained on massive amounts of biological data and employed to make predictions in a broad range of biological applications. However, unlike in natural language where humans are best suited to evaluate models given a clear understanding of the ground truth, biology presents the unique challenge of being in a setting where there are a plethora of unknowns while at the same time needing to abide by real‐world physical constraints. This perspective provides a discussion of some key points we should consider as a field in designing benchmarks for foundation models in network biology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Modelling flame-to-fuel heat transfer by deep learning and fire images.
- Author
- Caiyi Xiong, Zilong Wang, and Xinyan Huang
- Subjects
- THERMOGRAPHY, HEAT transfer coefficient, HEAT transfer, LIQUID fuels, HEAT flux
- Abstract
In numerical fire simulations, the calculation of thermal feedback from the flame to the solid and liquid fuel surface plays a critical role, as it connects the fundamental gas-phase flame burning and condensed-phase fuel gasification. However, it is a computationally intensive task in CFD fire modelling methods because of the requirement of a high-resolution grid for calculating the interface heat transfer. This paper proposes a real-time prediction of the flame-to-fuel heat transfer using simulated flame images and a computer-vision deep learning method. Different methanol pool fires were selected to produce the image database for training the model. As the pool diameters increase from 20 to 40 cm, the dominant flame-to-fuel heat transfer shifts from convection to radiation. Results show that the proposed AI algorithm trained on flame images can predict both the convective and radiative heat flux distributions on the condensed fuel surface with a relative error below 20%, based on the input of real-time flame morphology that can be captured by a larger grid size. Regardless of growing or decaying fires or puffing flames induced by buoyancy, this method can further predict the non-uniform distribution of the heat transfer coefficient on the interface rather than using empirical correlations. This work demonstrates the use of AI and computer vision in accelerating numerical fire simulation, which helps simulate complex fire behaviours with simpler models and smaller computational costs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. A cyber-physical production system for autonomous part quality control in polymer additive manufacturing material extrusion process.
- Author
- Castillo, Miguel, Monroy, Roberto, and Ahmad, Rafiq
- Subjects
- COMPUTER vision, ARTIFICIAL intelligence, MACHINE learning, EXTRUSION process, CYBER physical systems
- Abstract
This paper introduces a successful implementation of a Cyber-Physical Production System (CPPS) for large-format 3D printing, employing the 5C framework and Internet of Things (IoT) technology. The CPPS focuses on achieving autonomous part quality control by monitoring three critical categories: the thermal behavior of the material during printing deposition, fault detection on the contours of parts being produced, and machine integrity based on component performance. This study reveals that current temperature data on 3D printers do not accurately reflect the physical part deposition temperature, deviating by an average offset of 30%. Real-time thermal readings demonstrate potential for accurate monitoring and control of the printing process. Tests validate the CPPS's efficacy in detecting faults in real time, significantly enhancing overall part quality with a detection accuracy of 99.7%. Integration of different cameras, image processing, and machine learning algorithms facilitates fault detection and self-awareness of printed parts, providing insights into the mechanical condition of the printer. The combination of machine learning and image processing reduces the need for continuous operator intervention, optimizing production processes and minimizing losses. In conclusion, the implemented CPPS offers a robust solution for achieving autonomous part quality control in large-format 3D printing, showcasing advancements in real-time monitoring, fault detection, and overall improvement in the additive manufacturing process for large-scale production implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Machine vision-based recognition of elastic abrasive tool wear and its influence on machining performance.
- Author
- Guo, Lei, Duan, Zhengcong, Guo, Wanjin, Ding, Kai, Lee, Chul-Hee, and Chan, Felix T. S.
- Subjects
- FRETTING corrosion, COMPUTER vision, MACHINE performance, IMAGE segmentation, ELECTRIC machines, ABRASIVE machining
- Abstract
This study presents a novel Hunter-Prey Optimization (HPO)-optimized Otsu algorithm for tool wear assessment and machining process quality control. The algorithm is explicitly tailored to address the challenges conventional image recognition methods face when identifying the unique wear patterns of elastic matrix abrasive tools. The proposed HPO-optimized Otsu algorithm was validated through machining experiments on silicon carbide workpieces, demonstrating superior performance in wear identification, image segmentation, and operational efficiency when compared to both the conventional 2-Dimensional (2D) Otsu algorithm and the Genetic Algorithm (GA)-optimized Otsu algorithm. Notably, the proposed algorithm reduced the average runtime by 36.99% and 28.39%, and decreased the mean squared error by 24.78% and 20.52%, compared to the 2D Otsu and GA-optimized Otsu algorithms, respectively. Additionally, this study investigates the influence of elastic tool wear on abrasive machining performance, offering valuable insights for assessing tool status and life expectancy, and predicting machining quality. The high level of automation, accuracy, and fast execution speed of the proposed algorithm makes it an attractive option for wear identification, with potential applications extending beyond the manufacturing industry to any sector that requires automated image analysis. Consequently, this study contributes to both the theoretical comprehension and practical application of tool wear assessment, providing significant benefits to industries striving for enhanced production efficiency and product quality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
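For reference, the classical Otsu criterion that the HPO-optimised variant above builds on picks the grey level maximising between-class variance. A plain NumPy version of the 1D form (the paper's 2D and HPO variants extend this), with a random stand-in image:

```python
import numpy as np

def otsu_threshold(img8):
    """Classical 1D Otsu: maximise between-class variance over thresholds."""
    p = np.bincount(img8.ravel(), minlength=256).astype(float)
    p /= p.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()   # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2    # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return best_t

img = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in image
print("Otsu threshold:", otsu_threshold(img))
```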
26. DeepCervix: enhancing cervical cancer detection through transfer learning with VGG-16 architecture.
- Author
- Joshi, Vaishali M., Dandavate, Prajkta P., Rashmi, R., Shinde, Gitanjali R., Thune, Neeta N., and Mirajkar, Riddhi
- Subjects
- MACHINE learning, CERVICAL cancer diagnosis, COMPUTER vision, CERVICAL cancer, COMPUTER-assisted image analysis (Medicine), DEEP learning
- Abstract
Cervical cancer remains a significant global health concern, emphasizing the urgent need for improved detection methods to ensure timely treatment. This research introduces a sophisticated methodology leveraging recent advances in medical imaging and deep learning algorithms to enhance the accuracy and efficiency of cervical cancer detection. The proposed approach comprises meticulous data preprocessing to ensure the integrity of input images, followed by the training of deep learning models including ResNet-50, AlexNet, and VGG-16, renowned for their performance in computer vision tasks. Evaluation metrics such as accuracy, precision, recall, and F1-score demonstrate the efficacy of the methodology, with an outstanding accuracy rate of 98% achieved. The model's proficiency in accurately distinguishing healthy cervical tissue from cancerous tissue is underscored by precision, recall, and F1-score values. The primary strength of this deep learning-based approach lies in its potential for early detection, promising significant impact on cervical cancer diagnosis and treatment outcomes. This methodology contributes to advancements in medical imaging techniques, facilitating improved outcomes in cervical cancer detection and treatment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
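The VGG-16 transfer-learning recipe named in the record above typically freezes the ImageNet-pretrained backbone and retrains a new classification head. A torchvision sketch with placeholder hyperparameters and a random stand-in batch; the paper's data and training details are not reproduced here:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False               # freeze the convolutional backbone

model.classifier[6] = nn.Linear(4096, 2)  # new head: healthy vs. cancerous
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on random stand-in data.
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```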
27. Fusion algorithms on identifying vacant parking spots using vision-based approach.
- Author
- Adi, Ginanjar Suwasono, Hertog Nugroho, Rahmatullah, Griffani Megiyanto, Fadhlan, Muhammad Yusuf, and Mutamaddin, Dinan
- Subjects
- OBJECT recognition (Computer vision), EUCLIDEAN algorithm, COMPUTER vision, PARK management, TRAFFIC congestion, PARKING facilities, INTELLIGENT transportation systems
- Abstract
In densely populated cities, parking space scarcity results in issues like traffic congestion and difficulty finding parking spots. Recent advancements in computer vision have introduced methods to address parking lot management challenges. The availability of public image datasets and rapid growth in deep learning technology have led to vision-based parking management studies, offering advantages over sensor-based systems in comprehensive area coverage, cost reduction, and additional functionalities. This study presents an innovative fusion algorithm that integrates object detection with occupancy state algorithms to accurately identify vacant parking spaces. The YOLOv7 framework is employed for vehicle instance segmentation, and three occupancy algorithms, Euclidean distance (ED), intersection over reference (IoR), and intersection over union (IoU), are compared to determine the occupancy state of observed areas. The proposed method is evaluated on the CNRPark-EXT dataset, and its performance is compared with state-of-the-art methods. As a result, the proposed approach demonstrates robustness under varying conditions. It outperforms existing methods in terms of system evaluation performance, achieving accuracies of 98.88%, 97.99%, and 90.04% for ED, IoR, and IoU, respectively. This fusion detection method enhances adaptability and addresses occlusions, emphasizing YOLOv7's advantages and accurate shape approximation for slot annotation. This study contributes valuable insights for effective parking management systems and has potential usage in the real-world implementation of intelligent transportation systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
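Reduced to axis-aligned boxes, the three occupancy tests compared above look like this; the boxes and the 0.5 cutoff are hypothetical, and the paper itself works from YOLOv7 instance masks rather than boxes:

```python
import math

def inter_area(a, b):
    w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h

def area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

def centre(b):
    return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)

slot, car = (0, 0, 100, 200), (10, 30, 95, 190)   # (x1, y1, x2, y2) in pixels
ed = math.dist(centre(slot), centre(car))          # Euclidean distance
ior = inter_area(slot, car) / area(slot)           # intersection over reference
iou = inter_area(slot, car) / (area(slot) + area(car) - inter_area(slot, car))
print(f"ED={ed:.1f}px  IoR={ior:.2f}  IoU={iou:.2f}  occupied={iou > 0.5}")
```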
28. Morphometric analysis of wild potato leaves.
- Author
- Diaz-Garcia, Gabriela, Lozoya-Saldaña, Hector, Bamberg, John, and Diaz-Garcia, Luis
- Abstract
To catalog and promote the conservation and use of crop wild relatives, comprehensive phenotypic and genotypic information must be available. Plant genotyping has the power to resolve the phylogenetic relationships between crop wild relatives, quantify genetic diversity, and identify marker-trait associations for expedited molecular breeding. However, access to cost-effective genotyping strategies is often limited in underutilized crops and crop wild relatives. Potato landraces and wild species, distributed throughout Central and South America, exhibit remarkable phenotypic diversity and are an invaluable source of resistance to pests and pathogens. Unfortunately, very limited information is available for these germplasm resources, particularly regarding phenotypic diversity and potential use as trait donors. In this work, more than 150 accessions corresponding to 12 species of wild and cultivated potatoes, collected from different sites across the American continent, were analyzed using computer vision and morphometric methods to evaluate leaf size and shape. In total, more than 1100 leaves and leaflets were processed and analyzed for nine traits related to size, shape, and color. The results produced in this study provided a visual depiction of the extensive variability among potato wild species and enabled a precise quantification of leaf phenotypic differences, including shape, color, area, perimeter, length, width, aspect ratio, convexity, and circularity. We also discussed the application and utility of inexpensive but comprehensive morphometric approaches to catalog and study the diversity of crop wild relatives. Finally, this study provided insights for further experimental research looking into the potential role of leaf size and shape variation in plant–insect interactions, agronomic productivity, and adaptation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
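Several of the leaf measurements listed above (area, perimeter, aspect ratio, convexity, circularity) fall out of a single OpenCV contour. A sketch assuming a hypothetical binarisable leaf scan as input:

```python
import cv2
import math

img = cv2.imread("leaf.png", cv2.IMREAD_GRAYSCALE)  # hypothetical leaf scan
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
leaf = max(contours, key=cv2.contourArea)  # largest blob = the leaf

area = cv2.contourArea(leaf)
perimeter = cv2.arcLength(leaf, True)      # closed contour
x, y, w, h = cv2.boundingRect(leaf)
hull_area = cv2.contourArea(cv2.convexHull(leaf))

print("aspect ratio:", w / h)
print("convexity:", area / hull_area)
print("circularity:", 4 * math.pi * area / perimeter ** 2)
```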
29. nHi-SEGA: n-Hierarchy SEmantic Guided Attention for few-shot learning.
- Author
- Yuan, Xinpan, Xie, Shaojun, Zeng, Zhigao, Li, Changyun, and Wang, Luda
- Subjects
- LEARNING, COMPUTER vision, COGNITION, PRIOR learning, SEMANTICS
- Abstract
Humans excel at learning and recognizing objects, swiftly adapting to new concepts with just a few samples. However, current studies in computer vision on few-shot learning have not yet achieved human performance in integrating prior knowledge during the learning process. Humans utilize a hierarchical structure of object categories based on past experiences to facilitate learning and classification. Therefore, we propose a method named n-Hierarchy SEmantic Guided Attention (nHi-SEGA) that acquires abstract superclasses. This allows the model to associate with and pay attention to different levels of objects utilizing semantics and visual features embedded in the class hierarchy (e.g., house finch-bird-animal, goldfish-fish-animal, rose-flower-plant), resembling human cognition. We constructed an nHi-Tree using WordNet and Glove tools and devised two methods to extract hierarchical semantic features, which were then fused with visual features to improve sample feature prototypes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Automation Inspection in Metal Fabrication: Enhancing Sheet Metal Forming Processes with Automation, Machine Vision and Image Correlation for Quality Assurance.
- Author
- Kumar, S. Pratheesh and Balasubramanian, N.
- Subjects
- METAL fabrication, COMPUTER vision, DIGITAL image correlation, METALWORK, SHEET metal
- Abstract
This review emphasizes the evolving need for automated inspection in metal fabrication processes due to the increasing complexity of design advancements over the years. The study explores various defect detection algorithms and evaluates their effectiveness in enhancing the accuracy and reliability of the inspection process. Machine vision plays a crucial role in this context, contributing significantly to the precision of the inspection process in metal fabrication. Its ability to handle complex tasks ensures a thorough assessment of manufactured components. The paper also explores the use of digital image correlation (DIC) as a key tool in quality assurance for metal fabricated products. This technique provides detailed insights, enabling a thorough understanding of structural integrity and defect identification. By integrating insights on automated inspection through defect detection algorithms, machine vision and DIC, this review aims to advance quality assurance methodologies in the ever-evolving field of metal fabrication. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Advanced facial recognition with LBP-URIGL hybrid descriptors.
- Author
- Hendi, Sajjad H., Taher, Hazeem B., and Hussein, Karim Q.
- Subjects
- ARTIFICIAL neural networks, FISHER discriminant analysis, HUMAN facial recognition software, COMPUTER vision, MACHINE learning
- Abstract
Facial recognition technology is transformative in security and human-machine interaction, reshaping societal interactions. Robust descriptors, essential for high precision in machine learning tasks like recognition and recall, are integral to this transformation. This paper presents a hybrid model enhancing local binary pattern descriptors for facial representation. By integrating rotation-invariant local binary pattern with uniform rotation-invariant grey-level co-occurrence, employing linear discriminant analysis for feature space optimization, and utilizing an artificial neural network for classification, the model achieves exceptional accuracy rates of 100% for Olivetti Research Laboratory, 99.98% for Maastricht University Computer Vision Test, and 99.17% for Extended Yale B, surpassing traditional methods significantly. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
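The rotation-invariant uniform LBP histogram at the core of the descriptor above is available in scikit-image, whose "uniform" method is the rotation-invariant uniform variant with P + 2 pattern bins. A sketch on a random stand-in face crop (the P/R settings are illustrative defaults, not the paper's):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray, P=8, R=1):
    """Normalised histogram of rotation-invariant uniform LBP codes."""
    lbp = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

face = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in face crop
print(lbp_histogram(face).round(3))
```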
32. Improving the transferability of adversarial examples with path tuning.
- Author
- Li, Tianyu, Li, Xiaoyu, Ke, Wuping, Tian, Xuwei, Zheng, Desheng, and Lu, Chao
- Subjects
- ARTIFICIAL neural networks, IRREGULAR sampling (Signal processing), COMPUTER vision, CYBERTERRORISM, ARTIFICIAL intelligence
- Abstract
Adversarial attacks pose a significant threat to real-world applications based on deep neural networks (DNNs), especially in security-critical applications. Research has shown that adversarial examples (AEs) generated on a surrogate model can also succeed on a target model, a property known as transferability. Feature-level transfer-based attacks improve the transferability of AEs by disrupting intermediate features. They target an intermediate layer of the model and use feature importance metrics to find these features. However, current methods overfit feature importance metrics to surrogate models, which results in poor sharing of the importance metrics across models and insufficient destruction of deep features. This work demonstrates the trade-off between feature importance metrics and feature-corruption generalization, and categorizes the feature-destruction causes of misclassification. It proposes a generative framework named PTNAA to guide the destruction of deep features across models, thus improving the transferability of AEs. Specifically, the method introduces path methods into integrated gradients. It selects path functions using only a priori knowledge and approximates neuron attribution using nonuniform sampling. In addition, it measures neurons based on the attribution results and performs feature-level attacks to remove inherent features of the image. Extensive experiments demonstrate the effectiveness of the proposed method. The code is available at https://github.com/lounwb/PTNAA. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Adaptive multimodal prompt for human-object interaction with local feature enhanced transformer.
- Author
- Xue, Kejun, Gao, Yongbin, Fang, Zhijun, Jiang, Xiaoyan, Yu, Wenjun, Chen, Mingxuan, and Wu, Chenmou
- Subjects
- TRANSFORMER models, COMPUTER vision, FEATURE extraction, LEARNING strategies, DATA distribution
- Abstract
Human-object interaction (HOI) detection is an important computer vision task for recognizing the interaction between humans and surrounding objects in an image or video. HOI datasets suffer from a serious long-tailed data distribution problem because it is challenging for a dataset to contain all potential interactions. Many HOI detectors have addressed this issue by utilizing visual-language models. However, due to the calculation mechanism of the Transformer, visual-language models are not good at extracting the local features of input samples. Therefore, we propose a novel local-feature-enhanced Transformer to motivate encoders to extract multi-modal features that contain more information. Moreover, it is worth noting that the application of prompt learning in HOI detection is still in its preliminary stages. Consequently, we propose a multi-modal adaptive prompt module, which uses an adaptive learning strategy to facilitate the interaction of language and visual prompts. On the HICO-DET and SWIG-HOI datasets, the proposed model achieves 24.21% mAP and 14.29% mAP on full interactions, respectively. Our code is available at https://github.com/small-code-cat/AMP-HOI. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. NN-VVC: A Hybrid Learned-Conventional Video Codec Targeting Humans and Machines.
- Author
- Ahonen, Jukka I., Le, Nam, Zhang, Honglei, Hallapuro, Antti, Cricri, Francesco, Tavakoli, Hamed Rezazadegan, Hannuksela, Miska M., and Rahtu, Esa
- Subjects
- COMPUTER vision, SOFTWARE compatibility, VIDEO codecs, ARTIFICIAL intelligence, HYBRID computers (Computer architecture), VIDEO coding
- Abstract
Advancements in artificial intelligence have significantly increased the use of images and videos in machine analysis algorithms, predominantly neural networks. However, the traditional methods of compressing, storing, and transmitting media have been optimized for human viewers rather than machines. Current research in coding images and videos for machine analysis has evolved along two distinct paths. The first is characterized by End-to-End (E2E) learned codecs, which show promising results in image coding but have yet to match the performance of leading Conventional Video Codecs (CVC) and suffer from a lack of interoperability. The second path optimizes CVC, such as the Versatile Video Coding (VVC) standard, for machine-oriented reconstruction. Although CVC-based approaches enjoy widespread hardware and software compatibility and interoperability, they often fall short in machine task performance, especially at lower bitrates. This paper proposes a novel hybrid codec for machines named NN-VVC, which combines the advantages of an E2E-learned image codec and a CVC to achieve high performance in both image and video coding for machines. Our experiments show that the proposed system achieved up to −43.20% and −26.8% Bjøntegaard Delta rate reduction over VVC for image and video data, respectively, when evaluated on multiple different datasets and machine vision tasks according to the common test conditions designed by the VCM study group in MPEG standardization activities. Furthermore, to improve reconstruction quality, we introduce a human-focused branch into our codec, enhancing the visual appeal of reconstructions intended for human supervision of the machine-oriented main branch. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Hetero-associative Memory Based New Iraqi License Plate Recognition.
- Author
- Hasan, Rusul Hussein, Aboud, Inaam Salman, Hassoon, Rasha Majid, and aldeen Aubaid Khioon, Ali saif
- Subjects
- BIDIRECTIONAL associative memories (Computer science), AUTOMOBILE license plates, INTELLIGENT transportation systems, COMPUTER vision, VISUAL fields, DIGITAL image processing, PATTERN recognition systems
- Abstract
Copyright of Baghdad Science Journal is the property of Republic of Iraq Ministry of Higher Education & Scientific Research (MOHESR) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
36. Automatically Guided Vehicles (AGV) in Agriculture.
- Author
- Zimmer, Domagoj, Šumanovac, Luka, Jurišić, Mladen, Čosić, Arian, and Lucić, Pavo
- Subjects
- ROBOTICS, COMPUTER vision, CLEAN energy, ANTENNAS (Electronics), WEB-based user interfaces, AUTOMOTIVE navigation systems
- Abstract
In this paper, new types of autonomous systems used in agriculture are analysed. The paper presents new self-guiding systems, such as AGVs with full autonomy in operation, and explains internal transport and autonomous-vehicle systems in outdoor agriculture. New autonomous systems used outdoors, such as special navigation systems and their purposes in agriculture, are presented in this work. Navigation systems with GPS signal and RTK technology, vehicle-guidance cameras, and AI machine vision for manipulation are described. Light and laser technologies for fully autonomous robotic systems, such as an in-vehicle LiDAR system for detecting the presence of pests and diseases, are presented. The paper emphasizes the advantages of using AGVs owing to their autonomy and clean power sources with no harmful impact on the environment. Navigation in indoor spaces using the LTE Direct protocol is explained, and a Wi-Fi ceiling antenna and a wireless app for the horizontal movement of AGVs are shown. Ways of using UAVs for warehouse inventory through web applications with an advanced AI-guided navigation system are also given. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Intelligent Crack Detection in Infrastructure Using Computer Vision at the Edge.
- Author
- Rizia, Mst. Mousumi, Reyes-Munoz, Julio A., Ortega, Angel G., Choudhuri, Ahsan, and Flores-Abad, Angel
- Abstract
To fulfil the demands of the industry in autonomous intelligent inspection, innovative frameworks that allow Convolutional Neural Networks to run at the edge in real time are required. This paper proposes an end-to-end approach and system to enable crack detection onboard a customised embedded system. To make deployment and execution on the edge possible, this work develops a dataset by combining new and existing images and introduces a quantization approach that includes inference optimization, memory reuse, and freezing layers. Real-time, onsite results from aerial and hand-held setup images of industrial environments show that the system is capable of identifying and localising cracks within the field of view of the camera with a mean average precision (mAP) of 98.44% and at ~2.5 frames per second with real-time inference. It is therefore evidenced that, despite using a full model, the introduced model customization improved the mAP by ~8% with respect to lighter state-of-the-art models, and the quantization technique led to a model inference two times faster. The proposed intelligent and autonomous approach advances common offline inspection techniques to enable on-site, artificial intelligence-based inspection systems, which also aid in reducing human errors and enhance safety conditions by automatically performing defect recognition in tight and difficult-to-reach spots. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
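For readers unfamiliar with the quantization step the preceding abstract mentions, the sketch below shows post-training dynamic quantization in PyTorch on a toy CNN. It is an illustrative assumption, not the authors' pipeline; the model name, architecture, and settings are hypothetical.

```python
# A minimal sketch (assumed, not the paper's code) of post-training dynamic
# quantization in PyTorch, the kind of compression step used for edge inference.
import torch
import torch.nn as nn

class TinyCrackNet(nn.Module):
    """Toy stand-in for a crack-detection backbone (hypothetical)."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = TinyCrackNet().eval()

# Dynamic quantization stores the Linear layers' weights as int8;
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    print(quantized(x).shape)  # torch.Size([1, 2])
```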
38. Enhanced blur‐robust monocular depth estimation via self‐supervised learning.
- Author
-
Sung, Chi‐Hun, Kim, Seong‐Yeol, Shin, Ho‐Ju, Lee, Se‐Ho, and Kim, Seung‐Wook
- Abstract
This letter presents a novel self‐supervised learning strategy to improve the robustness of a monocular depth estimation (MDE) network against motion blur. Motion blur, a common problem in real‐world applications like autonomous driving and scene reconstruction, often hinders accurate depth perception. Conventional MDE methods are effective under controlled conditions but struggle to generalise their performance to blurred images. To address this problem, we generate blur‐synthesised data to train a robust MDE model without the need for preprocessing, such as deblurring. By incorporating self‐distillation techniques and using blur‐synthesised data, the depth estimation accuracy for blurred images is significantly enhanced without additional computational or memory overhead. Extensive experimental results demonstrate the effectiveness of the proposed method, enhancing existing MDE models to accurately estimate depth information across various blur conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
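The abstract above trains on blur-synthesised data. One simple way to synthesise such data is to convolve sharp frames with a linear motion-blur kernel; the OpenCV sketch below illustrates this under stated assumptions (file paths and kernel parameters are hypothetical, and this is not necessarily the authors' synthesis method).

```python
# A minimal sketch of generating blur-synthesised training images by
# convolving a sharp frame with a linear motion-blur kernel.
import cv2
import numpy as np

def motion_blur(image: np.ndarray, ksize: int = 15, angle_deg: float = 0.0) -> np.ndarray:
    """Apply a linear motion-blur kernel of length `ksize` at `angle_deg`."""
    kernel = np.zeros((ksize, ksize), dtype=np.float32)
    kernel[ksize // 2, :] = 1.0  # a horizontal line of ones
    rot = cv2.getRotationMatrix2D((ksize / 2 - 0.5, ksize / 2 - 0.5), angle_deg, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (ksize, ksize))
    kernel /= kernel.sum()  # normalise so overall brightness is preserved
    return cv2.filter2D(image, -1, kernel)

sharp = cv2.imread("frame.png")          # hypothetical input frame
blurred = motion_blur(sharp, ksize=21, angle_deg=30.0)
cv2.imwrite("frame_blurred.png", blurred)
```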
39. Unified diffusion‐based object detection in multi‐modal and low‐light remote sensing images.
- Author
-
Sun, Xu, Yu, Yinhui, and Cheng, Qing
- Abstract
Remote sensing object detection remains a challenge under complex conditions such as low light, adverse weather, and modality attacks or losses. Previous approaches typically alleviate this problem by enhancing visible images or leveraging multi-modal fusion technologies. In view of this, the authors propose a unified framework based on YOLO-World that combines the advantages of both schemes, achieving more adaptable and robust remote sensing object detection in complex real-world scenarios. This framework introduces a unified modality modelling strategy, allowing the model to learn abundant object features from multiple remote sensing datasets. Additionally, a U-fusion neck based on the diffusion method is designed to effectively remove modality-specific noise and generate missing complementary features. Extensive experiments were conducted on four remote sensing image datasets: the multimodal VEDAI and DroneVehicle, and the unimodal VisDrone and UAVDT. This approach achieves average precision scores of 50.5%, 55.3%, 25.1%, and 20.7%, respectively, outperforming advanced multimodal remote sensing object detection methods and low-light image enhancement techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Deep Learning-Based Student Engagement Classification in Online Learning.
- Author
-
Mandia, Sandeep, Singh, Kuldeep, and Mitharwal, Rajendra
- Abstract
Online education has gained significant popularity during the COVID-19 pandemic. The availability of massive open online courses has further strengthened the online learning environment. One of the considerable challenges in online learning environments is measuring the learner's engagement to meet educational objectives. This paper addresses this challenge and proposes a convolutional neural network-based architecture that extracts discriminative features from the face (affective) and the upper body (behavioral) of the learner and classifies engagement. The performance of the proposed architecture is evaluated on the publicly available Dataset of Affective States In E-Environments (DAiSEE) and a learning-centered affective state dataset curated from open-source datasets. The experimental results demonstrate that combining affective and behavioral features improves engagement measurement. The proposed method outperforms the state of the art in terms of Unweighted Average Recall (UAR), Unweighted Average Precision (UAP), and Unweighted Average F1 (UAF1), with values of 79.14%, 49.62%, and 55.29%, respectively, while being three times more computationally efficient than the previous state-of-the-art method. It also improves accuracy on the curated dataset by 2.07% over the prior method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
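The unweighted metrics reported in the abstract above are macro averages over classes. The sketch below shows how UAR, UAP, and UAF1 can be computed with scikit-learn on toy labels; the data are invented for illustration.

```python
# A minimal sketch of the unweighted (macro-averaged) metrics the abstract
# reports -- UAR, UAP and UAF1 -- computed with scikit-learn on toy labels.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 2, 2, 3, 3]   # toy engagement levels (4 classes)
y_pred = [0, 1, 1, 1, 2, 0, 3, 2]

uar  = recall_score(y_true, y_pred, average="macro")     # Unweighted Average Recall
uap  = precision_score(y_true, y_pred, average="macro")  # Unweighted Average Precision
uaf1 = f1_score(y_true, y_pred, average="macro")         # Unweighted Average F1

print(f"UAR={uar:.3f}  UAP={uap:.3f}  UAF1={uaf1:.3f}")
```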
41. Visual political communication on Instagram: a comparative study of Brazilian presidential elections.
- Author
-
de-Lima-Santos, Mathias-Felipe, Gonçalves, Isabella, Quiles, Marcos G., Mesquita, Lucia, Ceron, Wilson, and Couto Lorena, Maria Clara
- Abstract
In today's digital age, images have become powerful tools for politicians to engage with their voters on social media platforms. Visual content possesses a unique emotional appeal that often leads to increased user engagement. However, research on visual communication remains relatively limited, particularly in the Global South. This study aims to bridge this gap by combining computational methods with a qualitative approach to investigate the visual communication strategies employed in a dataset of 11,263 Instagram posts by 19 Brazilian presidential candidates in the 2018 and 2022 national elections. Across two studies, we observed consistent patterns in these candidates' use of visual political communication. Notably, we identify a prevalence of celebratory and positively toned images, which also exhibit a strong sense of personalization, portraying candidates as connected with their voters on a more emotional level. Our research also uncovers contextual nuances specific to the Brazilian political landscape: we note a substantial presence of screenshots from news websites and other social media platforms, and text-edited images with portrayals emerge as a prominent feature. In light of these results, we discuss the implications for the broader field of visual political communication. This article contributes by showing how Instagram was used in the digital political strategy of two fiercely polarized Brazilian elections, shedding light on the ever-evolving dynamics of visual political communication in the digital age. Finally, we propose avenues for future research in the field of political communication. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. The role of artificial intelligence (AI) and Chatgpt in water resources, including its potential benefits and associated challenges.
- Author
-
Haider, Saif, Rashid, Muhammad, Tariq, Muhammad Atiq Ur Rehman, and Nadeem, Abdullah
- Abstract
Artificial Intelligence (AI), including models like ChatGPT, is transforming water resources management by improving hydrological modeling, water quality assessment, and flood prediction. AI techniques such as Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs) have enhanced streamflow predictions and groundwater management, particularly in data-scarce regions. AI-powered systems like Smart Microclimate Control Systems (SMCS) optimize agricultural practices, leading to better resource conservation and higher crop yields. However, the scalability and applicability of AI across diverse environments pose challenges, especially where data is limited. The success of AI models depends on data quality, requiring ongoing interdisciplinary research to refine these technologies for real-world use. Additionally, tools like ChatGPT, while valuable for knowledge dissemination and data analysis, raise concerns about accuracy in critical decision-making contexts. In conclusion, while AI offers significant potential for improving water resources management, addressing challenges related to data quality, model scalability, and interdisciplinary collaboration is essential for achieving sustainable and effective outcomes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Automatic Merging Method for Sectional Map Based on Deep Learning.
- Author
-
Shifan Liu, Chen Xing, Chengwei Dong, Yunhan Li, and Peirun Cao
- Subjects
DEEP learning ,GRIDS (Cartography) ,MAP design ,IMAGE registration ,NATURAL resources ,COMPUTER vision - Abstract
Owing to time and scene constraints, a significant number of sectional maps exist in paper form. These maps contain a vast amount of data and hold high information value. However, they often suffer from issues such as annotations, stains, deformation, and missing content during preservation. Traditional processing methods require a large amount of manual image registration, which is extremely inconvenient. In this study, a map image labeling program is designed using OpenCV to prepare a map image dataset, and the U2Net-p algorithm for map segmentation is trained on this dataset. Furthermore, a comprehensive method for automatically merging sectional maps is designed and implemented, which can repair and process sectional maps and seamlessly integrate them into target grids according to map sheet numbering rules. This method has been applied to the production of base maps for natural resource demarcation projects, achieving a stitching accuracy of 96.67% on marked anchor points and considerably improving processing speed. This indicates that our approach has broad application value in the field of automatic stitching and fusion of sectional map images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
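The paper above registers and merges scanned map sheets. Its own pipeline is built on a U2Net-p segmentation model, but the generic registration step can be illustrated with feature matching and a RANSAC homography in OpenCV; the sketch below is that generic step only, with hypothetical file paths.

```python
# A minimal sketch (not the paper's U2Net-based method) of registering two
# scanned map sheets with ORB features and a homography before merging.
import cv2
import numpy as np

a = cv2.imread("sheet_a.png", cv2.IMREAD_GRAYSCALE)  # hypothetical scans
b = cv2.imread("sheet_b.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
ka, da = orb.detectAndCompute(a, None)
kb, db = orb.detectAndCompute(b, None)

# Match descriptors and keep the strongest correspondences.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(da, db), key=lambda m: m.distance)[:200]

src = np.float32([ka[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kb[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC rejects mismatched anchor points before estimating the homography.
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
warped = cv2.warpPerspective(a, H, (b.shape[1], b.shape[0]))
```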
44. SATEER: Subject-Aware Transformer for EEG-Based Emotion Recognition.
- Author
-
Lanzino, Romeo, Avola, Danilo, Fontana, Federico, Cinque, Luigi, Scarcello, Francesco, and Foresti, Gian Luca
- Subjects
- *
EMOTION recognition , *COMPUTER vision , *DEEP learning , *TRANSFORMER models , *PERSONALITY - Abstract
This study presents a Subject-Aware Transformer-based neural network designed for the Electroencephalogram (EEG) Emotion Recognition task (SATEER), which entails the analysis of EEG signals to classify and interpret human emotional states. SATEER processes the EEG waveforms by transforming them into Mel spectrograms, which can be seen as particular cases of images whose number of channels equals the number of electrodes used during the recording process; this type of data can thus be processed using a Computer Vision pipeline. Distinct from preceding approaches, the model addresses the variability in individual responses to identical stimuli by incorporating a User Embedder module. This module enables the association of individual profiles with their EEGs, thereby enhancing classification accuracy. The efficacy of the model was rigorously evaluated using four publicly available datasets, demonstrating superior performance over existing methods in all conducted benchmarks. For instance, on the AMIGOS dataset (A dataset for Multimodal research of affect, personality traits, and mood on Individuals and GrOupS), SATEER's accuracy exceeds 99.8% across all labels and shows an improvement of 0.47% over the state of the art. Furthermore, an exhaustive ablation study underscores the pivotal role of the User Embedder module and of every other component of the presented model in achieving these advancements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
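The first step the abstract above describes is turning a multi-channel EEG recording into per-electrode Mel spectrograms. The sketch below shows one way to do this with librosa; the sampling rate, shapes, and STFT settings are illustrative assumptions, not the paper's configuration.

```python
# A minimal sketch: one Mel spectrogram per electrode, stacked into an
# image-like tensor with as many channels as electrodes.
import numpy as np
import librosa

fs = 128                                   # assumed EEG sampling rate (Hz)
eeg = np.random.randn(32, fs * 10)         # 32 electrodes, 10 s of toy signal

mels = np.stack([
    librosa.feature.melspectrogram(y=ch.astype(np.float32), sr=fs,
                                   n_fft=256, hop_length=64, n_mels=32)
    for ch in eeg
])
print(mels.shape)  # (32, 32, time_frames) -> ready for a CV pipeline
```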
45. Smartphone‐based high durable strain sensor with sub‐pixel‐level accuracy and adjustable camera position.
- Author
-
Wu, Pengfei, Lu, Bo, Li, Huan, Li, Weijie, and Zhao, Xuefeng
- Subjects
- *
IMAGE sensors , *STRAIN sensors , *MEASUREMENT errors , *COMPUTER vision , *DEFORMATION of surfaces - Abstract
Computer vision strain sensors typically require the camera position to be fixed, limiting measurements to surface deformations of structures at pixel-level resolution. Moreover, such sensors have a service life significantly shorter than the designed service life of the structures they monitor. This paper presents research on a highly durable computer vision sensor, microimage strain sensing (MISS)-Silica, which uses a smartphone connected to an endoscope for measurement. It is designed with a range of 0.05 ε, enabling full-stage strain measurement from loading to failure of structures. The sensor does not require the camera to be fixed during measurements, laying the theoretical foundation for embedded computer vision sensors. Measurement accuracy is improved from pixel level to sub-pixel level, with pixel-based measurement errors of around 8 µε (standard deviation approximately 7 µε) and sub-pixel calculation errors of around 6 µε (standard deviation approximately 5 µε). Sub-pixel calculation thus yields an enhancement of approximately 30% in measurement accuracy and stability. MISS-Silica features easy data acquisition, high precision, and a long service life, offering a promising method for long-term measurement of both surface and internal structures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
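Sub-pixel accuracy, as claimed in the abstract above, is usually obtained by refining integer-pixel detections within a small neighbourhood. The OpenCV sketch below shows the generic refinement step with cornerSubPix; it is illustrative only, not the MISS-Silica algorithm, and the image path is hypothetical.

```python
# A minimal sketch of sub-pixel target localisation, the generic step
# behind sub-pixel-accurate vision measurement.
import cv2
import numpy as np

gray = cv2.imread("target.png", cv2.IMREAD_GRAYSCALE)  # hypothetical image

# Detect coarse (integer-pixel) corners first...
corners = cv2.goodFeaturesToTrack(gray, maxCorners=10, qualityLevel=0.01,
                                  minDistance=10)

# ...then refine each one to sub-pixel precision in a small search window.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 40, 0.001)
refined = cv2.cornerSubPix(gray, np.float32(corners), (5, 5), (-1, -1), criteria)

# Strain follows from the change in distance between two tracked points.
p, q = refined[0, 0], refined[1, 0]
print("distance (px):", float(np.hypot(*(p - q))))
```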
46. Intelligent Computing for Crop Monitoring in CIoT: Leveraging AI and Big Data Technologies.
- Author
-
Ahmed, Imran, Ahmad, Misbah, Ghazouani, Haythem, Barhoumi, Walid, and Jeon, Gwanggil
- Subjects
- *
ARTIFICIAL intelligence , *DATA privacy , *COMPUTER vision , *AGRICULTURAL resources , *CROP management , *DEEP learning - Abstract
ABSTRACT Consumer Internet of Things (CIoT) has revolutionised agriculture by integrating intelligent computing, artificial intelligence and big data technologies in crop monitoring. This paper explores the application of intelligent computing and deep learning methodologies in crop monitoring within the CIoT framework. In CIoT-based crop monitoring, a vision sensor collects real-time data in the form of crop leaf images. The image dataset is processed using state-of-the-art deep learning models and intelligent computing algorithms. This integration enables the early detection of crop diseases by leveraging computer vision and deep learning. Intelligent computing systems provide accurate disease classification, real-time alerts, and actionable recommendations for optimised crop management practices. This advanced system empowers farmers to make data-driven decisions, such as irrigation optimisation, targeted pesticide application and nutrient supplementation, to maximise crop productivity and minimise losses. A benchmark dataset of leaf images is used, and a deep learning-based model is presented for classifying healthy and diseased leaves. Experimental results demonstrate an accuracy rate of 0.98, with detailed validation, including dataset size and model parameters. Key benefits of intelligent computing in CIoT-based crop monitoring include enhanced resource efficiency, reduced environmental impact, and improved sustainability. The paper also addresses the challenges of implementing AI and big data technologies, such as data privacy, security, interoperability and resource management in agricultural settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
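A common way to build the kind of healthy/diseased leaf classifier the abstract above describes is transfer learning from an ImageNet-pretrained backbone. The sketch below shows that pattern in PyTorch; the backbone choice and hyperparameters are assumptions, not the paper's model.

```python
# A minimal transfer-learning sketch for a 2-way leaf classifier
# (healthy vs. diseased), using a pretrained MobileNetV2 backbone.
import torch
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
model.classifier[1] = nn.Linear(model.last_channel, 2)  # new 2-way head

# Freeze the feature extractor; train only the new head at first.
for p in model.features.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 224, 224)       # a toy batch of leaf images
y = torch.randint(0, 2, (8,))         # toy labels
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```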
47. Prediction of Salicornia europaea L. biomass using a computer vision system to distinguish different salt-tolerant populations.
- Author
-
Cárdenas-Pérez, S., Grigore, M. N., and Piernik, A.
- Subjects
- *
PLANT biomass , *DISCRIMINANT analysis , *COMPUTER vision , *BIOMASS production , *PEARSON correlation (Statistics) - Abstract
Background: Salicornia europaea L. is emerging as a versatile crop halophyte, requiring a low-cost, non-destructive method for salt tolerance classification to aid selective breeding. We propose using a computer vision system (CVS) with multivariate analysis to classify S. europaea based on morphometric and colour traits and to predict plant biomass and the salinity of the substrate. Results: Trial and validation sets of 96 and 24 plants from two populations confirmed the efficacy of the method. The CVS and multivariate analysis evaluated the plants by morphometric traits and CIELab colour variability. Pearson analysis showed the strongest correlations between biomass fresh weight (FW) and projected area (PA) (0.91), and between anatomical cross-section (ACS) and shoot diameter (Sd) (0.94). The PA-FW correlation yielded different equation fits for the lower and higher salt-tolerant populations (R² = 0.93 for linear and 0.90 for a second-degree polynomial, respectively). The higher salt-tolerant population reached maximum biomass PA at 400 mM NaCl, while the lower salt-tolerant population produced less under 200 and 400 mM. A second Pearson correlation and PCA described sample variability with 80% reliability using only morphometric-colour parameters. Multivariate discriminant analysis (MDA) demonstrated that the method correctly classifies plants (90%) by salinity level and tolerance, which was validated with 100% effectiveness. Through multiple linear regression, one predictive model successfully estimated biomass production from PA, and a second model predicted the salinity of the substrate (Sal.s.) in which the plants thrive. Plant Sd and height influenced the PA prediction, while Sd and colour difference (ΔE1) influenced Sal.s. Validation of actual vs. predicted values showed R² values of 0.97 and 0.90 for PA, and 0.95 and 0.97 for Sal.s., for the lower and higher salt-tolerant populations, respectively. This outcome confirms the method as a cost-effective tool for managing S. europaea breeding. Conclusions: The CVS effectively extracted morphological and colour features from S. europaea cultivated at different salinity levels, enabling classification and plant sorting through image and multivariate analysis. Biomass and substrate salinity were accurately predicted by modelling non-destructive parameters. Enhanced by AI, machine learning and smartphone technology, this method shows great potential in ecology, bio-agriculture, and industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
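Two of the traits used in the study above, projected area (PA) and CIELab colour, can be extracted from a plant photo with a few lines of OpenCV. The sketch below is a generic illustration; the file path, HSV thresholds, and segmentation approach are assumptions, not the authors' calibrated CVS.

```python
# A minimal sketch: projected area (in pixels) via colour thresholding,
# plus the mean CIELab colour of the segmented plant region.
import cv2

img = cv2.imread("salicornia.png")                 # hypothetical image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Segment green shoots with a crude HSV range (tune per imaging setup).
mask = cv2.inRange(hsv, (30, 40, 40), (90, 255, 255))
pa_pixels = int(cv2.countNonZero(mask))            # projected area in px

# Mean CIELab colour over the segmented plant region.
lab = cv2.cvtColor(img, cv2.COLOR_BGR2Lab)
L, a, b = cv2.mean(lab, mask=mask)[:3]
print(f"PA={pa_pixels}px  L*={L:.1f} a*={a:.1f} b*={b:.1f}")
```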
48. Enhanced breast cancer diagnosis through integration of computer vision with fusion based joint transfer learning using multi modality medical images.
- Author
-
Iniyan, S., Raja, M. Senthil, Poonguzhali, R., Vikram, A., Ramesh, Janjhyam Venkata Naga, Mohanty, Sachi Nandan, and Dudekula, Khasim Vali
- Subjects
- *
OPTIMIZATION algorithms , *COMPUTER vision , *CANCER diagnosis , *GABOR filters , *DIAGNOSTIC imaging , *BREAST - Abstract
Breast cancer (BC) is a type of cancer that develops in breast tissue and can gradually spread through the entire body; it occurs in both sexes. Prompt recognition of the disease is therefore critical, as it allows patients to receive essential treatment early, helping to protect their lives. Scientists and researchers in numerous studies have devised techniques to identify tumours in early phases. Still, suspicious lesions can be misclassified owing to poor image quality and differing breast density. BC is a primary health concern, requiring continued improvement in early detection and analysis. BC analysis has recently made major progress by combining multi-modal image modalities. Prior studies have addressed the segmentation, classification, or grading of numerous cancer types, including BC, by employing conventional machine learning (ML) models over hand-engineered features. Therefore, this study uses multi-modality medical imaging to propose a Computer Vision with Fusion Joint Transfer Learning for Breast Cancer Diagnosis (CVFBJTL-BCD) technique. The presented CVFBJTL-BCD technique utilizes feature fusion and DL models to effectively detect and identify BC diagnoses. It first employs the Gabor filtering (GF) technique for noise removal. Next, it uses a fusion-based joint transfer learning (TL) process comprising three models, namely DenseNet201, InceptionV3, and MobileNetV2. A stacked autoencoder (SAE) model is implemented to classify the BC diagnosis. Finally, the horse herd optimization algorithm (HHOA) is utilized to optimally select the parameters of the SAE method. To demonstrate the improved results of the CVFBJTL-BCD methodology, a comprehensive series of experiments was performed on two benchmark datasets. The comparative analysis showed superior accuracy values of 98.18% and 99.15% over existing methods on the histopathological and ultrasound datasets, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
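The preprocessing step named in the abstract above is Gabor filtering. The sketch below shows a small Gabor filter bank with OpenCV's built-in kernels; the kernel parameters, image path, and per-pixel max-response aggregation are assumptions, not the paper's exact configuration.

```python
# A minimal sketch of Gabor-filter preprocessing: build kernels at four
# orientations and keep, per pixel, the strongest response across the bank.
import cv2
import numpy as np

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)  # hypothetical image

responses = []
for theta in np.arange(0, np.pi, np.pi / 4):
    kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                lambd=10.0, gamma=0.5, psi=0)
    responses.append(cv2.filter2D(img, cv2.CV_32F, kernel))

filtered = np.max(responses, axis=0)
out = cv2.normalize(filtered, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite("scan_gabor.png", out)
```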
49. Image-processing-based model for surface roughness evaluation in titanium based alloys using dual tree complex wavelet transform and radial basis function neural networks.
- Author
-
Vishwanatha, J. S., Srinivasa Pai, P., D'Mello, Grynal, Sampath Kumar, L., Bairy, Raghavendra, Nagaral, Madeva, Channa Keshava Naik, N., Lamani, Venkatesh T., Chandrashekar, A., Yunus Khan, T. M., Almakayeel, Naif, and Ahmad Khan, Wahaj
- Subjects
- *
RADIAL basis functions , *COMPUTER vision , *FEATURE extraction , *PARTICLE swarm optimization , *WAVELET transforms - Abstract
In this study, we examine the assessment of surface roughness on turned surfaces of Ti-6Al-4V using a computer vision system. We utilize the Dual-Tree Complex Wavelet Transform (DTCWT) to decompose images of the turned surface into directionally oriented sub-images. Three methods of feature generation are compared: (i) Gray-Level Co-Occurrence Matrix (GLCM)- and DTCWT-based extraction of second-order statistical features; (ii) DTCWT image fusion with GLCM feature extraction; and (iii) DTCWT image fusion using Particle Swarm Optimization (PSO)-based GLCM features. Principal Component Analysis (PCA) was utilized to identify and select features, and the model was developed using a Radial Basis Function Neural Network (RBFNN). Accordingly, six models were designed from the three feature-generation methods, considering both all features and the features selected using PCA. The RBFNN model that incorporates DTCWT image fusion and PSO with PCA-selected features achieved a training-data prediction accuracy of 100% and a test-data prediction accuracy of 99.13%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
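GLCM second-order statistics, one of the feature-generation routes compared above, are straightforward to compute with scikit-image. The sketch below uses a random toy image; in practice the input would be a machined-surface photograph (or a DTCWT sub-image), and the distance/angle choices are assumptions.

```python
# A minimal sketch of GLCM-based second-order texture features.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Toy 8-bit image standing in for a turned-surface photograph.
img = (np.random.rand(64, 64) * 255).astype(np.uint8)

glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 4, np.pi / 2],
                    levels=256, symmetric=True, normed=True)

# Average each property over the sampled distances and angles.
features = {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
print(features)
```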
50. Information disclosure and funding success of green crowdfunding campaigns: a study on GoFundMe.
- Author
-
Yin, Ziyi, Huang, Guowei, Zhao, Rui, Wang, Sen, Shang, Wen-Long, Han, Chunjia, and Yang, Mu
- Subjects
PROJECT finance ,NATURAL language processing ,COMPUTER vision ,DISCLOSURE ,SUSTAINABLE development ,CROWD funding - Abstract
Crowdfunding has become an important means of increasing financial support for the development of green technologies. Self-disclosed information significantly affects supporters' decisions and is important for the success of green project funding. However, current studies still lack investigations into the impact of information disclosure on green crowdfunding performance. This research aims to fill this knowledge gap by exploring eight disclosure-relevant factors in green crowdfunding performance. Applying machine learning techniques (e.g., Natural Language Processing and Computer Vision) and logistic regression, this study investigates 720 green crowdfunding campaigns on GoFundMe and empirically finds that campaign duration, the length of the campaign introduction, and the length of the title influence fundraising outcomes. However, no evidence supports an impact of goal size, the emotion of the campaign introduction, or image content on funding success. This study clarifies the disclosure-related data that green crowdfunding campaigns should consider and provides founders with a constructive guide to smoothly raising money for a green crowdfunding campaign. It also contributes to data processing methods by providing future studies with an approach for transforming unstructured data into structured data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
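The core statistical step in the study above is a logistic regression of funding success on disclosure features. The sketch below shows that step on invented toy data; the feature names are assumptions inferred from the abstract, not the study's actual variables or coefficients.

```python
# A minimal sketch: logistic regression of funding success on three
# disclosure features (toy data for illustration only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(7, 90, n),      # campaign duration (days)      -- assumed
    rng.integers(50, 2000, n),   # introduction length (words)   -- assumed
    rng.integers(3, 15, n),      # title length (words)          -- assumed
])
y = rng.integers(0, 2, n)        # funded (1) or not (0)

model = LogisticRegression(max_iter=1000).fit(X, y)
print(dict(zip(["duration", "intro_len", "title_len"], model.coef_[0])))
```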