522 results for "YOLO"
Search Results
2. A computer vision system for apple fruit sizing by means of low-cost depth camera and neural network application.
- Author
Bortolotti, G., Piani, M., Gullino, M., Mengoli, D., Franceschini, C., Grappadelli, L. Corelli, and Manfrini, L.
- Subjects
ORCHARD management, APPLE orchards, COMPUTER vision, IMAGE analysis, FARMERS, ORCHARDS
- Abstract
Fruit size is crucial for growers as it influences consumer willingness to buy and the price of the fruit. Fruit size and growth along the season are two parameters that can lead to more precise orchard management favoring production sustainability. In this study, a Python-based computer vision system (CVS) for sizing apples directly on the tree was developed to ease fruit sizing tasks. The system is built around a consumer-grade depth camera and was tested at two distances at 17 time points throughout the season, in a Fuji apple orchard. The CVS exploited a specifically trained YOLOv5 detection algorithm, a circle detection algorithm, and a trigonometric approach based on depth information to size the fruits. Comparisons with standard-trained YOLOv5 models and with spherical objects were carried out. The algorithm showed good fruit detection and circle detection performance, with a sizing rate of 92%. Good correlations (r > 0.8) between estimated and actual fruit size were found. The sizing performance showed an overall mean error (mE) and RMSE of +5.7 mm (9%) and 10 mm (15%). The best mE results were always found at 1.0 m, compared to 1.5 m. Key factors for the presented methodology were the customization of the fruit detectors; the adaptability of the HoughCircles parameters to object size, camera distance, and color; and the handling of natural field illumination. The study also highlighted the uncertainty of human operators in the reference data collection (5–6%) and the effect of random subsampling on the statistical analysis of fruit size estimation. Despite the high error values, the CVS shows potential for fruit sizing at the orchard scale. Future research will focus on improving and testing the CVS on a large scale, as well as investigating other image analysis methods and the ability to estimate fruit growth. [ABSTRACT FROM AUTHOR] (An illustrative code sketch of such a pipeline follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
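To make the pipeline above concrete, here is a minimal sketch of the same three-stage idea (YOLOv5 detection, Hough circle fitting, pinhole-model sizing from depth). It is not the authors' code: the weights file, camera intrinsics, and all thresholds are assumptions.

```python
# Hedged sketch of a detect -> fit circle -> size-from-depth pipeline.
# "apple_yolov5.pt", fx, and the HoughCircles thresholds are assumptions.
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="apple_yolov5.pt")

def size_apples(image, depth_m, fx):
    """image: HxWx3 RGB array; depth_m: aligned depth map in metres;
    fx: horizontal focal length in pixels from camera calibration."""
    diameters_mm = []
    for x1, y1, x2, y2, conf, cls in model(image).xyxy[0].tolist():
        crop = cv2.cvtColor(image[int(y1):int(y2), int(x1):int(x2)],
                            cv2.COLOR_RGB2GRAY)
        crop = cv2.medianBlur(crop, 5)
        # As the abstract notes, these parameters must adapt to object
        # size, camera distance, and colour.
        circles = cv2.HoughCircles(crop, cv2.HOUGH_GRADIENT, dp=1.2,
                                   minDist=crop.shape[0], param1=100,
                                   param2=30, minRadius=crop.shape[0] // 4,
                                   maxRadius=crop.shape[0] // 2)
        if circles is None:
            continue
        cx, cy, r_px = circles[0][0]
        z = float(depth_m[int(y1 + cy), int(x1 + cx)])
        # Pinhole model: real size = pixel size * depth / focal length.
        diameters_mm.append(2 * r_px * z / fx * 1000.0)
    return diameters_mm
```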
3. Comparison of algorithms for the detection of marine vessels with machine vision.
- Author
Rodríguez-Gonzales, José, Niquin-Jaimes, Junior, and Paiva-Peredo, Ernesto
- Subjects
CONVOLUTIONAL neural networks, OBJECT recognition (Computer vision), COMPUTER vision, DIGITAL image processing, MACHINE learning
- Abstract
The detection of marine vessels for revenue control suffers from many tracking deficiencies, which result in losses of logistical resources, time, and money. Digital cameras, however, are not fully exploited for capturing images to recognize vessels and give immediate notice to the control center. The analyzed images go through a detailed process which, thanks to neural training, allows vessels to be recognized without false positives. To do this, we must understand the behavior of object detection and key issues such as neural training, image digitization, types of filters, and machine learning, among others. We present results comparing two development environments with their corresponding algorithms, recognizing ships immediately after neural training. The comparison is based on 100 images and measures the boat detection capability of both algorithms, their response time, and their effectiveness on images obtained by a digital camera. YOLOv7, applying neural-network-based processing with convolutional neural network (CNN) regions, was 100% effective, whereas MATLAB, which applies processing based on morphological images, obtained low results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Integrating YOLO and WordNet for automated image object summarization.
- Author
Saqib, Sheikh Muhammad, Aftab, Aamir, Mazhar, Tehseen, Iqbal, Muhammad, Shahazad, Tariq, Almogren, Ahmad, and Hamam, Habib
- Abstract
The demand for methods that automatically create text summaries from images containing many objects has recently grown. Our research introduces a fresh and creative way to achieve this: we bring together the WordNet dictionary and the YOLO model. YOLO finds where the objects are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This technique can have a big impact on computer vision and natural language processing, making it much simpler to understand complicated images filled with many objects. To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. The precision was 85%, the recall was 72%, and the F1-score was 74%. [ABSTRACT FROM AUTHOR] (An illustrative sketch of the detect-then-define idea follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
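A minimal sketch of the detect-then-define idea described above, assuming a pretrained COCO detector and NLTK's WordNet; it is an illustration, not the authors' implementation.

```python
# Pair YOLO detections with WordNet glosses to caption each object.
# Requires: pip install ultralytics nltk; then nltk.download("wordnet").
from ultralytics import YOLO
from nltk.corpus import wordnet

model = YOLO("yolov8n.pt")  # any detector with readable class names

def summarize_objects(image_path):
    result = model(image_path)[0]
    lines = []
    for box in result.boxes:
        label = result.names[int(box.cls)]
        synsets = wordnet.synsets(label.replace(" ", "_"))
        gloss = synsets[0].definition() if synsets else "no WordNet entry"
        lines.append(f"{label}: {gloss}")
    return "\n".join(lines)

print(summarize_objects("street.jpg"))
```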
5. Method to Estimate Dislocation Densities from Images of α‐Ga2O3‐Based Corundum Oxides Using the Computer Vision YOLO Algorithm.
- Author
Dang, Giang T., Kawaharamura, Toshiyuki, and Allen, Martin W.
- Subjects
SEMICONDUCTOR thin films, DISLOCATION density, CONVOLUTIONAL neural networks, COMPUTER vision, THIN films
- Abstract
This work applies the computer vision "You only look once" (YOLO) algorithm to extract bounding boxes around dislocations in weak-beam dark-field transmission electron microscopy (WBDF TEM) images of semiconductor thin films. A formula is derived to relate the sum of the relative heights of the bounding boxes to the dislocation densities in the films. WBDF TEM images reported in the literature and taken from our α-Ga2O3 samples are divided into training, evaluation, and test datasets. Different models are trained on the training dataset and evaluated on the evaluation dataset to find the best confidence values, which are then used to select the best model based on its performance on the test dataset. For α-Ga2O3 thin films, the dislocation densities output by this model are on average ≈58% of those estimated by the traditional Ham method. A factor of 4/π may contribute to the systematic underestimation of the model versus the Ham method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Deep learning based identification and tracking of railway bogie parts.
- Author
Shaikh, Muhammad Zakir, Ahmed, Zeeshan, Baro, Enrique Nava, Hussain, Samreen, and Milanova, Mariofanna
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, RAILROAD safety measures, DEEP learning, ROLLING stock, BOGIES (Vehicles)
- Abstract
The Train Rolling-Stock Examination (TRSE) is a safety examination process that physically examines the bogie parts of a moving train, typically at speeds over 30 km/h. Currently, this inspection is done manually by railway personnel in many countries to ensure safety and prevent interruptions to rail services. Although many earlier attempts have been made to semi-automate this process through computer-vision models, these models are iterative and still require manual intervention, making them unsuitable for real-time implementations. In this work, we propose a detection model utilizing a deep-learning based classifier that can precisely identify bogie parts in real-time without manual intervention, increasing the deployability of these inspection systems. We implemented the Anchor-Free YOLOv8 (AFYv8) model, which has a decoupled-head module, for recognizing bogie parts. Additionally, we incorporated bogie-part tracking with the AFYv8 model to gather information about any missing parts. To test the effectiveness of the AFYv8 model, bogie videos were captured at three different times, and the results show a 10% increase in TRSE recognition accuracy compared to previously developed classifiers. This research has the potential to enhance railway safety and minimize operational interruptions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Tailhook Recognition for Carrier-Based Aircraft Based on YOLO with Bi-Level Routing Attention.
- Author
Lu, Aiguo, Liu, Pandi, Yang, Jie, Li, Zhe, and Wang, Ke
- Subjects
OBJECT recognition (Computer vision), LEARNING ability, MODEL airplanes, ROUTING algorithms
- Abstract
To address the problems of missed and false detections caused by target occlusion and lighting variations, this paper proposes a recognition model based on YOLOv5 with bi-level routing attention to achieve precise real-time small object recognition, using the problem of tailhook recognition for carrier-based aircraft as a representative application. Firstly, a module called D_C3, which incorporates deformable convolution, was integrated into the backbone network to enhance the model's learning ability and adaptability in specific scenes. Secondly, a bi-level routing attention mechanism was employed to dynamically focus on the regions of the feature map that are more likely to contain the target, leading to more accurate target localization and classification. Additionally, the loss function was optimized to accelerate the bounding box regression process. The experimental results on the self-constructed CATHR-DET and the public VOC dataset demonstrate that the proposed method outperforms the baselines in overall performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. A Systematic Review and Comparative Analysis Approach to Boom Gate Access Using Plate Number Recognition.
- Author
Bukola, Asaju Christine, Owolawi, Pius Adewale, Du, Chuling, and Van Wyk, Etienne
- Subjects
OBJECT recognition (Computer vision), MACHINE learning, COMPUTER vision, ACCESS control, AUTOMOBILE license plates
- Abstract
Security has been paramount to many organizations for many years, with access control being one of the critical measures to ensure it. Among various approaches to access control, vehicle plate number recognition has received wide attention, yet its application to boom gate access has not been adequately explored. This study proposes a method for boom gate access by optimizing vehicle plate number recognition. Given the speed and accuracy of the YOLO (You Only Look Once) object detection algorithm, this study proposes using the YOLO deep learning algorithm for plate number detection to operate a boom gate. To identify the gap and the most suitable YOLO variant, the study systematically surveyed publication databases for peer-reviewed articles published between 2020 and 2024 on plate number recognition using different YOLO versions. In addition, experiments are performed on four YOLO versions, YOLOv5, YOLOv7, YOLOv8, and YOLOv9, focusing on vehicle plate number recognition. The experiments, using an open-source dataset with 699 samples in total, reported accuracies of 81%, 82%, 83%, and 73% for YOLOv5, YOLOv7, YOLOv8, and YOLOv9, respectively. This comparative analysis aims to determine the most appropriate YOLO version for the task, optimizing both security and efficiency in boom gate access control systems. By leveraging the capabilities of advanced YOLO algorithms, the proposed method seeks to improve the reliability and effectiveness of access control through precise and rapid plate number recognition. The analysis reveals that each YOLO version has distinct advantages depending on the application's specific requirements; in complex detection conditions with changing lighting and shadows, YOLOv8 performed better in terms of reduced loss rates and increased precision and recall metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. A fast high throughput plant phenotyping system using YOLO and Chan-Vese segmentation.
- Author
Jain, S., Ramesh, Dharavath, Damodar Reddy, E., Rathod, Santosha, and Ondrasek, Gabrijel
- Subjects
SUSTAINABILITY, TECHNOLOGICAL innovations, COMPUTER vision, AGRICULTURAL productivity, ARABIDOPSIS thaliana
- Abstract
Understanding plant traits is essential for decoding the behavior of various genomes and their reactions to environmental factors, paving the way for efficient and sustainable agricultural practices. Image-based plant phenotyping has become increasingly popular in modern agricultural research, effectively analyzing large-scale plant data. This study introduces a new high-throughput plant phenotyping system designed to examine plant growth patterns using segmentation analysis. This system consists of two main components: (i) A plant detector module that identifies individual plants within a high-throughput imaging setup, utilizing the Tiny-YOLOv4 (You Only Look Once) architecture. (ii) A segmentation module that accurately outlines the identified plants using the Chan-Vese segmentation algorithm. We tested our approach using top-view RGB tray images of the plant species Arabidopsis thaliana. The plant detector module achieved an impressive localization accuracy of 96.4% and an average Intersection over Union (IoU) of 77.42%. Additionally, the segmentation module demonstrated strong performance with dice and Jaccard scores of 0.95 and 0.91, respectively. These results highlight the system's capability to define plant boundaries accurately. Our findings affirm the effectiveness of our high-throughput plant phenotyping system and underscore the importance of employing advanced computer vision techniques for precise plant trait analysis. These technological advancements promise to boost agricultural productivity, advance genetic research, and promote environmental sustainability in plant biology and agriculture. [ABSTRACT FROM AUTHOR] (A two-stage detect-then-segment sketch follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
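The two-stage structure described above can be sketched as follows. scikit-image's morphological Chan-Vese stands in for the paper's segmentation step and an Ultralytics detector stands in for Tiny-YOLOv4, so both are substitutions, and the weights path is an assumption.

```python
# Detect each plant, then outline it with (morphological) Chan-Vese.
from skimage.color import rgb2gray
from skimage.segmentation import morphological_chan_vese
from ultralytics import YOLO

model = YOLO("plants.pt")  # stand-in detector; the paper uses Tiny-YOLOv4

def segment_plants(image):
    """image: HxWx3 RGB ndarray of a tray; returns one mask per plant."""
    masks = []
    result = model(image)[0]
    for x1, y1, x2, y2 in result.boxes.xyxy.tolist():
        crop = rgb2gray(image[int(y1):int(y2), int(x1):int(x2)])
        masks.append(morphological_chan_vese(crop, num_iter=100))
    return masks
```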
10. YOLO-based Object Detection Models: A Review and its Applications.
- Author
Vijayakumar, Ajantha and Vairavasundaram, Subramaniyaswamy
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, ARCHITECTURAL design, RESEARCH personnel, DETECTORS
- Abstract
In computer vision, object detection is the classical and most challenging problem when it comes to obtaining accurate results. With the significant advancement of deep learning techniques over the past decades, most researchers work on enhancing object detection, segmentation, and classification. Object detection performance is measured in both detection accuracy and inference time, and detection accuracy in two-stage detectors is better than in single-stage detectors. In 2015, the real-time object detection system YOLO was published, and it has iterated rapidly, with the newest release, YOLOv8, in January 2023. YOLO achieves high detection accuracy and fast inference time with a single-stage detector, and many applications adopt YOLO versions due to their high inference speed. This paper presents a complete survey of YOLO versions up to YOLOv8. The article begins by explaining the performance metrics used in object detection, post-processing methods, dataset availability, and the most commonly used object detection techniques; it then discusses the architectural design of each YOLO version. Finally, the diverse range of YOLO versions is discussed by highlighting their contributions to various applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. YOLOv8-TDD: An Optimized YOLOv8 Algorithm for Targeted Defect Detection in Printed Circuit Boards.
- Author
Yunpeng, Gao, Rui, Zhang, Mingxu, Yang, and Sabah, Fahad
- Subjects
TRANSFORMER models, QUALITY control standards, PROCESS capability, COMPUTER vision, DEEP learning, PRINTED circuit design
- Abstract
An enhanced approach for detecting defects in Printed Circuit Boards (PCBs) using a significantly improved version of the YOLOv8 algorithm is proposed in this research; the proposed method is referred to as YOLOv8-TDD (You Only Look Once Version 8 - Targeted Defect Detection). This novel approach integrates cutting-edge components such as Swin Transformers, Dynamic Snake Convolution (DySnakeConv), and BiFormer within the YOLOv8 architecture, aiming to address and overcome the limitations associated with traditional PCB inspection methods. The YOLOv8-TDD adaptation incorporates Swin Transformers to leverage hierarchical feature processing with shifted windows, enhancing the model's efficiency and capability in capturing complex image details. Dynamic Snake Convolution is implemented to dynamically adapt filter responses based on the input feature maps, offering tailored feature extraction that is highly responsive to the varied textures and defects in PCBs. The BiFormer, with bidirectional processing capability, enriches the model's contextual understanding, providing a comprehensive analysis of the PCB images to pinpoint defects more accurately. Experimental results demonstrate that the YOLOv8-TDD model achieves a precision of 97.9% and a mean Average Precision (mAP@0.5) of 95.71%. This enhanced model offers significant potential for practical applications in PCB manufacturing, promising to elevate quality control standards through more reliable defect detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Programming and Setting Up the Object Detection Algorithm YOLO to Determine Feeding Activities of Beef Cattle: A Comparison between YOLOv8m and YOLOv10m.
- Author
Guarnido-Lopez, Pablo, Ramirez-Agudelo, John-Fredy, Denimal, Emmanuel, and Benaouda, Mohammed
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, AGRICULTURE, LIVESTOCK farms, CATTLE feeding & feeds, DETECTION algorithms
- Abstract
Simple Summary: This study addresses the challenge of accurately monitoring the feeding behavior of cattle, which is crucial for their health and productivity. The aim was to compare two versions of a computer vision algorithm, YOLO (v8 vs. v10), which identifies objects in images, to evaluate how well they can recognize the feeding activities of beef cattle. By recording videos of bulls on a farm and analyzing them using YOLO algorithms, we found that both versions were effective at detecting these behaviors, but the latest version was slightly better and faster at learning. This new version also showed a reduced tendency to repeat errors. The conclusion is that the latest version of YOLO is more efficient and reliable for real-world use on farms. This advancement is valuable to society as it helps farmers better monitor and manage cattle feeding, leading to healthier animals and more efficient farming practices. This study highlights the importance of monitoring cattle feeding behavior using the YOLO algorithm for object detection. Videos of six Charolais bulls were recorded on a French farm, and three feeding behaviors (biting, chewing, visiting) were identified and labeled using Roboflow. YOLOv8 and YOLOv10 were compared for their performance in detecting these behaviors. YOLOv10 outperformed YOLOv8 with slightly higher precision, recall, mAP50, and mAP50-95 scores. Although both algorithms demonstrated similar overall accuracy (around 90%), YOLOv8 reached optimal training faster and exhibited less overfitting. Confusion matrices indicated similar patterns of prediction errors for both versions, but YOLOv10 showed better consistency. This study concludes that while both YOLOv8 and YOLOv10 are effective in detecting cattle feeding behaviors, YOLOv10 exhibited superior average performance, learning rate, and speed, making it more suitable for practical field applications. [ABSTRACT FROM AUTHOR] (A minimal training-comparison sketch follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
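The v8-versus-v10 comparison reduces to a few lines with the Ultralytics API; this is a minimal sketch, assuming a Roboflow export in Ultralytics format (a data.yaml with the three behaviour classes), not the authors' training script.

```python
# Train both versions on the same dataset and compare validation mAP.
from ultralytics import YOLO

for weights in ("yolov8m.pt", "yolov10m.pt"):
    model = YOLO(weights)
    model.train(data="cattle_feeding/data.yaml", epochs=100, imgsz=640)
    metrics = model.val()
    print(weights, "mAP50:", metrics.box.map50, "mAP50-95:", metrics.box.map)
```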
13. Enhanced Helmet Wearing Detection Using Improved YOLO Algorithm.
- Author
Liuai Wu, Nannan Lu, Xiaotong Yao, and Yong Yang
- Subjects
OBJECT recognition (Computer vision), SAFETY hats, COMPUTER vision, RECOGNITION (Psychology), DEEP learning
- Abstract
To address the accuracy limitations of existing safety helmet detection algorithms in complex environments, we propose an enhanced YOLOv8 algorithm, called YOLOv8-CSS. We introduce a Coordinate Attention (CA) mechanism in the backbone network to improve focus on safety helmet regions in complex backgrounds, suppress irrelevant feature interference, and enhance detection accuracy. We also incorporate the SEAM module to improve the detection and recognition of occluded objects, increasing robustness and accuracy. Additionally, we design a fine-neck structure to fuse features of different sizes from the backbone network, reducing model complexity while maintaining detection accuracy. Finally, we adopt the Wise-IoU loss function to optimize the training process, further enhancing detection accuracy. Experimental results show that YOLOv8-CSS significantly improves detection performance in general scenarios, complex backgrounds, and for distant small objects. YOLOv8-CSS improves precision, recall, mAP@0.5, and mAP@0.5:0.95 by 1.67%, 5.55%, 3.38%, and 5.87%, respectively, compared to YOLOv8n. Our algorithm also reduces model parameters by 21.25% and computational load by 15.89%. Comparisons with other mainstream object detection algorithms validate our approach's effectiveness and superiority. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. YOLO: A Competitive Analysis of Modern Object Detection Algorithms for Road Defects Detection Using Drone Images.
- Author
Sadhin, Amit Hasan, Mohd Hashim, Siti Zaiton, Samma, Hussein, and Khamis, Nurulaqilla
- Subjects
OBJECT recognition (Computer vision), CONVOLUTIONAL neural networks, COMPUTER vision, TRANSPORTATION safety measures, TECHNOLOGICAL innovations
- Published
- 2024
- Full Text
- View/download PDF
15. A real-time feeding behavior monitoring system for individual yak based on facial recognition model.
- Author
Yang, Yuxiang, Liu, Meiqi, Peng, Zhaoyuan, Deng, Yifan, Gu, Luhui, and Peng, Yingqi
- Subjects
ANIMAL behavior, YAK, COMPUTER vision, IMAGE sensors, WEIGHT gain
- Abstract
Feeding behavior is known to affect the welfare and fattening efficiency of yaks in feedlots. With the advancement of machine vision and sensor technologies, the monitoring of animal behavior is progressively shifting from manual observation towards automated and stress-free methodologies. In this study, a real-time detection model for individual yak feeding and picking behavior was developed using YOLO series models and the StrongSORT tracking model. We used videos collected from 11 yaks raised in two pens to train yak face classification with YOLO series models and tracked individual behavior using StrongSORT. The yak behavior patterns detected in the trough range were defined as feeding and picking, and the overall detection performance for these two behavior patterns was described using indicators such as accuracy, precision, recall, and F1-score. The improved YOLOv8 and StrongSORT model achieved the best performance, with detection accuracy, precision, recall, and F1-score of 98.76%, 98.77%, 98.68%, and 98.72%, respectively. Yaks with similar facial features can be confused with one another, and a few yaks were misidentified because their faces were obscured by another yak's head or by staff. The results showed that individual yak feeding behaviors can be accurately detected in real-time using the YOLO series and StrongSORT models, and this approach has the potential to be used for longer-term yak feeding monitoring. In the future, a dataset of yaks in various rearing environments, group sizes, and lighting conditions will be included, and the relationship between feeding time and yak weight gain will be investigated in order to predict livestock weight. [ABSTRACT FROM AUTHOR] (An illustrative detect-and-track sketch follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
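A sketch of the detect-and-track loop: the paper pairs YOLO with StrongSORT, but here Ultralytics' built-in ByteTrack stands in as the tracker (a substitution, plainly named), and the weights file and video source are assumptions.

```python
from ultralytics import YOLO

model = YOLO("yak_faces.pt")  # detector trained on individual yak faces

# stream=True yields one Results object per video frame.
for result in model.track(source="pen_camera.mp4",
                          tracker="bytetrack.yaml", stream=True):
    for box in result.boxes:
        if box.id is None:
            continue  # untracked detection
        yak = result.names[int(box.cls)]
        # Behaviour (feeding vs. picking) would then be assigned from
        # whether the tracked face stays inside the trough region.
        print(f"yak={yak} track_id={int(box.id)}")
```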
16. Efficient Fabric Classification and Object Detection Using YOLOv10.
- Author
Mao, Makara, Lee, Ahyoung, and Hong, Min
- Subjects
OBJECT recognition (Computer vision), CONVOLUTIONAL neural networks, TEXTURED woven textiles, COMPUTER vision, DEEP learning, INVENTORY control
- Abstract
The YOLO (You Only Look Once) series is renowned for its real-time object detection capabilities in images and videos. It is highly relevant in industries like textiles, where speed and accuracy are critical. In the textile industry, accurate fabric type detection and classification are essential for improving quality control, optimizing inventory management, and enhancing customer satisfaction. This paper proposes a new approach using the YOLOv10 model, which offers enhanced detection accuracy and processing speed in detecting torn paths in each type of fabric. We developed and utilized a specialized, annotated dataset featuring diverse textile samples, including cotton, hanbok, cotton yarn-dyed, and cotton blend plain fabrics, to detect torn paths in fabric. The YOLOv10 model was selected for its superior performance, leveraging advancements in deep learning architecture and applying data augmentation techniques to improve adaptability and generalization to the various textile patterns and textures. Through comprehensive experiments, we demonstrate the effectiveness of YOLOv10, which achieved an accuracy of 85.6% and outperformed previous YOLO variants in both precision and processing speed. Specifically, YOLOv10 showed a 2.4% improvement over YOLOv9, 1.8% over YOLOv8, 6.8% over YOLOv7, 5.6% over YOLOv6, and 6.2% over YOLOv5. These results underscore the significant potential of YOLOv10 in automating fabric detection processes, thereby enhancing operational efficiency and productivity in textile manufacturing and retail. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Classification of Coral Reef Species using Computer Vision and Deep Learning Techniques.
- Author
Alshahrani, Amal, Ali, Hanouf, Saif, Esra, Alsayed, Maha, and Alshareef, Fatimah
- Subjects
CORAL reef conservation, CORAL reefs & islands, MARINE biology, CORALS, COMPUTER vision, MARINE biodiversity
- Abstract
Coral reefs are among the most diverse and productive ecosystems, teeming with life and providing many benefits to marine life and human communities. Coral reef classification serves many important purposes, such as assessing biodiversity, prioritizing conservation actions to protect vulnerable species and their habitats, and many other objectives related to scientific research and interdisciplinary studies on marine ecosystems. Classifying images of coral reefs is challenging due to their great diversity and subtle differences in morphology. Manually classifying them is a time-consuming process, especially when dealing with large datasets. This can limit the scalability and efficiency of scientific research and conservation efforts. This study proposes an automated classification approach using computer vision and deep learning techniques to address these challenges, employing models such as YOLOv5l, YOLOv8l, and VGG16 to classify images of coral reefs. The dataset, comprising 1,187 images of five coral species, was augmented for robustness. YOLOv8l demonstrated superior performance with an accuracy of 97.8%, significantly outperforming the other models in terms of speed and accuracy. These results demonstrate the potential of advanced deep-learning models to improve coral reef monitoring and conservation efforts. This approach aims to streamline classification processes, improving the efficiency and scalability of coral reef research and conservation initiatives worldwide. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Enhanced Self-Checkout System for Retail Based on Improved YOLOv10.
- Author
Tan, Lianghao, Liu, Shubing, Gao, Jing, Liu, Xiaoyi, Chu, Linyue, and Jiang, Huangqi
- Subjects
COMPUTER vision, LABOR costs, PRODUCT improvement, DEEP learning, AUTOMATION, RECOGNITION (Psychology)
- Abstract
With the rapid advancement of deep learning technologies, computer vision has shown immense potential in retail automation. This paper presents a novel self-checkout system for retail based on an improved YOLOv10 network, aimed at enhancing checkout efficiency and reducing labor costs. We propose targeted optimizations for the YOLOv10 model, incorporating the detection head structure from YOLOv8, which significantly improves product recognition accuracy. Additionally, we develop a post-processing algorithm tailored for self-checkout scenarios to further enhance the system's practical applicability. Experimental results demonstrate that our system outperforms existing methods in both product recognition accuracy and checkout speed. This research not only provides a new technical solution for retail automation but also offers valuable insights into optimizing deep learning models for real-world applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Deep learning based identification and tracking of railway bogie parts
- Author
Muhammad Zakir Shaikh, Zeeshan Ahmed, Enrique Nava Baro, Samreen Hussain, and Mariofanna Milanova
- Subjects
Computer Vision, Object Detection, Deep Learning, Yolo, Train Rolling Stock, Wheelset, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
The Train Rolling-Stock Examination (TRSE) is a safety examination process that physically examines the bogie parts of a moving train, typically at speeds over 30 km/h. Currently, this inspection is done manually by railway personnel in many countries to ensure safety and prevent interruptions to rail services. Although many earlier attempts have been made to semi-automate this process through computer-vision models, these models are iterative and still require manual intervention, making them unsuitable for real-time implementations. In this work, we propose a detection model utilizing a deep-learning based classifier that can precisely identify bogie parts in real-time without manual intervention, increasing the deployability of these inspection systems. We implemented the Anchor-Free YOLOv8 (AFYv8) model, which has a decoupled-head module, for recognizing bogie parts. Additionally, we incorporated bogie-part tracking with the AFYv8 model to gather information about any missing parts. To test the effectiveness of the AFYv8 model, bogie videos were captured at three different times, and the results show a 10% increase in TRSE recognition accuracy compared to previously developed classifiers. This research has the potential to enhance railway safety and minimize operational interruptions.
- Published
- 2024
- Full Text
- View/download PDF
20. Web service with machine learning model for airspace monitoring
- Author
A. D. Popov
- Subjects
uav, radar, computer vision, web service, detection, architecture, cameras, flask framework, dataset, roboflow, artificial intelligence, ultralytics, jupyterlab, yolo, opencv, minio, Technology
- Abstract
Objective. The goal is to develop a web service for detecting aerial objects that detects a flying object, highlights it in an image, and classifies its threat level, since modern aerial object detection systems do not always cope with detecting unmanned aerial vehicles (UAVs) due to their small size, low flight altitude, and use of materials that are barely noticeable to radar stations. UAVs that operate without operator control are also difficult to detect by radio signals. To detect UAVs, it is proposed to use a system based on optical scanning of the sky around protected objects. The system should be capable of autonomous operation and include aerial object detectors built on computer vision and artificial intelligence technologies. Method. The research and development of the airspace monitoring web service are based on the methods of system analysis, synthesis, and deduction. Result. The visual part of the web interface has been designed and developed; a dataset has been formed from open sources for the correct detection of flying objects; a neural network detector has been developed for classifying flying objects that pose a danger; and a software module has been developed that automatically detects identification flags of dangerous aerial objects and provides reports as txt files in YOLO format (with normalized coordinates). Conclusion. Separating the visual part of the service will allow distributed deployment of the server part, increasing flexibility and scalability, and developing the administrator control panel will allow effective control of the service and management of settings and users. As a result, the web service will be able to monitor the sky around protected objects, automatically detecting and classifying aerial objects, identifying them by threat level, and providing information for taking the necessary measures. (A schematic sketch of such a detection endpoint follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
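The described service shape (a Flask endpoint, an Ultralytics detector, and YOLO-format txt reports with normalised coordinates) can be sketched as below; the route name and model file are assumptions, not details from the paper.

```python
from flask import Flask, request
from PIL import Image
from ultralytics import YOLO

app = Flask(__name__)
model = YOLO("uav_detector.pt")

@app.route("/detect", methods=["POST"])
def detect():
    img = Image.open(request.files["image"].stream)
    result = model(img)[0]
    # One YOLO-format line per detection: class cx cy w h (normalised).
    lines = [f"{int(b.cls)} " + " ".join(f"{v:.6f}" for v in b.xywhn[0].tolist())
             for b in result.boxes]
    return "\n".join(lines), 200, {"Content-Type": "text/plain"}
```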
21. Deep learning approach for detecting tomato flowers and buds in greenhouses on 3P2R gantry robot.
- Author
Singh, Rajmeet, Khan, Asim, Seneviratne, Lakmal, and Hussain, Irfan
- Subjects
GREENHOUSES, INDUSTRIAL robots, DEEP learning, TOMATOES, FLOWERS, COMPUTER vision, BUDS
- Abstract
In recent years, significant advancements have been made in the field of smart greenhouses, particularly in the application of computer vision and robotics for pollinating flowers. Robotic pollination offers several benefits, including reduced labor requirements and preservation of costly pollen through artificial tomato pollination. However, previous studies have primarily focused on the labeling and detection of tomato flowers alone. Therefore, the objective of this study was to develop a comprehensive methodology for simultaneously labeling, training, and detecting tomato flowers specifically tailored for robotic pollination. To achieve this, transfer learning techniques were employed using well-known models, namely YOLOv5 and the recently introduced YOLOv8, for tomato flower detection. The performance of both models was evaluated using the same image dataset, and a comparison was made based on their Average Precision (AP) scores to determine the superior model. The results indicated that YOLOv8 achieved a higher mean AP (mAP) of 92.6% in tomato flower and bud detection, outperforming YOLOv5 with 91.2%. Notably, YOLOv8 also demonstrated an inference speed of 0.7 ms when considering an image size of 1920 × 1080 pixels resized to 640 × 640 pixels during detection. The image dataset was acquired during both morning and evening periods to minimize the impact of lighting conditions on the detection model. These findings highlight the potential of YOLOv8 for real-time detection of tomato flowers and buds, enabling further estimation of flower blooming peaks and facilitating robotic pollination. In the context of robotic pollination, the study also focuses on the deployment of the proposed detection model on the 3P2R gantry robot. The study introduces a kinematic model and a modified circuit for the gantry robot. The position-based visual servoing method is employed to approach the detected flower during the pollination process. The effectiveness of the proposed visual servoing approach is validated in both un-clustered and clustered plant environments in the laboratory setting. Additionally, this study provides valuable theoretical and practical insights for specialists in the field of greenhouse systems, particularly in the design of flower detection algorithms using computer vision and its deployment in robotic systems used in greenhouses. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. A Method for Real-Time Lung Nodule Instance Segmentation Using Deep Learning.
- Author
Santone, Antonella, Mercaldo, Francesco, and Brunese, Luca
- Subjects
OBJECT recognition (Computer vision), PULMONARY nodules, STREAMING video & television, COMPUTER vision, TUMOR classification, IMAGE segmentation, DEEP learning
- Abstract
Lung screening is crucial for the early detection and management of masses, with particular regard to cancer. Studies have shown that lung cancer screening can reduce lung cancer mortality by 20–30% in high-risk populations. In recent times, the advent of deep learning, with particular regard to computer vision, has demonstrated the ability to effectively detect and locate objects from video streams and also from (medical) images. Considering these aspects, in this paper we propose a method aimed at performing instance segmentation, i.e., providing a mask for each lung mass instance detected, allowing for the identification of individual masses even if they overlap or are close to each other, while classifying the detected masses into (generic) nodules, cancer, or adenocarcinoma. We considered the you-only-look-once model for lung nodule segmentation. An experimental analysis, performed on a set of real-world lung computed tomography images, demonstrated the effectiveness of the proposed method not only in the detection of lung masses but also in lung mass segmentation, thus providing a helpful way not only for radiologists to conduct automatic lung screening but also for discovering very small masses not easily recognizable to the naked eye that may deserve attention. As a matter of fact, in the evaluation of a dataset composed of 3654 lung scans, the proposed method obtains an average precision of 0.757 and an average recall of 0.738 in the classification task. Additionally, it reaches an average mask precision of 0.75 and an average mask recall of 0.733. These results indicate that the proposed method is capable of not only classifying masses as nodules, cancer, and adenocarcinoma, but also of effectively segmenting the areas, thereby performing instance segmentation. [ABSTRACT FROM AUTHOR] (A minimal segmentation-inference sketch follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
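A minimal inference sketch of YOLO-style instance segmentation as described above (one mask per detected mass); the weights file and class names are assumptions, not the authors' artifacts.

```python
from ultralytics import YOLO

model = YOLO("lung_masses-seg.pt")  # e.g. fine-tuned from yolov8n-seg.pt
result = model("ct_slice.png")[0]
if result.masks is not None:
    for i, box in enumerate(result.boxes):
        label = result.names[int(box.cls)]  # nodule / cancer / adenocarcinoma
        polygon = result.masks.xy[i]        # (N, 2) mask contour in pixels
        print(label, f"conf={float(box.conf):.2f}", "points:", len(polygon))
```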
23. Flower Visitation through the Lens: Exploring the Foraging Behaviour of Bombus terrestris with a Computer Vision-Based Application.
- Author
Varga-Szilay, Zsófia, Szövényi, Gergely, and Pozsgai, Gábor
- Subjects
BOMBUS terrestris, BUMBLEBEES, ARTIFICIAL intelligence, DEEP learning, PLANT species
- Abstract
Simple Summary: To understand the processes behind the decline of pollinators, it is also essential to gain insight into their behaviour and identify the factors that drive it. This study focuses on the foraging behaviour of wild bumblebees in urban areas of Terceira, Azores, Portugal. We video-recorded buff-tailed bumblebees on flowering patches of Cretan bird's-foot trefoil, pink-headed knotweed, and red clover for five-minute intervals. We used computer vision-based deep learning models to detect bumblebees. Our results showed that flower cover was the only factor influencing the attractiveness of flower patches for flower-visiting bumblebees, while plant species had no effect. Relative to travelling time between inflorescences, bumblebees spent longer on the large-headed red clover's inflorescences than on those of the smaller-headed trefoil and knotweed. However, the overall time bumblebees spent on the inflorescences did not significantly differ among the plant species. Since our computer vision-based model achieved high accuracy in finding bumblebees on the target plant species, we confirmed that AI-based solutions can provide methods for studying pollinator behaviour and offer valuable insights to support conservation efforts. To understand the processes behind pollinator declines and for the conservation of pollination services, we need to understand fundamental drivers influencing pollinator behaviour. Here, we aimed to elucidate how wild bumblebees interact with three plant species and investigated their foraging behaviour with varying flower densities. We video-recorded Bombus terrestris in 60 × 60 cm quadrats of Lotus creticus, Persicaria capitata, and Trifolium pratense in urban areas of Terceira (Azores, Portugal). For the automated bumblebee detection and counting, we created deep learning-based computer vision models with custom datasets. We achieved high model accuracy of 0.88 for Lotus and Persicaria and 0.95 for Trifolium, indicating accurate bumblebee detection. In our study, flower cover was the only factor that influenced the attractiveness of flower patches, and plant species did not have an effect. We detected a significant positive effect of flower cover on the attractiveness of flower patches for flower-visiting bumblebees. The time spent per unit of inflorescence surface area was longer on Trifolium than on Lotus and Persicaria. However, our results did not indicate significant differences in the time bumblebees spent on inflorescences among the three plant species. Here, we also justify computer vision-based analysis as a reliable tool for studying pollinator behavioural ecology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. A More Efficient Algorithm for Small Target Detection in Unmanned Aerial Vehicles.
- Author
Zhang, Yuechong, Dong, Dehao, Liu, Haiying, Liu, Lida, Deng, Lixia, Gu, Jason, and Li, Shuang
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, ELECTRICAL engineers, PERIODICAL publishing, PROBLEM solving
- Abstract
Due to the relatively high shooting altitude of unmanned aerial vehicles (UAV), the captured images often contain a multitude of small-scale targets. To solve the problems of small target scale, lack of semantic information, and high missed-detection rates in drone target detection, in this paper we propose a more effective unmanned aerial vehicle small target detection algorithm (MEU-YOLOv5) based on YOLOv5s. Firstly, an efficient global contextual module is proposed to enhance the algorithm's performance in feature extraction while reducing the excessive loss of shallow features. Secondly, a small-scale target detector is added to enhance the algorithm's detection capability for smaller targets. Lastly, a recursive multi-level feature fusion path is introduced to better fuse the shallow and deep features of the images, reducing overfitting and improving the algorithm's generalizability and robustness. Experimental results demonstrated that compared to YOLOv5s, MEU-YOLOv5 achieves a 7.4% improvement in mAP@0.5 and a 4.9% improvement in mAP@0.5:0.95. Additionally, the overall performance of this algorithm surpassed various algorithms in the YOLO series, including YOLOv3, YOLOv5l, YOLOv5m, and YOLOv8s. © 2024 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Enhancing Steel Surface Defect Detection: A Hyper-YOLO Approach with Ghost Modules and Hyper FPN.
- Author
Guinan Wu and Qinghong Wu
- Subjects
STEEL strip, OBJECT recognition (Computer vision), SURFACE defects, COMPUTER vision, STEEL industry
- Abstract
Steel surface defect detection poses a significant challenge in the steel industry, aiming to enhance product quality and production efficiency. Traditional mechanical and optical detection methods exhibit relatively low efficiency and poor real-time performance in detecting defects on the surface of steel strips. This paper proposes a new model named Hyper-YOLO for steel surface defect detection in steel strips. Firstly, the CSP module in the conventional YOLO backbone network is replaced with the Ghost module. The Ghost module, a lightweight convolutional module, enhances model efficiency by reducing parameter count and computational load while maintaining satisfactory performance. Secondly, the PAFPN module of YOLOv5 in the bottleneck section is replaced with the Hyper FPN module. Hyper FPN, an improved feature pyramid network module, leverages features at different scales for multi-level feature fusion, enhancing the model's capability to detect targets at various scales. Lastly, improvements are made to the loss functions for both the training and prediction stages: the α-CIoU loss function is introduced during training to substitute the original CIoU loss function, and the α-DIoU loss function is utilized during prediction instead of the original DIoU loss function. These enhanced loss functions effectively measure the accuracy and position precision of target boxes, thereby improving the detection performance of the model. Through these enhancements, the Hyper-YOLO model achieves an overall performance improvement of 4.58% over the baseline model, indicating outstanding surface defect detection in steel strips and providing innovative insights for the YOLOv5 model. These improvements not only elevate the accuracy and efficiency of the model but also hold significant guidance for similar research and applications. [ABSTRACT FROM AUTHOR] (A sketch of a Ghost module follows this entry.)
- Published
- 2024
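For reference, a Ghost module of the kind swapped in for the CSP block (after GhostNet, Han et al.) looks roughly like this; the channel ratio and kernel sizes are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """A primary convolution produces few channels; a cheap depthwise
    operation generates the remaining "ghost" feature maps."""
    def __init__(self, c_in, c_out, ratio=2, k=1, dw_k=3):
        super().__init__()
        c_primary = c_out // ratio
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_primary, k, padding=k // 2, bias=False),
            nn.BatchNorm2d(c_primary), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(c_primary, c_out - c_primary, dw_k, padding=dw_k // 2,
                      groups=c_primary, bias=False),
            nn.BatchNorm2d(c_out - c_primary), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```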
26. In-Depth Review of YOLOv1 to YOLOv10 Variants for Enhanced Photovoltaic Defect Detection.
- Author
Hussain, Muhammad and Khanam, Rahima
- Subjects
COMPUTER vision, CONVOLUTIONAL neural networks, DEEP learning, MICROCRACKS, INFORMATION retrieval
- Abstract
This review presents an investigation into the incremental advancements in the YOLO (You Only Look Once) architecture and its derivatives, with a specific focus on their pivotal contributions to improving quality inspection within the photovoltaic (PV) domain. YOLO's single-stage approach to object detection has made it a preferred option due to its efficiency. The review unearths key drivers of success in each variant, from path aggregation networks to generalised efficient layer aggregation architectures and programmable gradient information, presented in the latest variant, YOLOv10, released in May 2024. Looking ahead, the review predicts a significant trend in future research, indicating a shift toward refining YOLO variants to tackle a wider array of PV fault scenarios. While current discussions mainly centre on micro-crack detection, there is an acknowledged opportunity for expansion. Researchers are expected to delve deeper into attention mechanisms within the YOLO architecture, recognising their potential to greatly enhance detection capabilities, particularly for subtle and intricate faults. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Wheat Powdery Mildew Detection with YOLOv8 Object Detection Model.
- Author
Önler, Eray and Köycü, Nagehan Desen
- Subjects
OBJECT recognition (Computer vision), ARTIFICIAL neural networks, COMPUTER vision, POWDERY mildew diseases, AGRICULTURE
- Abstract
Wheat powdery mildew is a fungal disease that significantly impacts wheat yield and quality. Controlling this disease requires the use of resistant varieties, fungicides, crop rotation, and proper sanitation. Precision agriculture focuses on the strategic use of agricultural inputs to maximize benefits while minimizing environmental and human health effects. Object detection using computer vision enables selective spraying of pesticides, allowing for targeted application. Traditional detection methods rely on manually crafted features, while deep learning-based methods use deep neural networks to learn features autonomously from the data. You Only Look Once (YOLO) and other one-stage detectors are advantageous due to their speed and competitive accuracy. This research aimed to design a model to detect powdery mildew in wheat using digital images. Multiple YOLOv8 models were trained with a custom dataset of images collected from trial areas at Tekirdag Namik Kemal University. The YOLOv8m model demonstrated the best performance, with precision, recall, F1 score, mAP@0.5, and mAP@0.5:0.95 values of 0.79, 0.74, 0.77, 0.76, and 0.35, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. A Method for Finding Distance in Real-Time Car Detection through Object Detection.
- Author
Martinelli, Fabio, Mercaldo, Francesco, and Santone, Antonella
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, DEEP learning, STREAMING video & television, SMART cities
- Abstract
The rapid evolution of deep learning techniques, applied in the smart cities context, has revolutionized computer vision applications, with particular significance in the field of object detection. In this paper, we explore the application of deep learning for real-time car detection in visual data, such as images or video streams. The proposed deep learning model is trained on large-scale datasets containing diverse images of cars, encompassing various lighting conditions, weather patterns, and traffic scenarios. Moreover, once the presence of one or more cars is detected, we find the distance between the detected car(s) and the camera; in this way, the distance to the vehicle can be used to take an appropriate countermeasure (for example, braking or slowing down). The experimental analysis, performed on a dataset composed of 25,532 different images, confirms the effectiveness of the proposed method for real-world car detection. Moreover, examples of the distance computed between the camera and the detected cars are shown to illustrate the adoption of the proposed method in a real-world environment. [ABSTRACT FROM AUTHOR] (A pinhole-model distance sketch follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
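One common way to obtain camera-to-car distance from a detection is the pinhole relation distance = f_px x real_width / pixel_width. The paper does not publish its exact formula, so this sketch and its constants are assumptions.

```python
from ultralytics import YOLO

FOCAL_PX = 1000.0    # focal length in pixels, from calibration (assumed)
CAR_WIDTH_M = 1.8    # typical passenger-car width, used as a prior (assumed)

model = YOLO("yolov8n.pt")

def car_distances(frame):
    """Return an estimated distance in metres for each detected car."""
    result = model(frame)[0]
    out = []
    for box in result.boxes:
        if result.names[int(box.cls)] != "car":
            continue
        x1, _, x2, _ = box.xyxy[0].tolist()
        out.append(FOCAL_PX * CAR_WIDTH_M / (x2 - x1))
    return out
```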
29. Multi-Task Intelligent Monitoring of Construction Safety Based on Computer Vision.
- Author
Liu, Lingfeng, Guo, Zhigang, Liu, Zhengxiong, Zhang, Yaolin, Cai, Ruying, Hu, Xin, Yang, Ran, and Wang, Gang
- Subjects
OBJECT recognition (Computer vision), BUILDING inspection, COMPUTER vision, INSPECTION & review, TRACKING algorithms, VIDEO surveillance, DEEP learning
- Abstract
Effective safety management is vital for ensuring construction safety. Traditional safety inspections in construction heavily rely on manual labor, which is both time-consuming and labor-intensive. Extensive research has been conducted integrating computer-vision technologies to facilitate intelligent surveillance and improve safety measures. However, existing research predominantly focuses on singular tasks, while construction environments necessitate comprehensive analysis. This study introduces a multi-task computer vision technology approach for the enhanced monitoring of construction safety. The process begins with the collection and processing of multi-source video surveillance data. Subsequently, YOLOv8, a deep learning-based computer vision model, is adapted to meet specific task requirements by modifying the head component of the framework. This adaptation enables efficient detection and segmentation of construction elements, as well as the estimation of person and machine poses. Moreover, a tracking algorithm integrates these capabilities to continuously monitor detected elements, thereby facilitating the proactive identification of unsafe practices on construction sites. This paper also presents a novel Integrated Excavator Pose (IEP) dataset designed to address the common challenges associated with different single datasets, thereby ensuring accurate detection and robust application in practical scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Application of Synthetic Data on Object Detection Tasks.
- Author
Huu Long Nguyen, Duc Toan Le, and Hong Hai Hoang
- Subjects
COMPUTER vision, SIMULATION methods & models, DATA modeling
- Abstract
Object detection is a computer vision task that identifies and locates one or more targets of interest in image or video data. The accuracy of object detection heavily depends on the size and the diversity of the utilized dataset. However, preparing and labeling an adequate dataset that guarantees a high level of reliability can be time-consuming and labor-intensive, because building the data requires manually setting up the environment and capturing images while keeping the scenarios varied. Many object detection efforts spend a long time preparing the input data for training the models. To deal with this problem, synthetic data have emerged as a potential replacement for real-world data in data preparation for model training. In this paper, we provide a technique that can generate an enormous synthetic dataset with little human labor. Concretely, we simulated the environment with the pyBullet library and captured various types of input images. To examine its performance in model training, we trained a YOLOv5 object detection model on the dataset. The output of the trained model was deployed in a simulated robot system to examine its potential. YOLOv5 can reach a high object detection accuracy of 93.1% mAP when trained solely on our generated data. Our research provides a novel method to facilitate the understanding of the data generation process in preparing datasets for deep learning models. [ABSTRACT FROM AUTHOR] (A minimal pyBullet data-generation sketch follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
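A minimal sketch of the simulate-then-label loop, assuming pybullet's bundled assets; the segmentation buffer gives pixel-tight boxes, so YOLO-format labels come for free. The object, camera constants, and file names are assumptions, not the authors' setup.

```python
import random
import numpy as np
import pybullet as p
import pybullet_data
from PIL import Image

p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.loadURDF("plane.urdf")

W, H = 640, 480
view = p.computeViewMatrix([0, 0, 1.0], [0, 0, 0], [0, 1, 0])
proj = p.computeProjectionMatrixFOV(60, W / H, 0.01, 2.0)

for i in range(100):
    pos = [random.uniform(-0.3, 0.3), random.uniform(-0.3, 0.3), 0.05]
    obj = p.loadURDF("duck_vhacd.urdf", pos)  # randomly placed target
    _, _, rgb, _, seg = p.getCameraImage(W, H, view, proj)
    rgb = np.reshape(rgb, (H, W, 4)).astype(np.uint8)[:, :, :3]
    Image.fromarray(rgb).save(f"img_{i}.png")
    # Segmentation mask gives a pixel-tight box without manual labelling.
    ys, xs = np.where(np.reshape(seg, (H, W)) == obj)
    if xs.size:  # write YOLO label: class cx cy w h, normalised
        cx, cy = (xs.min() + xs.max()) / 2 / W, (ys.min() + ys.max()) / 2 / H
        bw, bh = (xs.max() - xs.min()) / W, (ys.max() - ys.min()) / H
        with open(f"img_{i}.txt", "w") as f:
            f.write(f"0 {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}\n")
    p.removeBody(obj)
```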
31. A Children's Psychological and Mental Health Detection Model by Drawing Analysis based on Computer Vision and Deep Learning.
- Author
Alshahrani, Amal, Almatrafi, Manar Mohammed, Mustafa, Jenan Ibrahim, Albaqami, Layan Saad, and Aljabri, Raneem Abdulrahman
- Subjects
CHILDREN'S drawings, ARTIFICIAL intelligence, CLINICAL health psychology, MENTAL health, EMOTIONS
- Abstract
Nowadays, children face different changes and challenges from an early age, which can have long-lasting impacts on them. Many children struggle to express or explain their feelings and thoughts properly. For this reason, psychological and mental health specialists have found ways to detect mental issues by observing and analyzing different signs in children's drawings. Yet, this process remains complex and time-consuming. This study proposes a solution by employing artificial intelligence to analyze children's drawings and provide diagnoses with high accuracy. While prior research has focused on detecting psychological and mental issues through questionnaires, only one study has explored analyzing emotions in children's drawings by detecting positive and negative feelings. Notable gaps are the limited diagnosis of specific mental issues and detection accuracy that still has room for improvement. In this study, different versions of YOLO were trained on a dataset of 500 drawings, split into 80% for training, 10% for validation, and 10% for testing. Each drawing was annotated with one or more emotional labels: happy, sad, anxiety, anger, and aggression. YOLOv8-cls, YOLOv9, and ResNet50 were used for object detection and classification, achieving accuracies of 94%, 95.1%, and 70.3%, respectively. The YOLOv9 and ResNet50 results were obtained at high epoch numbers with large model sizes of 5.26 MB and 94.3 MB. YOLOv8-cls achieved the most satisfying result, reaching a high accuracy of 94% after 10 epochs with a compact model size of 2.83 MB, effectively meeting the study's goals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Improved YOLOv5 Algorithm for Oriented Object Detection of Aerial Image.
- Author
Gang Yang, Miao Wang, Quan Zhou, Jiangchuan Li, Siyue Zhou, and Yutong Lu
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, REMOTE sensing, ALGORITHMS, IMAGE sensors, TRACKING algorithms, HOUGH transforms
- Abstract
With the development of computer vision and remote sensing devices, object detection in aerial images has drawn considerable attention because of its ability to provide a wide field of view and a large amount of information. Despite this, object detection in aerial images is a challenging task owing to densely packed objects, orientation diversity, and complex backgrounds. In this study, we optimized three aspects of the YOLOv5 algorithm to detect arbitrarily oriented objects in remote sensing images: the head structure, the features from the backbone, and angle prediction. To improve the head structure, we decoupled it into four submodules, which are used for object localization, foreground, category, and oriented angle classification. To increase the accuracy of the features from the backbone, we designed a block dimensional attention module, developed by splitting the image into smaller patches based on a dimensional attention module. Compared with the original YOLOv5 algorithm, our approach has better performance for oriented object detection: the mAP on DOTA-v1.5 is increased by 1.25%. It was also tested to be effective on the DOTA-v1.0, HRSC2016, and DIOR-R datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
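To make the decoupled-head idea in the record above concrete, here is a minimal PyTorch sketch with four parallel branches for localization, foreground (objectness), category, and angle. Channel counts and the 180-bin angle-as-classification setup are assumptions, not the paper's exact configuration.
```python
# Minimal PyTorch sketch of a decoupled detection head with four branches:
# box offsets, foreground (objectness), category scores, and the oriented
# angle treated as classification over discrete bins.
import torch
import torch.nn as nn

class DecoupledOrientedHead(nn.Module):
    def __init__(self, in_ch=256, num_classes=16, angle_bins=180):
        super().__init__()
        def branch(out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
                nn.Conv2d(in_ch, out_ch, 1),
            )
        self.box = branch(4)            # x, y, w, h offsets
        self.obj = branch(1)            # foreground / objectness
        self.cls = branch(num_classes)  # category classification
        self.ang = branch(angle_bins)   # oriented-angle classification

    def forward(self, feat):
        return self.box(feat), self.obj(feat), self.cls(feat), self.ang(feat)

# Smoke test on a dummy feature map:
# box, obj, cls, ang = DecoupledOrientedHead()(torch.randn(1, 256, 80, 80))
```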
33. CycleInSight: An enhanced YOLO approach for vulnerable cyclist detection in urban environments.
- Author
-
Narkhede, Manish and Chopade, Nilkanth
- Subjects
DRIVER assistance systems ,OBJECT recognition (Computer vision) ,COMPUTER vision ,CITY traffic ,DEEP learning ,AUTONOMOUS vehicles - Abstract
As urbanization continues to reshape transportation, the safety of cyclists in complex traffic environments has become a pressing concern. In response to this challenge, our research introduces the CycleInSight framework, which harnesses advanced deep learning and computer vision techniques to enable precise and efficient cyclist detection in diverse urban settings. Utilizing the you only look once version 8 (YOLOv8) object detection algorithm, the proposed model detects and localizes vulnerable cyclists near vehicles equipped with onboard cameras. Comprehensive experimental results demonstrate its effectiveness in identifying vulnerable cyclists amid dynamic and challenging traffic conditions. With an average precision of 90.91%, our approach outperforms existing models while maintaining efficient inference speeds. By effectively identifying and tracking cyclists, this framework holds significant potential to enhance urban traffic safety, inform data-driven infrastructure planning, and support the development of advanced driver assistance systems and autonomous vehicles. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Development of deep multi-animal tracking for domestic cats toward future application to social behavior analysis.
- Author
-
Nakajima, Nina, Koyasu, Hikari, Maruno, Yuki, Nagasawa, Miho, Kikusui, Takefumi, and Kubo, Takatomi
- Subjects
- *
ANIMAL social behavior , *CATS , *ANIMAL tracks , *COMPUTER vision , *BEHAVIORAL assessment - Abstract
Computer vision models such as You Only Look Once (YOLO) excel at tracking general objects, yet they often struggle to accurately track multiple animals. In this study, we aimed to verify whether fine-tuning improves YOLO's animal tracking performance. We recorded videos of multiple domestic cats ourselves, annotated them, and fine-tuned YOLO on this video dataset. The results show that training with even fewer than one hundred images improves accuracy and enables tracking of individual animals. This suggests that our method is promising for realizing animal social behavior analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
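A minimal sketch of the fine-tune-then-track workflow described in the record above, assuming the ultralytics package; the dataset config cats.yaml and the video path are hypothetical.
```python
# Hedged sketch: fine-tune a detector on annotated cat frames, then run
# multi-object tracking so each cat keeps an identity across frames.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.train(data="cats.yaml", epochs=50, imgsz=640)  # <100 labelled frames can already help

results = model.track(source="cats_video.mp4", tracker="botsort.yaml")
for r in results:
    print(r.boxes.id)  # per-frame track IDs for each detected cat (None if untracked)
```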
35. A lightweight YOLOv5-based detection algorithm for seed potato bud eyes.
- Author
-
顾洪宇, 李志合, 李 涛, 李天豪, 李 宁, and 魏忠彩
- Abstract
Potatoes are a versatile food and cash crop that underpins food security and the grain planting structure in China. At present, however, the total planting area is declining, and challenges remain in the mechanization of the potato planting industry. Seed potato cutting still relies heavily on manual work and mechanical blind cutting, which is labor-intensive and inefficient, produces a high rate of blind cuts, and causes significant loss of seed potatoes. Accurately and rapidly identifying potato bud eyes before cutting is therefore urgent. In this study, an improved YOLOv5 model was proposed to detect potato bud eyes. The Dutch 15 potato variety served as the experimental material, and high-quality seed potato samples were chosen to be free of diseases, dry rot, disease spots, and worm eyes. A total of 1400 images of seed potatoes was collected and split into training, validation, and test sets at a ratio of 8:1:1. Data augmentation (mirroring, rotation, cropping, and brightness adjustment) expanded the dataset to 5600 images. Because bud-eye features are relatively simple, they tend to weaken or even disappear after multiple convolutions; the C3-Faster module was therefore integrated into the original framework, enhancing bud-eye feature extraction while reducing the parameter count. Additionally, the gather-and-distribute (GD) structure from the neck of GOLD-YOLO was incorporated to improve bud-eye detection accuracy, and the bounding-box loss function was changed from CIoU Loss to WIoU Loss to speed up convergence while maintaining high detection accuracy. Because the original YOLO hyperparameters were optimized for the COCO dataset, which differs substantially from the seed potato bud-eye dataset, a genetic algorithm (GA) was employed at the end of the experiment to fine-tune the hyperparameters specifically for bud-eye detection. Furthermore, pruning and distillation were used to reduce the parameters and memory consumption, making the model more suitable for bud-eye detection tasks. The optimized model occupies 8.7 MB, only 61.3% of the original size, with approximately 57.1% of the original parameters. The final average detection accuracies were 90.5% and 90.1% on the test and validation sets of the self-made potato dataset, respectively. Compared with the lightweight networks YOLOv7-tiny, YOLOv8n, YOLOv5n, and YOLOv5s, the improved model's average accuracy on the test set was 0.5, 1.3, 2.8, and 1.1 percentage points higher, respectively, and on the validation set 2.9, 1.9, 3.2, and 1.6 percentage points higher. The detection speed reached 27.5 frames per second on a local computer, fully meeting real-time requirements. These improvements can substantially enhance the efficiency and accuracy of seed potato bud-eye detection in the potato cultivation industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
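The record above mentions a genetic algorithm (GA) for hyperparameter fine-tuning. The following self-contained sketch illustrates the general mutate-evaluate-select loop; the search space, mutation scale, and toy fitness function are assumptions standing in for short YOLOv5 training runs that would return mAP.
```python
# Illustrative GA hyperparameter search: mutate a parent hyperparameter set,
# evaluate a fitness score, keep the best candidate.
import random

SPACE = {"lr0": (1e-4, 1e-1), "momentum": (0.6, 0.98), "weight_decay": (0.0, 1e-3)}

def mutate(parent, scale=0.2):
    # Gaussian multiplicative mutation, clipped to the search-space bounds
    return {k: min(hi, max(lo, parent[k] * (1.0 + random.gauss(0.0, scale))))
            for k, (lo, hi) in SPACE.items()}

def fitness(hyp):
    # Stand-in objective for illustration only; in practice this would be a
    # short training run returning validation mAP on the bud-eye dataset.
    return -(hyp["lr0"] - 0.01) ** 2

best = {k: (lo + hi) / 2.0 for k, (lo, hi) in SPACE.items()}
for _ in range(30):  # 30 generations (illustrative)
    cand = mutate(best)
    if fitness(cand) > fitness(best):
        best = cand
print(best)
```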
36. Identification and Localization of Wind Turbine Blade Faults Using Deep Learning.
- Author
-
Davis, Mason, Nazario Dejesus, Edwin, Shekaramiz, Mohammad, Zander, Joshua, and Memari, Majid
- Subjects
WIND turbine blades ,COMPUTER vision ,WIND turbines ,COMPUTATIONAL complexity ,EROSION - Abstract
This study addresses the challenges inherent in the maintenance and inspection of wind turbines through the application of deep learning methodologies for fault detection on Wind Turbine Blades (WTBs). Specifically, this research focuses on defect detection on the blades of small-scale WTBs due to the unavailability of commercial wind turbines. The research compared popular object localization architectures, YOLO and Mask R-CNN, to identify the most effective model for detecting common WTB defects, including cracks, holes, and erosion. YOLOv9-C emerged as the most effective model, achieving the highest mAP50 and mAP50-95 scores of 0.849 and 0.539, respectively. Modifying Mask R-CNN to use a ResNet18-FPN network reduced computational complexity by 32 layers and achieved a mAP50 of 0.8415. The findings highlight the potential of deep learning and computer vision in improving WTB fault analysis and inspection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
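A plausible torchvision sketch of the backbone swap described in the record above (not the authors' code): Mask R-CNN with a lighter ResNet18-FPN backbone. num_classes=4 assumes background plus the three defect classes (crack, hole, erosion).
```python
# Mask R-CNN with a ResNet18-FPN backbone instead of the default ResNet50-FPN.
import torch
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

backbone = resnet_fpn_backbone(backbone_name="resnet18", weights=None)  # torchvision >= 0.13 API
model = MaskRCNN(backbone, num_classes=4)
model.eval()

with torch.no_grad():
    preds = model([torch.rand(3, 512, 512)])  # -> boxes, labels, scores, masks
```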
37. Implementing YOLO Convolutional Neural Network for Seed Size Detection.
- Author
-
Pawłowski, Jakub, Kołodziej, Marcin, and Majkowski, Andrzej
- Subjects
CONVOLUTIONAL neural networks ,COMPUTER vision ,SEED size ,IMAGE processing ,DATABASES - Abstract
The article presents research on the application of image processing techniques and convolutional neural networks (CNN) for the detection and measurement of seed sizes, specifically coffee and white bean seeds. The primary objective is to evaluate the potential of CNNs for building tools that automate seed recognition and measurement in images. A database was created containing photographs of coffee and white bean seeds with precise annotations of their location and type. Image processing techniques and You Only Look Once v8 (YOLOv8) models were employed to analyze the seeds' position, size, and type, and the effectiveness and performance of the applied methods were compared in detail. The experiments demonstrated that the best-trained CNN model achieved a segmentation accuracy of 90.1% IoU, with an average seed size error of 0.58 mm. The conclusions indicate significant potential for image processing techniques and CNN models in automating seed analysis, which could increase the efficiency and accuracy of these processes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
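As a hedged illustration of how a segmented seed can be converted to millimetres, the sketch below uses OpenCV's minimum-area rectangle and an assumed mm-per-pixel calibration; it is not the article's pipeline.
```python
# Convert a binary seed mask to physical size with an assumed calibration.
import cv2
import numpy as np

MM_PER_PX = 0.12  # assumption: calibrated from a reference object in the scene

def seed_size_mm(mask: np.ndarray):
    """mask: uint8 binary image containing one seed region."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    (_, _), (w, h), _ = cv2.minAreaRect(max(contours, key=cv2.contourArea))
    return max(w, h) * MM_PER_PX, min(w, h) * MM_PER_PX  # (length, width) in mm
```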
38. Comprehensive Study of YOLO Versions for Front and Rear-View Classification of Vehicles in Context of Indian Roads.
- Author
-
Rath, Manas Kumar and Swain, Prasanta Kumar
- Subjects
- *
CONVOLUTIONAL neural networks , *COMPUTER vision , *DEEP learning , *COMPARATIVE studies , *CLASSIFICATION , *HUMANITY - Abstract
Since the introduction of computer vision, many approaches have been devised to detect or classify objects of various types. Model performance varies with the context under consideration, the generation of the architecture, and the nature of the data at hand. Classifying the front or rear view of a vehicle is integral to deciding whether that vehicle is moving in the correct lane. Indian streets pose particular challenges, such as unmarked rural roads, faded markings, and shading from poles or trees. Hence, instead of detecting lanes, an alternative is to detect whether the vehicles ahead are facing toward or away from our vehicle. Various deep learning architectures have been proposed for such detection and classification tasks, including the Visual Geometry Group networks, You Only Look Once, Inception networks, and residual networks. In this paper, we perform a comparative analysis of various versions of You Only Look Once, tracing its evolution over time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Application of Object Detection Algorithm for Efficient Damages Identification of the Conservation of Heritage Buildings.
- Author
-
Tang, Huadu, Feng, Yalin, Wang, Ding, Zhu, Ruiguang, Wang, Liwei, Hao, Shengwang, and Xu, Shan
- Subjects
- *
OBJECT recognition (Computer vision) , *BUILDING maintenance , *WEATHERING , *PRESERVATION of architecture , *COMPUTER vision - Abstract
Heritage buildings are crucial to an area's cultural and political identity, and proper maintenance and monitoring are essential for their conservation. However, manual inspections are time-consuming and expensive. We propose a deep-learning-based detection framework to identify damage on ancient architectural walls. The algorithm applied in this study is YOLOv5; after comparing its five variants, YOLOv5m was selected as the most accurate, with a mAP of 0.801. The damage types identified are physical weathering and visitors' scratches, and the model effectively identified damage in the high-resolution images selected for the experiment. In addition, the algorithm allows real-time detection and the identification of seasonal sources of disruption, as demonstrated by the video test in this study. The findings contribute to the development of an intelligent tool for health monitoring, with the goal of fast and remote damage detection in the routine maintenance of heritage buildings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
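A hedged sketch of the real-time video use described in the record above, assuming a custom YOLOv5m checkpoint loaded via torch.hub and OpenCV for frame capture; the file names are hypothetical.
```python
# Per-frame video inference with a custom YOLOv5 checkpoint.
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="damage_yolov5m.pt")

cap = cv2.VideoCapture("heritage_wall.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame[:, :, ::-1])   # BGR -> RGB before inference
    print(results.pandas().xyxy[0])      # per-frame detections: class, box, confidence
cap.release()
```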
40. Lizard Body Temperature Acquisition and Lizard Recognition Using Artificial Intelligence.
- Author
-
Afonso, Ana L., Lopes, Gil, and Ribeiro, A. Fernando
- Subjects
- *
ARTIFICIAL intelligence , *THERMAL imaging cameras , *BODY temperature , *PROSTHETICS , *LIZARDS - Abstract
The acquisition of the body temperature of animals kept in captivity in biology laboratories is crucial for several studies in the field of animal biology. Traditionally, the acquisition process was carried out manually, which guaranteed little accuracy or consistency in the acquired data and was painful for the animal. The process was then switched to a semi-manual one using a thermal camera, but it still involved manually clicking on each part of the animal's body every 20 s of video to obtain temperature values, making it time-consuming, non-automatic, and difficult. This project aims to automate the acquisition process through automatic recognition of the parts of a lizard's body and reading of the temperature in those parts, based on video taken with two cameras simultaneously: an RGB camera and a thermal camera. The first camera detects the location of the lizard's various body parts using artificial intelligence techniques, and the second allows the respective temperature of each part to be read. Owing to the lack of lizard datasets, either in the biology laboratory or online, a dataset had to be created from scratch, containing the identification of the lizard and six of its body parts. YOLOv5 was used to detect the lizard and its body parts in RGB images, achieving a precision of 90.00% and a recall of 98.80%. After initial calibration, the RGB and thermal camera images are properly aligned, making it possible to know the lizard's position, even when the lizard is at the same temperature as its surrounding environment, through a coordinate conversion from the RGB image to the thermal image. The thermal image carries a colour temperature scale with the respective maximum and minimum temperature values, which is used to read each pixel of the thermal image and thus obtain the correct temperature for each part of the lizard. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
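The two-camera readout in the record above can be summarized in a short sketch: project an RGB pixel into the thermal image with a calibrated homography, then convert the 8-bit thermal intensity to degrees using the colour scale's extremes. H, the scale limits, and the linear-scale assumption are illustrative, not the authors' calibration.
```python
import cv2
import numpy as np

H = np.eye(3, dtype=np.float32)    # assumed RGB->thermal homography from calibration
T_MIN, T_MAX = 20.0, 38.0          # scale extremes read off the thermal image

def temperature_at(rgb_xy, thermal_gray):
    pt = cv2.perspectiveTransform(np.float32([[rgb_xy]]), H)[0, 0]
    x, y = int(pt[0]), int(pt[1])
    v = thermal_gray[y, x] / 255.0           # normalised 8-bit intensity
    return T_MIN + v * (T_MAX - T_MIN)       # assumes a linear temperature scale

# thermal = cv2.imread("frame_thermal.png", cv2.IMREAD_GRAYSCALE)
# print(temperature_at((412, 230), thermal)) # temperature at an RGB-detected body part
```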
41. A Neural-Network-Based Cost-Effective Method for Initial Weld Point Extraction from 2D Images.
- Author
-
Lopez-Fuster, Miguel-Angel, Morgado-Estevez, Arturo, Diaz-Cano, Ignacio, and Badesa, Francisco J.
- Subjects
ROBOTIC welding ,OBJECT recognition (Computer vision) ,COMPUTER vision ,WELDING ,DEEP learning - Abstract
This paper presents a novel approach for extracting 3D weld point information using a two-stage deep learning pipeline based on readily available 2D RGB cameras. Our method utilizes YOLOv8s for object detection, specifically targeting vertices, followed by semantic segmentation for precise pixel localization. This pipeline addresses the challenges posed by low-contrast images and complex geometries, significantly reducing costs compared with traditional 3D-based solutions. We demonstrated the effectiveness of our approach through a comparison with a 3D-point-cloud-based method, showcasing the potential for improved speed and efficiency. This research advances the field of automated welding by providing a cost-effective and versatile solution for extracting key information from 2D images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
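A schematic sketch of the two-stage pipeline described in the record above; stage 1 assumes the ultralytics API, while the stage-2 segmentation model and file names are placeholders, not the authors' implementation.
```python
import cv2
from ultralytics import YOLO

detector = YOLO("weld_vertex_yolov8s.pt")       # stage 1: vertex detection

def segment_vertex(crop):
    """Stage 2 placeholder: pixel-precise semantic segmentation of a vertex crop."""
    raise NotImplementedError

image = cv2.imread("weld_image.png")
for x1, y1, x2, y2 in detector(image)[0].boxes.xyxy.int().tolist():
    crop = image[y1:y2, x1:x2]
    # mask = segment_vertex(crop)              # refine to the weld-point pixel
```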
42. Color Histogram Contouring: A New Training-Less Approach to Object Detection.
- Author
-
Rabie, Tamer, Baziyad, Mohammed, Sani, Radhwan, Bonny, Talal, and Fareh, Raouf
- Subjects
OBJECT recognition (Computer vision) ,COMPUTER vision ,HISTOGRAMS ,AUTONOMOUS robots ,BIN packing problem ,COLOR ,MOBILE robots - Abstract
This paper introduces the Color Histogram Contouring (CHC) method, a new training-less approach to object detection that emphasizes the distinctive features in chrominance components. By building a chrominance-rich feature vector with a bin size of 1, the proposed CHC method exploits the precise information in chrominance features without enlarging bin sizes, which could otherwise lead to false detections. This feature vector is invariant to lighting changes and is designed to mimic the opponent color axes used by the human visual system. The CHC algorithm iterates over the non-zero histogram bins of unique color features in the model, creating a feature vector for each, and emphasizes those matching in both the scene and model histograms; when the model and scene histograms align for these unique features, the model's presence in the scene image is confirmed. Extensive experiments across various scenarios show that the proposed CHC technique outperforms the benchmark training-less Swain and Ballard method and the algorithm of Viola and Jones. Additionally, a comparative experiment with the state-of-the-art You Only Look Once (YOLO) technique reveals that CHC surpasses YOLO in scenarios with limited training data, marking a significant advancement in training-less object detection. The approach offers a valuable addition to computer vision, providing an effective training-less solution for real-time autonomous robot localization and mapping in unknown environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
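An interpretive sketch of CHC's histogram-matching core, assuming OpenCV: bin-size-1 chrominance histograms over the Cr/Cb plane of YCrCb (the colour space choice here is an assumption), with the model declared present when the scene covers enough of its non-zero bins. This simplifies the published algorithm rather than reimplementing it.
```python
import cv2
import numpy as np

def chroma_hist(bgr):
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    # 256x256 histogram over (Cr, Cb): one bin per chrominance value
    return cv2.calcHist([ycrcb], [1, 2], None, [256, 256], [0, 256, 0, 256])

def model_present(model_bgr, scene_bgr, min_fraction=0.9):
    m, s = chroma_hist(model_bgr), chroma_hist(scene_bgr)
    model_bins = m > 0
    matched = np.logical_and(model_bins, s > 0)     # model bins also in the scene
    return matched.sum() / model_bins.sum() >= min_fraction
```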
43. Comparative Evaluation of Convolutional Neural Network Object Detection Algorithms for Vehicle Detection.
- Author
-
Reddy, Saieshan, Pillay, Nelendran, and Singh, Navin
- Subjects
OBJECT recognition (Computer vision) ,CONVOLUTIONAL neural networks ,COMPUTER vision ,VISUAL fields ,DETECTORS - Abstract
The domain of object detection was revolutionized by the introduction of Convolutional Neural Networks (CNNs) in the field of computer vision. This article explores the architectural intricacies, methodological differences, and performance characteristics of three CNN-based object detection algorithms, namely Faster Region-Based Convolutional Network (R-CNN), You Only Look Once v3 (YOLO), and Single Shot MultiBox Detector (SSD), in the specific domain of vehicle detection. The findings indicate that the SSD algorithm outperforms the other approaches in both detection performance and processing speed. Faster R-CNN detected objects with an average processing time of 5.1 s per image, achieving a mean average precision of 0.76 and an average loss of 0.467. YOLO v3 detected objects in an average of 1.16 s, achieving a mean average precision of 0.81 with an average loss of 1.183. In contrast, SSD detected objects in an average of 0.5 s, exhibiting the highest mean average precision of 0.92 despite a higher average loss of 2.625. Notably, all three object detectors achieved an accuracy exceeding 99%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. MSCA-YOLO: A YOLOv5-based Steel Defect Detection Method Enhanced with Multi-Scale Feature Extraction and Contextual Augmentation.
- Author
-
Yao Wang, Chengxin Liang, Xiao Wang, and Yushan Liu
- Subjects
SURFACE defects ,COMPUTER vision ,VISUAL fields ,QUALITY control ,STEEL - Abstract
Steel surface defect detection for industrial quality control has long been a challenging object detection task in the field of computer vision. Unlike other detection problems, some surface defects on steel are small relative to the entire inspected object, so defect features are less prominent during detection. To address these issues, we propose a YOLOv5-based steel defect detection method enhanced with multi-scale feature extraction and contextual augmentation (MSCA-YOLO). Specifically, adopting YOLOv5 as the backbone network, we first add the 03-RFE module to expand the receptive field. Then, we design a neck structure that combines multi-scale guided upsampling, which effectively enhances the model's ability to handle multi-scale features and improves feature extraction for small defects. Finally, we propose a context mechanism that gives the model deeper context-analysis capability by providing richer contextual information. Experiments on the NEU-DET dataset show that MSCA-YOLO achieves a mean average precision of 0.645 at an intersection-over-union threshold of 0.5 while maintaining rapid detection. It also exhibits substantial precision improvements over YOLOv5 across six defect types: Crazing (18.5% increase), Inclusion (1.2% increase), Patches (1.9% increase), Pitted_Surface (7.8% increase), Rolled-in_Scale (8.9% increase), and Scratches (6.5% increase). These results demonstrate the efficiency and reliability of MSCA-YOLO for automated steel surface defect detection, providing a new solution for real-time inspection of steel surface defects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. YOLOv8-DEE: a high-precision model for printed circuit board defect detection
- Author
-
Feifan Yi, Ahmad Sufril Azlan Mohamed, Mohd Halim Mohd Noor, Fakhrozi Che Ani, and Zol Effendi Zolkefli
- Subjects
Defects detection ,Computer vision ,PCB ,Deep learning ,YOLO ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Defects in printed circuit boards (PCBs) occurring during the production process of consumer electronic products can have a substantial impact on product quality, compromising both stability and reliability. Despite considerable efforts in PCB defect inspection, current detection models struggle with accuracy due to complex backgrounds and multi-scale characteristics of PCB defects. This article introduces a novel network, YOLOv8-DSC-EMA-EIoU (YOLOv8-DEE), to address these challenges by enhancing the YOLOv8-L model. Firstly, an improved backbone network incorporating depthwise separable convolution (DSC) modules is designed to enhance the network’s ability to extract PCB defect features. Secondly, an efficient multi-scale attention (EMA) module is introduced in the network’s neck to improve contextual information interaction within complex PCB images. Lastly, the original complete intersection over union (CIoU) is replaced with efficient intersection over union (EIoU) to better highlight defect locations and accommodate varying sizes and aspect ratios, thereby enhancing detection accuracy. Experimental results show that YOLOv8-DEE achieves a mean average precision (mAP) of 97.5% and 98.7% on the HRIPCB and DeepPCB datasets, respectively, improving by 2.5% and 0.7% compared to YOLOv8-L. Additionally, YOLOv8-DEE outperforms other state-of-the-art methods in defect detection, demonstrating significant improvements in detecting small, medium, and large PCB defects.
- Published
- 2024
- Full Text
- View/download PDF
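A sketch of the EIoU substitution described in the record above, written from the published EIoU definition (the IoU term plus centre-distance, width, and height penalties normalised by the smallest enclosing box); this is not the article's code. Boxes are (N, 4) tensors in (x1, y1, x2, y2) format.
```python
import torch

def eiou_loss(p, t, eps=1e-7):
    # intersection over union
    iw = (torch.min(p[:, 2], t[:, 2]) - torch.max(p[:, 0], t[:, 0])).clamp(min=0)
    ih = (torch.min(p[:, 3], t[:, 3]) - torch.max(p[:, 1], t[:, 1])).clamp(min=0)
    inter = iw * ih
    area_p = (p[:, 2] - p[:, 0]) * (p[:, 3] - p[:, 1])
    area_t = (t[:, 2] - t[:, 0]) * (t[:, 3] - t[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # smallest enclosing box
    cw = torch.max(p[:, 2], t[:, 2]) - torch.min(p[:, 0], t[:, 0])
    ch = torch.max(p[:, 3], t[:, 3]) - torch.min(p[:, 1], t[:, 1])
    # centre-distance, width, and height penalty terms
    d2 = ((p[:, 0] + p[:, 2] - t[:, 0] - t[:, 2]) ** 2
          + (p[:, 1] + p[:, 3] - t[:, 1] - t[:, 3]) ** 2) / 4
    dw2 = ((p[:, 2] - p[:, 0]) - (t[:, 2] - t[:, 0])) ** 2
    dh2 = ((p[:, 3] - p[:, 1]) - (t[:, 3] - t[:, 1])) ** 2
    return 1 - iou + d2 / (cw ** 2 + ch ** 2 + eps) + dw2 / (cw ** 2 + eps) + dh2 / (ch ** 2 + eps)
```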
46. YOLO deep learning algorithm for object detection in agriculture: a review
- Author
-
Kamalesh Kanna S, Kumaraperumal Ramalingam, Pazhanivelan P, Jagadeeswaran R, and Prabu P.C.
- Subjects
Agriculture ,computer vision ,deep learning ,object detection ,real-time farming ,YOLO ,Agriculture (General) ,S1-972 - Abstract
YOLO represents one-stage, regression-based object detection: objects in the input are classified and located directly, without a candidate-region stage. Two-stage detection is generally more accurate, while one-stage detection is faster. YOLO has become popular because of its detection accuracy, good generalization, open-source availability, and speed. It boasts exceptional speed because it frames detection as a regression problem, eliminating the need for a complex pipeline. In agriculture, using remote sensing and drone technologies, YOLO classifies and detects crops, diseases, and pests, and is also used for land use mapping, environmental monitoring, urban planning, and wildlife monitoring. Recent research highlights YOLO's strong performance in various agricultural applications. For instance, YOLOv4 demonstrated high accuracy in counting and locating small objects in UAV-captured images of bean plants, achieving an AP of 84.8% and a recall of 89%. Similarly, YOLOv5 showed significant precision in identifying rice leaf diseases, with a precision rate of 90%. In this review, we discuss the basic principles behind YOLO, its different versions, its limitations, and its applications in agriculture and farming.
- Published
- 2024
- Full Text
- View/download PDF
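As a minimal usage sketch of the one-stage workflow the review above surveys, assuming the ultralytics package; "crop_disease_yolov8n.pt" is a hypothetical fine-tuned checkpoint, not a published model.
```python
from ultralytics import YOLO

model = YOLO("crop_disease_yolov8n.pt")
results = model.predict("field_image.jpg", conf=0.25)     # one forward pass
for box in results[0].boxes:
    print(int(box.cls), float(box.conf), box.xyxy.tolist())  # class, score, box
```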
47. Comparative performance of YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN models for detection of multiple weed species
- Author
-
Akhilesh Sharma, Vipan Kumar, and Louis Longchamps
- Subjects
Computer vision ,Image-based weed detection ,Machine learning ,Precision weed management ,Digital agriculture ,YOLO ,Agriculture (General) ,S1-972 ,Agricultural industries ,HD9000-9495 - Abstract
Weeds pose a serious production challenge in various agronomic crops by reducing grain yields, and increasing cases of herbicide-resistant (HR) weed populations further exacerbate the problem. Future weed control tactics require integrating non-chemical and reduced-chemical strategies that can target site- and species-specific weed management (SSSWM). Advanced machine learning technology has the potential to localize and detect weed seedlings to implement SSSWM. However, owing to the large biological variability among weed species and the environmental conditions in which they grow, accurate and precise weed detection remains challenging. The main objectives of this research were to (1) develop an annotated image database of cocklebur (Xanthium strumarium L.), dandelion (Taraxacum officinale), common waterhemp (Amaranthus tuberculatus), Palmer amaranth (Amaranthus palmeri) and common lambsquarters (Chenopodium album L.), and (2) investigate the comparative performance (speed and accuracy) of the YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN algorithms in detecting those weed species. A weed dataset with bounding-box annotations for each species was created from images collected under variable field conditions, which were preprocessed and augmented to yield 2348 color images. The YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN models were trained on this annotated database to detect each weed species. Results indicated that YOLOv11 was the fastest model, with an inference time of 13.5 milliseconds (ms), followed by YOLOv10 at 19.3 ms and YOLOv8 at 23 ms. YOLOv9 had the highest detection accuracy, with an overall mean average precision (mAP@0.5) of 0.935. In contrast, Detectron2 with the Faster R-CNN configuration provided a mAP@0.5 of 0.821 with an inference time of 63.8 ms. These results suggest that the YOLO series has the potential for real-time deployment, detecting weed species more accurately and faster than Faster R-CNN in agricultural fields.
- Published
- 2024
- Full Text
- View/download PDF
48. BEEHIVE: A dataset of Apis mellifera images to empower honeybee monitoring research (Mendeley Data)
- Author
-
Massimiliano Micheli, Giulia Papa, Ilaria Negri, Matteo Lancini, Cristina Nuzzi, and Simone Pasinetti
- Subjects
Entomology ,Precision agriculture ,Computer vision ,Object detection ,YOLO ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Science (General) ,Q1-390 - Abstract
This data article describes the collection of two sub-datasets comprising images of Apis mellifera captured inside a commercial beehive (“Frame” sub-dataset, 2057 images) and at its bottom (“Bottom” sub-dataset, 1494 images). The data were collected in spring 2023 (April–May) for the “Frame” sub-dataset and in September 2023 for the “Bottom” sub-dataset. Acquisitions were carried out using an instrumented beehive developed to monitor the colony's health status over long periods. The color cameras were fitted with different lenses (a liquid lens for the internal camera, a standard 8 mm focal-length lens for the other), actuated by an embedded board, with red LED strips illuminating the inside of the beehive. Images captured by the internal camera were mostly out of focus, so a filtering procedure based on focus measure operators was developed to keep only the in-focus ones. All images were manually labelled by experts with two-class bounding-box annotations: fully visible bees (class “bee”) and blurred or occluded bees, depending on the sub-dataset (class “blurred_bee” or “occluded_bee”). Annotations are provided in YOLOv8 format. The dataset can be useful for entomology research empowered by computer vision, especially counting tasks, behavior monitoring, and pest management, since a few occurrences of Varroa destructor mites may be present in the “Frame” sub-dataset.
- Published
- 2024
- Full Text
- View/download PDF
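A plausible sketch of the in-focus filtering step described in the record above, using the common variance-of-Laplacian focus measure; the article's exact focus operators are not reproduced here, and the threshold is an assumption to tune per camera.
```python
import cv2

def is_in_focus(path, threshold=100.0):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var() >= threshold  # sharper = higher variance

# keep = [p for p in image_paths if is_in_focus(p)]
```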
49. Using computer vision to classify, locate and segment fire behavior in UAS-captured images
- Author
-
Brett L. Lawrence and Emerson de Lemmus
- Subjects
YOLO ,Computer vision ,Fire behavior ,Fire detection ,UAS ,Physical geography ,GB3-5030 ,Science - Abstract
The widely adaptable capabilities of artificial intelligence, in particular deep learning and computer vision, have led to significant research output on flame and smoke detection. The composition of flame and smoke, also described as fire behavior, can differ considerably depending on factors such as weather, fuels, and the specific landscape on which the fire is observed. The ability to detect definable classes of fire behavior using computer vision has not been explored, and it could be helpful given that fire behavior often dictates how firefighters respond. To test whether types of fire behavior can be reliably classified, we collected and labeled a unique unmanned aerial system (UAS) image dataset of fire behavior classifications, trained and validated using You Only Look Once (YOLO) detection models. Our 960 labeled images were sourced from over 21 h of UAS video collected during prescribed fire operations covering a large region of Texas and Louisiana, United States. National Wildfire Coordinating Group (NWCG) fire behavior observations and descriptions served as the reference for determining fire behavior classes during labeling. YOLOv8 models were trained on NWCG Rank 1–3 fire behavior descriptions in grassland, shrubland, forested, and combined fire regimes within our study area. Models were first trained and validated on classifying isolated image objects of fire behavior, and then separately trained to locate and segment fire behavior classes in UAS images. Models trained to classify isolated image objects consistently performed at a mAP of 0.808 or higher, with the combined fire regime producing the best results (mAP = 0.897). Most segmentation models performed relatively poorly, except for the forest-regime model, with box (locate) and mask (segment) mAPs of 0.59 and 0.611, respectively. Our results indicate that classifying fire behavior with computer vision is possible across fire regimes and fuel models, whereas locating and segmenting fire behavior types against background information is relatively difficult. It may nevertheless be manageable with enough data and with models developed for a specific fire regime. With an increasing number of destructive wildfires and new challenges confronting fire managers, identifying how new technologies can quickly assess wildfire situations can improve wildfire responder awareness. We conclude that levels of abstraction deeper than the mere detection of smoke or flame are possible using computer vision and could make even more detailed aerial fire monitoring possible using a UAS.
- Published
- 2024
- Full Text
- View/download PDF
50. Enhancing dragline operations supervision through computer vision: real time height measurement of dragline spoil piles dump using YOLO
- Author
-
Piyush Singh, V. M. S. R. Murthy, Dheeraj Kumar, and Simit Raval
- Subjects
Computer vision ,YOLO ,dump pile detection ,open-cast mining ,autonomous tracking ,dragline dump monitoring ,Environmental technology. Sanitary engineering ,TD1-1066 ,Environmental sciences ,GE1-350 ,Risk in industry. Risk management ,HD61 - Abstract
Effective monitoring of the heights of spoil piles produced by dragline dumping is critical to managing mining space safely and productively, particularly during active overburden (OB) removal. This study addresses the concern by replicating stable dump piles in alignment with dragline balancing diagrams at a dynamic scale, optimizing in-pit volume use. Real-time tracking of dump pile heights enables efficient dump disposal management and is attracting attention in the mining industry, since monitoring dump height and shape during OB disposal near the dragline is vital. An experimental setup is employed in which samples of specific volume are consistently dumped from predefined heights at constant velocities, and the You Only Look Once (YOLO) technique is used for dump pile height measurement. A benchmark dataset was created, encompassing various dragline dump configurations. YOLO achieves an F1-confidence score of 84.6% and a mean average precision (mAP) of 99.49% in recognizing dump profiles. To validate its reliability, the output is compared with photogrammetry (SFM-MVS and NeRF). Employing 2D computer vision on simulated video data offers a fast, cost-effective, real-time solution for reliable dump pile profile detection and height measurement, enhancing dragline mining efficiency with stable and safe dump heights as per design.
- Published
- 2024
- Full Text
- View/download PDF
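As a back-of-the-envelope illustration of turning a detected pile's pixel height into metres under a pinhole-camera model: real height = pixel height x distance / focal length (in pixels). The distance and focal length below are assumed known from the rig; none of these numbers come from the article.
```python
def pile_height_m(box_top_px, box_bottom_px, distance_m=25.0, focal_px=1400.0):
    pixel_height = abs(box_bottom_px - box_top_px)
    return pixel_height * distance_m / focal_px   # similar-triangles relation

# A 320 px tall dump profile seen from 25 m with f = 1400 px:
print(pile_height_m(100, 420))  # ~5.71 m
```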