1,203 results on '"object segmentation"'
Search Results
2. Unsupervised Moving Object Segmentation with Atmospheric Turbulence
- Author
-
Qin, Dehao, Saha, Ripon Kumar, Chung, Woojeh, Jayasuriya, Suren, Ye, Jinwei, Li, Nianyi, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
- Full Text
- View/download PDF
3. Learning Camouflaged Object Detection from Noisy Pseudo Label
- Author
-
Zhang, Jin, Zhang, Ruiheng, Shi, Yanjiao, Cao, Zhe, Liu, Nian, Khan, Fahad Shahbaz, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
- Full Text
- View/download PDF
4. Enhancing learning on uncertain pixels in self-distillation for object segmentation.
- Author
-
Chen, Lei, Cao, Tieyong, Zheng, Yunfei, Wang, Yang, Zhang, Bo, and Yang, Jibin
- Subjects
CONVOLUTIONAL neural networks, LEARNING ability, TRANSFORMER models, KNOWLEDGE transfer, PIXELS - Abstract
Self-distillation guides model learning by transferring knowledge from the model itself, which has shown advantages in object segmentation. However, it has been proven that uncertain pixels, with predicted probability close to 0.5, restrict model performance. Existing self-distillation methods cannot guide the model to enhance its learning ability for uncertain pixels, so the improvement is limited. To boost the student model's learning ability for uncertain pixels, a novel self-distillation method is proposed. First, the predicted probability for the current training sample and the ground-truth label are fused to construct the teacher knowledge, as the current predicted information expresses the performance of the student model and represents the uncertainty of pixels more accurately. Second, a quadratic mapping function between the predicted probabilities of the teacher and student models is proposed. Theoretical analysis shows that the proposed method using the mapping function can guide the model to enhance its learning ability for uncertain pixels. Finally, the essential difference of utilizing the predicted probability of the student model in self-distillation is discussed in detail. Extensive experiments were conducted on models with convolutional neural network and Transformer architectures as backbone networks. The results on four public datasets demonstrate that the proposed method effectively improves student model performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
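The scheme described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the fusion weight `alpha`, the `uncertainty_weight` form, and the loss shape are all assumptions, since the abstract only states that the teacher knowledge fuses the current prediction with the ground truth and that a quadratic mapping emphasises uncertain pixels.

```python
import numpy as np

def teacher_knowledge(p_student, y_true, alpha=0.5):
    # Fuse the student's current prediction with the ground-truth label;
    # alpha is a hypothetical fusion weight, not given in the abstract.
    return alpha * y_true + (1 - alpha) * p_student

def uncertainty_weight(p_student):
    # Quadratic emphasis on uncertain pixels: peaks at p = 0.5,
    # vanishes at confident predictions (an illustrative choice).
    return 4.0 * p_student * (1.0 - p_student)

def self_distill_loss(p_student, y_true, eps=1e-7):
    t = teacher_knowledge(p_student, y_true)
    w = uncertainty_weight(p_student)
    p = np.clip(p_student, eps, 1 - eps)
    # Cross-entropy between the student prediction and the teacher
    # knowledge, re-weighted toward uncertain pixels.
    ce = -(t * np.log(p) + (1 - t) * np.log(1 - p))
    return float(np.mean(w * ce))
```

With this weighting, a pixel predicted at 0.5 contributes with full weight while a pixel predicted at 0.9 contributes with weight 0.36, so training pressure concentrates on the uncertain region.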
5. Comparative Analysis of Nucleus Segmentation Techniques for Enhanced DNA Quantification in Propidium Iodide-Stained Samples.
- Author
-
Jónás, Viktor Zoltán, Paulik, Róbert, Molnár, Béla, and Kozlovszky, Miklós
- Subjects
FLOW cytometry, FLUORIMETRY, IMAGE analysis, DATA mining, IMAGE processing - Abstract
Digitization in pathology and cytology labs is now widespread, a significant shift from a decade ago when few doctors used image processing tools. Despite unchanged scanning times due to excitation in fluorescent imaging, advancements in computing power and software have enabled more complex algorithms, yielding better-quality results. This study evaluates three nucleus segmentation algorithms for ploidy analysis using propidium iodide-stained digital WSI slides. Our goal was to improve segmentation accuracy to more closely match DNA histograms obtained via flow cytometry, with the ultimate aim of enhancing the calibration method we proposed in a previous study, which seeks to align image cytometry results with those from flow cytometry. We assessed these algorithms based on raw segmentation performance and DNA histogram similarity, using confusion-matrix-based metrics. Results indicate that modern algorithms perform better, with F1 scores exceeding 0.845, compared to our earlier solution's 0.807, and produce DNA histograms that more closely resemble those from the reference FCM method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Enhancing oil palm segmentation model with GAN-based augmentation.
- Author
-
Kwong, Qi Bin, Kon, Yee Thung, Rusik, Wan Rusydiah W., Shabudin, Mohd Nor Azizi, Rahman, Shahirah Shazana A., Kulaveerasingam, Harikrishna, and Appleton, David Ross
- Subjects
TRANSFORMER models, DATA augmentation, OIL palm, GENERATIVE adversarial networks, TILES, PALMS - Abstract
In digital agriculture, accurate crop detection is fundamental to developing automated systems for efficient plantation management. For oil palm, the main challenge lies in developing robust models that perform well in different environmental conditions. This study addresses the feasibility of using GAN augmentation methods to improve palm detection models. For this purpose, drone images of young palms (< 5 year-old) from eight different estates were collected, annotated, and used to build a baseline detection model based on DETR. StyleGAN2 was trained on the extracted palms and then used to generate a series of synthetic palms, which were then inserted into tiles representing different environments. CycleGAN networks were trained for bidirectional translation between synthetic and real tiles, subsequently utilized to augment the authenticity of synthetic tiles. Both synthetic and real tiles were used to train the GAN-based detection model. The baseline model achieved precision and recall values of 95.8% and 97.2%. The GAN-based model achieved comparable results, with precision and recall values of 98.5% and 98.6%. In challenge dataset 1, consisting of older palms (> 5 year-old), both models also achieved similar accuracies, with the baseline model achieving precision and recall of 93.1% and 99.4%, and the GAN-based model achieving 95.7% and 99.4%. In challenge dataset 2, consisting of storm-affected palms, the baseline model achieved precision of 100% but recall of only 13%. The GAN-based model achieved a significantly better result, with precision and recall values of 98.7% and 95.3%. This result demonstrates that images generated by GANs have the potential to enhance the accuracy of palm detection models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
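The precision/recall pairs reported above can be condensed into F1 scores, which makes the storm-affected gap explicit. Plain arithmetic on the abstract's numbers:

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Challenge dataset 2 (storm-affected palms), values from the abstract:
baseline_f1 = f1(1.00, 0.13)    # perfect precision but very low recall
gan_f1 = f1(0.987, 0.953)
print(round(baseline_f1, 2), round(gan_f1, 2))
```

The harmonic mean punishes the baseline's 13% recall (F1 ≈ 0.23) despite its 100% precision, while the GAN-based model's balanced pair gives F1 ≈ 0.97.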
7. AIDCON: An Aerial Image Dataset and Benchmark for Construction Machinery.
- Author
-
Ersoz, Ahmet Bahaddin, Pekcan, Onur, and Akbas, Emre
- Subjects
MACHINE learning, BUILDING sites, OBJECT recognition (Computer vision), CONSTRUCTION equipment, CONSTRUCTION projects, DEEP learning, PIXELS - Abstract
Applying deep learning algorithms in the construction industry holds tremendous potential for enhancing site management, safety, and efficiency. The development of such algorithms necessitates a comprehensive and diverse image dataset. This study introduces the Aerial Image Dataset for Construction (AIDCON), a novel aerial image collection containing 9563 construction machines across nine categories annotated at the pixel level, carrying critical value for researchers and professionals seeking to develop and refine object detection and segmentation algorithms across various construction projects. The study highlights the benefits of utilizing UAV-captured images by evaluating the performance of five cutting-edge deep learning algorithms—Mask R-CNN, Cascade Mask R-CNN, Mask Scoring R-CNN, Hybrid Task Cascade, and PointRend—on the AIDCON dataset. It underscores the significance of clustering strategies for generating reliable and robust outcomes. The AIDCON dataset's unique aerial perspective aids in reducing occlusions and provides comprehensive site overviews, facilitating better object positioning and segmentation. The findings presented in this paper have far-reaching implications for the construction industry, as they enhance construction site efficiency while setting the stage for future advancements in construction site monitoring and management utilizing remote sensing technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Enhancing oil palm segmentation model with GAN-based augmentation
- Author
-
Qi Bin Kwong, Yee Thung Kon, Wan Rusydiah W. Rusik, Mohd Nor Azizi Shabudin, Shahirah Shazana A. Rahman, Harikrishna Kulaveerasingam, and David Ross Appleton
- Subjects
Oil palm segmentation, GAN, Object detection, Object segmentation, Data augmentation, Vision transformer, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95 - Abstract
Abstract In digital agriculture, accurate crop detection is fundamental to developing automated systems for efficient plantation management. For oil palm, the main challenge lies in developing robust models that perform well in different environmental conditions. This study addresses the feasibility of using GAN augmentation methods to improve palm detection models. For this purpose, drone images of young palms (< 5 year-old) from eight different estates were collected, annotated, and used to build a baseline detection model based on DETR. StyleGAN2 was trained on the extracted palms and then used to generate a series of synthetic palms, which were then inserted into tiles representing different environments. CycleGAN networks were trained for bidirectional translation between synthetic and real tiles, subsequently utilized to augment the authenticity of synthetic tiles. Both synthetic and real tiles were used to train the GAN-based detection model. The baseline model achieved precision and recall values of 95.8% and 97.2%. The GAN-based model achieved comparable results, with precision and recall values of 98.5% and 98.6%. In challenge dataset 1, consisting of older palms (> 5 year-old), both models also achieved similar accuracies, with the baseline model achieving precision and recall of 93.1% and 99.4%, and the GAN-based model achieving 95.7% and 99.4%. In challenge dataset 2, consisting of storm-affected palms, the baseline model achieved precision of 100% but recall of only 13%. The GAN-based model achieved a significantly better result, with precision and recall values of 98.7% and 95.3%. This result demonstrates that images generated by GANs have the potential to enhance the accuracy of palm detection models.
- Published
- 2024
- Full Text
- View/download PDF
9. Hardness-aware loss for object segmentation
- Author
-
Lei Chen, Tieyong Cao, Yunfei Zheng, Yang Wang, Bo Zhang, and Jibin Yang
- Subjects
Loss function, Object segmentation, Hardness value, Uncertainty, Epoch influence, Engineering (General). Civil engineering (General), TA1-2040 - Abstract
In object segmentation, the existence of hard-to-classify pixels limits segmentation performance. Focusing on these hard pixels by assigning different weights to the pixel loss can effectively guide the learning of the segmentation model. Existing loss-weight assignment methods perceive pixel hardness from the current predicted information and pay less attention to past predicted information, although current studies show that samples with less improvement in predicted probability compared to the past are difficult to learn. To define hard pixels more accurately, a hardness-aware loss for object segmentation is proposed. First, the metric of pixel hardness degree is defined, and a mapping function is proposed to quantitatively evaluate the hardness degree, which is defined on the difference between current and past predicted probabilities. Then a new compound metric, the hardness value, is defined based on the hardness degree and the uncertainty. Based on this compound metric, a new loss function is proposed. Experimental results on four datasets using convolutional neural network and Transformer backbone models demonstrate that the proposed method effectively improves the accuracy of object segmentation. In particular, in the segmentation model based on ResNet-50, the proposed method improves mean Intersection over Union (mIoU) by almost 4.3% compared to cross-entropy on the DUT-O dataset.
- Published
- 2024
- Full Text
- View/download PDF
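A rough sketch of a hardness-aware pixel loss in the spirit of the abstract above. The specific `hardness_degree` and `uncertainty` mappings below are illustrative assumptions: the abstract defines the hardness degree on the difference between current and past predicted probabilities, and the uncertainty on closeness to 0.5, but does not give the exact functions.

```python
import numpy as np

def hardness_degree(p_now, p_past):
    # Pixels whose predicted probability improved little (or worsened)
    # relative to past epochs are treated as harder; illustrative form.
    improvement = p_now - p_past
    return np.clip(1.0 - improvement, 0.0, 2.0)

def uncertainty(p_now):
    # Maximal (1.0) when the prediction sits at 0.5, zero at 0 or 1.
    return 1.0 - 2.0 * np.abs(p_now - 0.5)

def hardness_value(p_now, p_past):
    # Compound metric combining hardness degree and uncertainty.
    return hardness_degree(p_now, p_past) * uncertainty(p_now)

def hardness_aware_ce(p_now, p_past, y, eps=1e-7):
    # Pixel-wise cross-entropy re-weighted by the hardness value.
    p = np.clip(p_now, eps, 1 - eps)
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return float(np.mean(hardness_value(p_now, p_past) * ce))
```

A pixel stuck at 0.5 across epochs gets the largest weight; a pixel that moved from 0.4 to 0.6 is both less uncertain and demonstrably learnable, so its weight drops.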
10. Image processing framework for in-process shaft diameter measurement on legacy manual machines.
- Author
-
Choudhari, Sahil J., Singh, Swarit Anand, Kumar, Aitha Sudheer, and Desai, Kaushal A.
- Abstract
In-process dimension measurement is critical to achieving higher productivity and realizing smart manufacturing goals during machining operations. Vision-based systems have significant potential for in-process dimension measurement, reducing human intervention and achieving manufacturing-inspection integration. This paper presents early research on developing a vision-based system for in-process dimension measurement of machined cylindrical components utilizing image-processing techniques. The challenges of in-process dimension measurement are addressed by combining a deep learning-based object detection model, You Only Look Once version 2 (YOLOv2), with image processing algorithms for object localization, segmentation, and spatial pixel estimation. An automated image pixel calibration approach is incorporated to improve algorithm robustness. The image acquisition hardware and the real-time image processing framework are integrated to demonstrate the working of the proposed system through a case study of in-process stepped-shaft diameter measurement. The system implementation on a manual lathe demonstrated robust utility: it eliminated the need for intermittent manual measurements, digitized in-process component dimensions, and improved machining productivity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
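The core of any vision-based dimension measurement like the one above is the pixel-to-millimetre calibration step. A minimal sketch, assuming a reference feature of known width; the function names and values here are hypothetical (the paper automates this calibration):

```python
def mm_per_pixel(known_width_mm, measured_width_px):
    # Spatial scale from a reference feature of known physical size.
    return known_width_mm / measured_width_px

def shaft_diameter_mm(edge_left_px, edge_right_px, scale_mm_per_px):
    # Diameter from the segmented shaft's left/right edge columns.
    return (edge_right_px - edge_left_px) * scale_mm_per_px

scale = mm_per_pixel(20.0, 400)            # a 20 mm reference spans 400 px
diameter = shaft_diameter_mm(120, 620, scale)
print(diameter)                            # 500 px at 0.05 mm/px -> 25 mm
```

Automating the `mm_per_pixel` step (rather than hand-measuring the reference) is what makes the framework robust to camera repositioning between setups.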
11. Nested object detection using mask R-CNN: application to bee and varroa detection.
- Author
-
Kriouile, Yassine, Ancourt, Corinne, Wegrzyn-Wolska, Katarzyna, and Bougueroua, Lamine
- Abstract
In this paper, we address an essential problem in object detection and image processing: detecting objects potentially nested inside other objects. This problem arises particularly in the beekeeping sector: detecting varroa parasites on bees. Beekeepers must monitor the level of infestation of their apiaries by the varroa parasite, which settles on the backs of bees. As far as we know, there is not yet a published approach that handles nested object detection using a single neural network trained on two different datasets. We propose an approach that fills this gap and thereby improves the accuracy and efficiency of the bee and varroa detection task. Our work is based on deep learning, more precisely the Mask R-CNN neural network. Instead of segmenting the detected objects (bees), we segment the internal objects (varroas): we add a branch to Faster R-CNN to segment internal objects, extract relevant features for internal object segmentation, and suggest an efficient method for training the neural network on two different datasets. Our experiments are based on a set of images of bee frames containing annotated bees and varroa mites. Due to differences in occurrence rates, two different sets were created. After carrying out experiments, we ended up with a single neural network capable of detecting two nested objects without decreasing accuracy compared to two separate neural networks. Compared to traditional separate neural networks, our approach improves varroa detection accuracy by 1.9%, reduces infestation-level prediction error by 0.22%, and reduces execution time by 28% and model memory by 23%. In our approach, we extract Res4 features (a layer of the ResNet neural network) for varroa segmentation, which improves detection accuracy by 11% compared to standard FPN extraction. Thus, we suggest a new approach that detects nested objects more accurately than two-separate-network approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Enhancing learning on uncertain pixels in self-distillation for object segmentation
- Author
-
Lei Chen, Tieyong Cao, Yunfei Zheng, Yang Wang, Bo Zhang, and Jibin Yang
- Subjects
Self-distillation, Object segmentation, Uncertain pixel, Current prediction, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64 - Abstract
Abstract Self-distillation guides model learning by transferring knowledge from the model itself, which has shown advantages in object segmentation. However, it has been proven that uncertain pixels, with predicted probability close to 0.5, restrict model performance. Existing self-distillation methods cannot guide the model to enhance its learning ability for uncertain pixels, so the improvement is limited. To boost the student model's learning ability for uncertain pixels, a novel self-distillation method is proposed. First, the predicted probability for the current training sample and the ground-truth label are fused to construct the teacher knowledge, as the current predicted information expresses the performance of the student model and represents the uncertainty of pixels more accurately. Second, a quadratic mapping function between the predicted probabilities of the teacher and student models is proposed. Theoretical analysis shows that the proposed method using the mapping function can guide the model to enhance its learning ability for uncertain pixels. Finally, the essential difference of utilizing the predicted probability of the student model in self-distillation is discussed in detail. Extensive experiments were conducted on models with convolutional neural network and Transformer architectures as backbone networks. The results on four public datasets demonstrate that the proposed method effectively improves student model performance.
- Published
- 2024
- Full Text
- View/download PDF
13. Improved organs at risk segmentation based on modified U‐Net with self‐attention and consistency regularisation.
- Author
-
Manko, Maksym, Popov, Anton, Gorriz, Juan Manuel, and Ramirez, Javier
- Subjects
CHEST (Anatomy), ARTIFICIAL neural networks, COMPUTED tomography, RETINAL blood vessels, IMAGE segmentation, HEART, ESOPHAGUS - Abstract
Cancer is one of the leading causes of death in the world, with radiotherapy as one of the treatment options. Radiotherapy planning starts with delineating the affected area from healthy organs, called organs at risk (OAR). A new approach to automatic OAR segmentation in the chest cavity in Computed Tomography (CT) images is presented. The proposed approach is based on a modified U-Net architecture with a ResNet-34 encoder, which is the baseline adopted in this work. A new two-branch CS-SA U-Net architecture is proposed, consisting of two parallel U-Net models in which self-attention blocks using cosine similarity as the query-key similarity function (CS-SA blocks) are inserted between the encoder and decoder, enabling the use of consistency regularisation. The proposed solution demonstrates state-of-the-art performance for OAR segmentation in CT images on the publicly available SegTHOR benchmark dataset in terms of the Dice coefficient (oesophagus 0.8714, heart 0.9516, trachea 0.9286, aorta 0.9510) and Hausdorff distance (oesophagus 0.2541, heart 0.1514, trachea 0.1722, aorta 0.1114), and significantly outperforms the baseline. The approach is demonstrated to be viable for improving the quality of OAR segmentation for radiotherapy planning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
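The Dice coefficients quoted above are computed between binary masks; a minimal NumPy sketch of the standard definition (not code from the paper):

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    # Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|).
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
# |A ∩ B| = 2, |A| = 3, |B| = 3, so Dice = 4/6 ≈ 0.667
print(round(dice(a, b), 3))
```

The `eps` term keeps the metric defined when both masks are empty, a common convention in segmentation evaluation.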
14. Video Object Tracking Algorithm Based on Dual-Branch Online Optimization and Feature Fusion.
- Author
-
Li Xinpeng, Wang Peng, Li Xiaoyan, Sun Mengyu, Chen Zuntian, and Gao Hui
- Subjects
TRACKING algorithms, ELECTRONIC equipment, TEST reliability, RELIABILITY in engineering, RESEARCH institutes - Abstract
Copyright of Chinese Journal of Liquid Crystal & Displays is the property of Chinese Journal of Liquid Crystal & Displays and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
15. Indoor Point Cloud Object Segmentation Based on Direction Encoding and Dilated Sampling.
- Author
-
Li Peng, Chen Xijiang, Zhao Bufan, Xuan Wei, and Deng Hui
- Abstract
Copyright of Journal of Computer-Aided Design & Computer Graphics / Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao is the property of Gai Kan Bian Wei Hui and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
16. Detection of the farmland plow areas using RGB-D images with an improved YOLOv5 model.
- Author
-
Jiangtao Ji, Zhihao Han, Kaixuan Zhao, Qianwen Li, and Shucan Du
- Subjects
AGRICULTURAL equipment, CONTOURS (Cartography), VISUAL fields, COMPUTATIONAL complexity, FARM tractors, CAMERAS - Abstract
Recognition of the boundaries of farmland plow areas has an important guiding role in the operation of intelligent agricultural equipment. To precisely recognize these boundaries, a detection method for unmanned tractor plow areas based on RGB-Depth (RGB-D) cameras was proposed, and the feasibility of the detection method was analyzed. This method applied advanced computer vision technology to the field of agricultural automation. Adopting and improving the YOLOv5-seg object segmentation algorithm, first, the Convolutional Block Attention Module (CBAM) was integrated into the Concentrated-Comprehensive Convolution Block (C3) to form C3CBAM, thereby enhancing the ability of the network to extract features from plow areas. The GhostConv module was also utilized to reduce parameter count and computational complexity. Second, using the depth image information provided by the RGB-D camera combined with the results recognized by the YOLOv5-seg model, the mask image was processed to extract contour boundaries, align the contours with the depth map, and obtain the boundary distance information of the plowed area. Finally, based on farmland information, the calculated average boundary distance was corrected, further improving the accuracy of the distance measurements. The experimental results showed that the YOLOv5-seg object segmentation algorithm achieved a recognition accuracy of 99% for plowed areas and that the ranging accuracy improved with decreasing detection distance. The ranging error at 5.5 m was approximately 0.056 m, and the average detection time per frame was 29 ms, which meets real-time operational requirements. The results of this study can provide precise guarantees for the autonomous operation of unmanned plowing units. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. FIRE DETECTION USING SURVEILLANCE SYSTEMS.
- Author
-
Mahmoud, Hanan Samir
- Subjects
TELEVISION in security systems, IMAGE segmentation, FIRE prevention, IMAGE processing, VIDEO surveillance - Abstract
This research presents a video-based system to detect fire in real time, taking advantage of existing surveillance systems for fire detection either inside or outside buildings, under different illumination and in short- or long-distance surveillance scenes. Fire detection with surveillance cameras offers early detection and rapid response, and information about the progress of the fire can be obtained through live video; a vision-based system is also capable of providing forensic evidence. The basic idea of the research is video-based fire detection in which Fourier descriptors are used to describe reddish moving objects. The proposed system detects reddish moving bodies in every frame and correlates the detections with the same reddish bodies over time. Multi-threshold segmentation is used to divide the image; thresholding is one of the most common ways to segment an image, and it can be combined with pre-processing and post-processing. The stage after segmentation obtains the reddish-body features: the feature is created by obtaining the contour of the reddish body and estimating its normalized Fourier descriptors. If the reddish body contour's Fourier descriptors vary from frame to frame, one can predict fire. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
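Normalized Fourier descriptors of a contour, as used above to characterise reddish moving bodies, can be sketched with an FFT over the contour treated as a complex signal. A minimal illustration; the normalisation below (drop the DC term for translation invariance, divide by the fundamental's magnitude for scale invariance, keep magnitudes to discard rotation/start-point phase) is one common convention, not necessarily the paper's:

```python
import numpy as np

def fourier_descriptors(contour, n_desc=8):
    # Treat the (N, 2) contour as a complex signal z = x + iy.
    z = contour[:, 0] + 1j * contour[:, 1]
    F = np.fft.fft(z)
    # F[0] encodes translation, so it is dropped; magnitudes discard
    # rotation and start-point phase; dividing by the first magnitude
    # removes scale.
    mags = np.abs(F[1:n_desc + 1])
    return mags / mags[0]

# A square contour: descriptors should not change under translation/scaling.
square = np.array([[0, 0], [1, 0], [2, 0], [2, 1],
                   [2, 2], [1, 2], [0, 2], [0, 1]], float)
d1 = fourier_descriptors(square)
d2 = fourier_descriptors(square * 3.0 + 10.0)  # scaled and shifted copy
print(np.allclose(d1, d2))
```

This invariance is exactly what lets the system compare the same reddish body across frames: a steady object yields stable descriptors, while flame contours flicker and make the descriptors vary.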
18. Convex Segments for Convex Objects Using DNN Boundary Tracing and Graduated Optimization
- Author
-
Pal, Jimut B., Awate, Suyash P., Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Linguraru, Marius George, editor, Dou, Qi, editor, Feragen, Aasa, editor, Giannarou, Stamatia, editor, Glocker, Ben, editor, Lekadir, Karim, editor, and Schnabel, Julia A., editor
- Published
- 2024
- Full Text
- View/download PDF
19. Loci-Segmented: Improving Scene Segmentation Learning
- Author
-
Traub, Manuel, Becker, Frederic, Sauter, Adrian, Otte, Sebastian, Butz, Martin V., Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Wand, Michael, editor, Malinovská, Kristína, editor, Schmidhuber, Jürgen, editor, and Tetko, Igor V., editor
- Published
- 2024
- Full Text
- View/download PDF
20. Investigating Neural Networks and Transformer Models for Enhanced Comic Decoding
- Author
-
Kouletou, Eleanna, Papavassiliou, Vassilis, Katsouros, Vassilis, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mouchère, Harold, editor, and Zhu, Anna, editor
- Published
- 2024
- Full Text
- View/download PDF
21. Study of the patterns of variations in ice lakes and the factors influencing these changes on the southeastern Tibetan plateau
- Author
-
Yu Mingwei, Li Feng, Guo Yonggang, Su Libin, and Qin Deshun
- Subjects
GEE, Ice lake change, Object segmentation, Southeast Tibet, Climate change, Science (General), Q1-390, Social sciences (General), H1-99 - Abstract
The ice lakes in the southeastern Qinghai–Tibet Plateau have exhibited a pronounced expansion against the backdrop of global warming, consequently amplifying the local risk of ice lake outburst disasters. However, surveys of ice lake changes in the entire region have consistently been incomplete due to the prevalent high cloud density. On the basis of Landsat remote sensing images and the Google Earth Engine (GEE) cloud computing platform, in this study, the full convolution segmentation algorithm is utilized to accurately and comprehensively map the regional distribution of ice lakes in southeastern Tibet at consistent time intervals in 1993, 2008, and 2023. Furthermore, the formation, distribution, and dynamic changes in these ice lakes are investigated. The numbers of ice lakes discovered in 1993, 2008, and 2023 were 2520, 3198, and 3877, respectively. These lakes covered areas of approximately 337.64 ± 36.86 km², 363.92 ± 40.90 km², and 395.74 ± 22.72 km², respectively. These ice lakes are located primarily between altitudes of 4442 m and 4909 m. The total area experienced an annual growth rate of approximately 0.57% from 1993 to 2023. In the present study, the long-term variations in ice lakes in each district and county are examined. These findings indicate that between 1993 and 2023, the expansion of ice lakes was more pronounced in regions with a large number of marine glaciers. Notably, Basu County presented the highest annual growth rate of the ice lake population, at 6.23%, followed by Bomi County, at 4.28%, and finally, Zayul County, at 2.94%. The accelerated shrinkage of marine glaciers induced by global warming is the primary driver behind the expansion of ice lakes. The results obtained from this research will enhance our overall understanding of the complex dynamics and mechanisms that govern the formation of ice lakes while also offering valuable perspectives on the potential risks linked to their expansion in this particular area.
- Published
- 2024
- Full Text
- View/download PDF
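The roughly 0.57% annual growth rate in total area reported above is consistent with a simple linear rate relative to the 1993 baseline; a quick check on the abstract's numbers:

```python
area_1993 = 337.64   # km^2, from the abstract
area_2023 = 395.74   # km^2
years = 2023 - 1993

# Linear (simple) annual growth rate relative to the 1993 area.
annual_rate = (area_2023 - area_1993) / area_1993 / years
print(round(100 * annual_rate, 2))  # ≈ 0.57 % per year
```

A compound (geometric) rate over the same span would come out slightly lower (about 0.53%/yr), so the paper's figure matches the linear convention.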
22. The Segmentation Tracker With Mask-Guided Background Suppression Strategy
- Author
-
Erlin Tian, Yunpeng Lei, Junfeng Sun, Keyan Zhou, Bin Zhou, and Hanfei Li
- Subjects
Object tracking, Siamese network, object segmentation, background interference, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
Segmentation-based tracking is currently a promising tracking paradigm with pixel-wise information. However, the lack of structural constraints makes it difficult to maintain excellent performance in the presence of background interference. Therefore, we propose a segmentation tracker with a mask-guided background suppression strategy. First, a mask-aware module is designed to generate more accurate target masks. With the guidance of the regression loss, features that are sensitive only to the target region are selected from shallow features that contain more spatial information. Structural information is introduced and background clutter in the backbone features is suppressed, which enhances the reliability of the target segmentation. Second, a mask-guided template suppression module is constructed to improve the feature representation. The generated mask, with clear target contours, can be used to filter background noise, which increases the distinction between foreground and background. Finally, an adaptive spatiotemporal context constraint strategy is proposed to aid target localization. The strategy learns a region probability matrix from the object mask of the previous frame, which is used to constrain the contextual information in the search region of the current frame. Benefiting from this strategy, our method effectively suppresses similar distractors in the search region and achieves robust tracking. Broad experiments on five challenging benchmarks, including VOT2016, VOT2018, VOT2019, OTB100, and TC128, indicate that the proposed tracker performs stably under complex tracking backgrounds.
- Published
- 2024
- Full Text
- View/download PDF
23. Infrared Ship Segmentation Based on Weakly-Supervised and Semi-Supervised Learning
- Author
-
Isa Ali Ibrahim, Abdallah Namoun, Sami Ullah, Hisham Alasmary, Muhammad Waqas, and Iftekhar Ahmad
- Subjects
Infrared ship images, object segmentation, weakly-supervised learning, semi-supervised learning, pixel-level pseudo-labels, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
Existing fully-supervised semantic segmentation methods achieve good performance but rely on high-quality pixel-level labels. To minimize annotation costs, weakly-supervised or semi-supervised methods have been proposed. When such methods are applied to infrared ship image segmentation, inaccurate object localization occurs, leading to poor segmentation results. In this paper, we propose an infrared ship segmentation (ISS) method based on weakly-supervised and semi-supervised learning, aiming to improve ISS performance by combining the advantages of the two learning paradigms. It uses only image-level labels and a minimal number of pixel-level labels to segment different classes of infrared ships. Our proposed method includes three steps. First, we designed a dual-branch localization network based on ResNet50 to generate ship localization maps. Second, we trained a saliency network with minimal pixel-level labels and many localization maps to obtain ship saliency maps, then optimized the saliency maps with conditional random fields and combined them with image-level labels to generate pixel-level pseudo-labels. Finally, we trained the segmentation network with these pixel-level pseudo-labels to obtain the final segmentation results. Experimental results on an infrared ship dataset collected at real sites indicate that the proposed method achieves 71.18% mean intersection over union, up to 56.72% and 8.75% higher than state-of-the-art weakly-supervised and semi-supervised methods, respectively.
- Published
- 2024
- Full Text
- View/download PDF
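The abstract above combines optimized saliency maps with image-level labels to generate pixel-level pseudo-labels. A minimal sketch of that combination step (the thresholds, the single-ship-class assumption, and the ignore index are illustrative; the paper's CRF refinement is omitted):

```python
import numpy as np

def make_pseudo_labels(saliency, image_classes, fg_thresh=0.7, bg_thresh=0.3, ignore_id=255):
    """Turn a saliency map plus image-level labels into pixel-level pseudo-labels.
    Confident background -> 0, confident foreground -> the image's class id,
    everything in between -> ignore_id (excluded from the training loss)."""
    labels = np.full(saliency.shape, ignore_id, dtype=np.uint8)
    labels[saliency < bg_thresh] = 0
    if image_classes:  # assumption: one ship class per image
        labels[saliency > fg_thresh] = image_classes[0]
    return labels

sal = np.array([[0.1, 0.5],
                [0.9, 0.8]])
pseudo = make_pseudo_labels(sal, [3])  # one background, one ignored, two class-3 pixels
```

Pixels left at the ignore index would simply be masked out of the segmentation loss during training.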
24. YOLO-Based Tree Trunk Types Multispectral Perception: A Two-Genus Study at Stand-Level for Forestry Inventory Management Purposes
- Author
-
Daniel Queiros da Silva, Filipe Neves Dos Santos, Vitor Filipe, Armando Jorge Sousa, and E. J. Solteiro Pires
- Subjects
Deep learning ,forest inventory ,multispectral imaging ,object detection ,object segmentation ,tree trunk types ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Stand-level perception and identification of forest tree species are needed for monitoring-related operations and are crucial for better biodiversity and inventory management in forested areas. This paper contributes to this knowledge domain by researching multispectral perception of tree trunk types at stand level. YOLOv5 and YOLOv8, convolutional neural networks specialized in object detection and segmentation, were trained to detect and segment tree trunks of two genera (pine and eucalyptus) using datasets collected in a forest region in Portugal. The datasets comprise only these two categories, corresponding to the two genera; they were manually annotated for object detection and segmentation with RGB and RGB-NIR images, and are publicly available. The “Small” variant of YOLOv8 was the best model at the detection and segmentation tasks, achieving an F1 measure above 87% and 62%, respectively. The findings of this study suggest that using an extended spectrum covering the visible and near-infrared ranges produces superior results. The trained models can be integrated into forest tractors and robots to monitor forest genera across different spectra, assisting forest managers in controlling their forest stands.
- Published
- 2024
- Full Text
- View/download PDF
25. Synchronizing Object Detection: Applications, Advancements and Existing Challenges
- Author
-
Md. Tanzib Hosain, Asif Zaman, Mushfiqur Rahman Abir, Shanjida Akter, Sawon Mursalin, and Shadman Sakeeb Khan
- Subjects
Object detection ,image recognition ,object segmentation ,semantic detection ,image classification ,object tracking ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
From pivotal roles in autonomous vehicles, healthcare diagnostics, and surveillance systems to seamlessly integrating with augmented reality, object detection algorithms stand as the cornerstone in unraveling the complexities of the visual world. Tracing the trajectory from conventional region-based methods to the latest neural network architectures reveals a technological renaissance where algorithms metamorphose into digital artisans. However, this journey is not without hurdles, prompting researchers to grapple with real-time detection, robustness in varied environments, and interpretability amidst the intricacies of deep learning. The allure of addressing issues such as occlusions, scale variations, and fine-grained categorization propels exploration into uncharted territories, beckoning the scholarly community to contribute to an ongoing saga of innovation and discovery. This research offers a comprehensive panorama, encapsulating the applications reshaping our digital reality, the advancements pushing the boundaries of perception, and the open issues extending an invitation to the next generation of visionaries to explore uncharted frontiers within object detection.
- Published
- 2024
- Full Text
- View/download PDF
26. Comparative Analysis of Nucleus Segmentation Techniques for Enhanced DNA Quantification in Propidium Iodide-Stained Samples
- Author
-
Viktor Zoltán Jónás, Róbert Paulik, Béla Molnár, and Miklós Kozlovszky
- Subjects
digital pathology ,cytometry ,image analysis ,object segmentation ,fluorescence ,ploidy ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Digitization in pathology and cytology labs is now widespread, a significant shift from a decade ago when few doctors used image processing tools. Despite unchanged scanning times due to excitation in fluorescent imaging, advancements in computing power and software have enabled more complex algorithms, yielding better-quality results. This study evaluates three nucleus segmentation algorithms for ploidy analysis using propidium iodide-stained digital whole-slide images (WSIs). Our goal was to improve segmentation accuracy to more closely match DNA histograms obtained via flow cytometry, with the ultimate aim of enhancing the calibration method we proposed in a previous study, which seeks to align image cytometry results with those from flow cytometry. We assessed these algorithms based on raw segmentation performance and DNA histogram similarity, using confusion-matrix-based metrics. Results indicate that modern algorithms perform better, with F1 scores exceeding 0.845, compared to our earlier solution’s 0.807, and produce DNA histograms that more closely resemble those from the reference flow cytometry (FCM) method.
- Published
- 2024
- Full Text
- View/download PDF
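The comparison above ranks segmentation algorithms by confusion-matrix-based metrics such as the F1 score (the reported 0.845 vs. 0.807). A worked example with hypothetical pixel counts:

```python
def f1_from_confusion(tp, fp, fn):
    """F1 score from confusion-matrix counts: the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical pixel counts for one nucleus segmentation result
score = f1_from_confusion(tp=845, fp=120, fn=190)
print(round(score, 3))  # 0.845
```

Equivalently, F1 = 2TP / (2TP + FP + FN), which makes clear that true negatives (the abundant background pixels) do not inflate the score.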
27. AIDCON: An Aerial Image Dataset and Benchmark for Construction Machinery
- Author
-
Ahmet Bahaddin Ersoz, Onur Pekcan, and Emre Akbas
- Subjects
construction machinery ,image dataset ,unmanned aerial vehicle ,deep learning ,object segmentation ,Science - Abstract
Applying deep learning algorithms in the construction industry holds tremendous potential for enhancing site management, safety, and efficiency. The development of such algorithms necessitates a comprehensive and diverse image dataset. This study introduces the Aerial Image Dataset for Construction (AIDCON), a novel aerial image collection containing 9563 construction machines across nine categories annotated at the pixel level, of critical value to researchers and professionals seeking to develop and refine object detection and segmentation algorithms across various construction projects. The study highlights the benefits of utilizing UAV-captured images by evaluating the performance of five cutting-edge deep learning algorithms (Mask R-CNN, Cascade Mask R-CNN, Mask Scoring R-CNN, Hybrid Task Cascade, and PointRend) on the AIDCON dataset. It underscores the significance of clustering strategies for generating reliable and robust outcomes. The AIDCON dataset’s unique aerial perspective aids in reducing occlusions and provides comprehensive site overviews, facilitating better object positioning and segmentation. The findings presented in this paper have far-reaching implications for the construction industry, as they enhance construction site efficiency while setting the stage for future advancements in construction site monitoring and management utilizing remote sensing technologies.
- Published
- 2024
- Full Text
- View/download PDF
28. A Novel Multi-Data-Augmentation and Multi-Deep-Learning Framework for Counting Small Vehicles and Crowds.
- Author
-
Tsai, Chun-Ming and Shih, Frank Y.
- Subjects
- *
TRAFFIC monitoring , *DRONE aircraft , *DATA augmentation , *CROWDS , *COUNTING , *TELECOMMUNICATION systems - Abstract
Counting small pixel-sized vehicles and crowds in unmanned aerial vehicles (UAV) images is crucial across diverse fields, including geographic information collection, traffic monitoring, item delivery, communication network relay stations, as well as target segmentation, detection, and tracking. This task poses significant challenges due to factors such as varying view angles, non-fixed drone cameras, small object sizes, changing illumination, object occlusion, and image jitter. In this paper, we introduce a novel multi-data-augmentation and multi-deep-learning framework designed for counting small vehicles and crowds in UAV images. The framework harnesses the strengths of specific deep-learning detection models, coupled with the convolutional block attention module and data augmentation techniques. Additionally, we present a new method for detecting cars, motorcycles, and persons with small pixel sizes. Our proposed method undergoes evaluation on the test dataset v2 of the 2022 AI Cup competition, where we secured the first place on the private leaderboard by achieving the highest harmonic mean. Subsequent experimental results demonstrate that our framework outperforms the existing YOLOv7-E6E model. We also conducted comparative experiments using the publicly available VisDrone datasets, and the results show that our model outperforms the other models with the highest AP50 score of 52%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
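The 2022 AI Cup entry above was ranked by the highest harmonic mean; the abstract does not state over which quantities it is taken, so the per-class scores below are hypothetical. The harmonic mean penalizes a single weak class more strongly than the arithmetic mean does:

```python
def harmonic_mean(scores):
    """Harmonic mean of positive scores: n divided by the sum of reciprocals."""
    return len(scores) / sum(1.0 / s for s in scores)

# Hypothetical per-class scores for cars, motorcycles, and persons
hm = harmonic_mean([0.8, 0.6, 0.9])
print(round(hm, 4))  # 0.7448, below the arithmetic mean of 0.7667
```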
29. A Review of Machine Learning and Deep Learning for Object Detection, Semantic Segmentation, and Human Action Recognition in Machine and Robotic Vision.
- Author
-
Manakitsa, Nikoleta, Maraslidis, George S., Moysis, Lazaros, and Fragulis, George F.
- Subjects
COMPUTER vision ,OBJECT recognition (Computer vision) ,DEEP learning ,HUMAN activity recognition ,MACHINE learning ,IMAGE recognition (Computer vision) - Abstract
Machine vision, an interdisciplinary field that aims to replicate human visual perception in computers, has experienced rapid progress and significant contributions. This paper traces the origins of machine vision, from early image processing algorithms to its convergence with computer science, mathematics, and robotics, resulting in a distinct branch of artificial intelligence. The integration of machine learning techniques, particularly deep learning, has driven its growth and adoption in everyday devices. This study focuses on the objectives of computer vision systems: replicating human visual capabilities including recognition, comprehension, and interpretation. Notably, image classification, object detection, and image segmentation are crucial tasks requiring robust mathematical foundations. Despite the advancements, challenges persist, such as clarifying terminology related to artificial intelligence, machine learning, and deep learning. Precise definitions and interpretations are vital for establishing a solid research foundation. The evolution of machine vision reflects an ambitious journey to emulate human visual perception. Interdisciplinary collaboration and the integration of deep learning techniques have propelled remarkable advancements in emulating human behavior and perception. Through this research, the field of machine vision continues to shape the future of computer systems and artificial intelligence applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Siamese refine polar mask prediction network for visual tracking.
- Author
-
Pu, Bin, Xiang, Ke, Liu, Ze'an, and Wang, Xuanyin
- Abstract
Visual tracking is a classical research problem and recently tracking with mask prediction has been a popular task in tracking research. Many trackers add a pixel-wise segmentation subnetwork behind the original bounding box tracker to get the target's mask. These two-stage methods need to crop the target region after finding its location and extract deep features for segmentation redundantly. This paper proposes an anchor-free Siamese Refine Polar Mask (SiamRPM) prediction network for visual tracking, which can obtain the target's mask directly. Similar to bounding box regression, we use polar mask regression to get the target's convex hull mask. To further adjust the contour points, we propose to employ a cascaded refinement module. The mask contours are iteratively shifted using the offset outputs of the refinement module. Comprehensive experiments on visual tracking benchmark datasets illustrate that our SiamRPM can achieve competitive results with a real-time running speed. Our method provides an effective contour-based pipeline for the tracking and segmentation task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Edge-assisted Object Segmentation Using Multimodal Feature Aggregation and Learning.
- Author
-
Jianbo Li, Genji Yuan, and Zheng Yang
- Subjects
IMAGE fusion - Abstract
Object segmentation aims to perfectly identify objects embedded in the surrounding environment and has a wide range of applications. Most previous methods of object segmentation only use RGB images and ignore geometric information from disparity images. Making full use of heterogeneous data from different devices has proved to be a very effective strategy for improving segmentation performance. The key challenge of the multimodal fusion-based object segmentation task lies in the learning, transformation, and fusion of multimodal information. In this article, we focus on the transformation of disparity images and the fusion of multimodal features. We develop a multimodal fusion object segmentation framework, termed the Hybrid Fusion Segmentation Network (HFSNet). Specifically, HFSNet contains three key components, i.e., disparity convolutional sparse coding (DCSC), asymmetric dense projection feature aggregation (ADPFA), and multimodal feature fusion (MFF). The DCSC is designed based on convolutional sparse coding. It not only has better interpretability but also preserves the key geometric information of the object. ADPFA is designed to enhance texture and geometric information to fully exploit nonadjacent features. MFF is used to perform multimodal feature fusion. Extensive experiments show that our HFSNet outperforms existing state-of-the-art models on two challenging datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Orchard monitoring based on unmanned aerial vehicles and image processing by artificial neural networks: a systematic review.
- Author
-
Popescu, Dan, Ichim, Loretta, and Stoican, Florin
- Subjects
ARTIFICIAL neural networks ,DRONE aircraft ,IMAGE processing ,ORCHARDS ,FRUIT quality ,TREE diseases & pests ,DIGITAL image processing - Abstract
Orchard monitoring is a vital direction of scientific research and practical application for increasing fruit production under ecological conditions. Recently, owing to technological development and decreasing equipment costs, the use of unmanned aerial vehicles and artificial intelligence algorithms for image acquisition and processing has achieved tremendous progress in orchard monitoring. This paper highlights new research trends in orchard monitoring, emphasizing neural networks, unmanned aerial vehicles (UAVs), and various concrete applications. For this purpose, papers on complex topics, identified by combining keywords from the field, were selected and analyzed. In particular, the review considered papers published between 2017 and 2022 on the use of neural networks (an important exponent of artificial intelligence in image processing and understanding) and UAVs in orchard monitoring and production evaluation. Due to their complexity, the characteristics of UAV trajectories and flights over orchard areas were highlighted. The structure and implementations of the latest neural network systems used in such applications, the databases, the software, and the obtained performances are systematically analyzed. To offer suggestions to researchers and end users, the use of these new concepts and their implementations was surveyed in concrete applications, such as a) identification and segmentation of orchards, trees, and crowns; b) detection of tree diseases, harmful insects, and pests; c) evaluation of fruit production; and d) evaluation of development conditions. To show the necessity of this review, a comparison is made in the end with related review articles. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Unraveling False Positives in Unsupervised Defect Detection Models: A Study on Anomaly-Free Training Datasets.
- Author
-
Qiu, Ji, Shi, Hongmei, Hu, Yuhen, and Yu, Zujun
- Subjects
- *
FALSE alarms , *INTRUSION detection systems (Computer security) , *INDUSTRIAL applications , *INSPECTION & review - Abstract
Unsupervised defect detection methods have garnered substantial attention in industrial defect detection owing to their capacity to circumvent complex fault sample collection. However, these models grapple with establishing a robust boundary between normal and abnormal conditions in intricate scenarios, leading to a heightened frequency of false-positive predictions. Spurious alerts increase the reconfirmation workload and impede the widespread adoption of unsupervised anomaly detection models in industrial applications. To this end, we exploit the only data source available to unsupervised defect detection models, the unsupervised training dataset, and introduce the False Alarm Identification (FAI) method, aimed at learning the distribution of potential false alarms using anomaly-free images. It exploits a multi-layer perceptron to capture the semantic information of potential false alarms at the object level from a detector trained on anomaly-free training images. During the testing phase, the FAI model operates as a post-processing module applied after the baseline detection algorithm. The FAI algorithm determines from its semantic features whether each positive patch predicted by the normalizing flow algorithm is a false alarm; when a positive prediction is identified as a false alarm, the corresponding pixel-wise predictions are set to negative. The effectiveness of the FAI method is demonstrated with two state-of-the-art normalizing flow algorithms on extensive industrial applications. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
34. Sequential interactive image segmentation.
- Author
-
Lin, Zheng, Zhang, Zhao, Zhu, Zi-Yue, Fan, Deng-Ping, and Liu, Xia-Lei
- Subjects
ANNOTATIONS - Abstract
Interactive image segmentation (IIS) is an important technique for obtaining pixel-level annotations. In many cases, target objects share similar semantics. However, IIS methods neglect this connection and in particular the cues provided by representations of previously segmented objects, previous user interaction, and previous prediction masks, which can all provide suitable priors for the current annotation. In this paper, we formulate a sequential interactive image segmentation (SIIS) task for minimizing user interaction when segmenting sequences of related images, and we provide a practical approach to this task using two pertinent designs. The first is a novel interaction mode. When annotating a new sample, our method can automatically propose an initial click proposal based on previous annotation. This dramatically helps to reduce the interaction burden on the user. The second is an online optimization strategy, with the goal of providing semantic information when annotating specific targets, optimizing the model with dense supervision from previously labeled samples. Experiments demonstrate the effectiveness of regarding SIIS as a particular task, and our methods for addressing it. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. Accurate drone corner position estimation in complex backgrounds with boundary classification
- Author
-
Yu-Shiuan Tsai, Cheng-Sheng Lin, and Guan-Yi Li
- Subjects
Boundary classification ,Deep learning ,Object segmentation ,Channel frame detection ,YOLACT ,Science (General) ,Q1-390 ,Social sciences (General) ,H1-99 - Abstract
This study develops an efficient approach for precise channel frame detection in complex backgrounds, addressing the critical need for accurate drone navigation. Leveraging YOLACT and group regression, our method outperforms conventional techniques that rely solely on color information. We conducted extensive experiments involving channel frames placed at various angles and within intricate backgrounds, training the algorithm to recognize them effectively. The process involves initial edge detection, noise reduction through binarization and erosion, segmentation of channel frame line segments using the Hough Transform algorithm, and subsequent classification via the K-means algorithm. Finally, we obtain a regression line for each group through linear regression, enabling precise positioning by identifying their intersection points. Experiments confirm the robustness of our approach across diverse angles and challenging backgrounds, marking a significant advance for UAV applications.
- Published
- 2024
- Full Text
- View/download PDF
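In the pipeline above, K-means groups the Hough line segments and linear regression yields one line per group; a corner is then the intersection of two regression lines. One standard way to compute that intersection (not necessarily the authors' implementation) uses homogeneous coordinates:

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two 2D points: the cross product of their
    homogeneous coordinates (x, y, 1)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersect(l1, l2):
    """Intersection of two homogeneous lines; None if they are (near) parallel."""
    x = np.cross(l1, l2)
    if abs(x[2]) < 1e-9:
        return None
    return (float(x[0] / x[2]), float(x[1] / x[2]))

# Two regression lines, y = x and y = -x + 2, meet at the corner (1, 1)
l1 = line_through((0, 0), (2, 2))
l2 = line_through((0, 2), (2, 0))
corner = intersect(l1, l2)
print(corner)  # (1.0, 1.0)
```

The homogeneous form handles vertical lines without special-casing infinite slope, which matters for frames viewed at steep angles.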
36. Visual Object Segmentation Improvement Using Deep Convolutional Neural Networks
- Author
-
Kanithan, S., Vignesh, N. Arun, Karthick SA, Hamdan, Allam, Editorial Board Member, Al Madhoun, Wesam, Editorial Board Member, Alareeni, Bahaaeddin, Editor-in-Chief, Baalousha, Mohammed, Editorial Board Member, Elgedawy, Islam, Editorial Board Member, Hussainey, Khaled, Editorial Board Member, Eleyan, Derar, Editorial Board Member, Hamdan, Reem, Editorial Board Member, Salem, Mohammed, Editorial Board Member, Jallouli, Rim, Editorial Board Member, Assaidi, Abdelouahid, Editorial Board Member, Nawi, Noorshella Binti Che, Editorial Board Member, AL-Kayid, Kholoud, Editorial Board Member, Wolf, Martin, Editorial Board Member, El Khoury, Rim, Editorial Board Member, Kumar, Ashish, editor, Jain, Rachna, editor, Vairamani, Ajantha Devi, editor, and Nayyar, Anand, editor
- Published
- 2023
- Full Text
- View/download PDF
37. Human Activity Recognition in Video Sequences Based on the Integration of Optical Flow and Appearance of Human Objects
- Author
-
Kushwaha, Arati, Khare, Ashish, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Muthusamy, Hariharan, editor, Botzheim, János, editor, and Nayak, Richi, editor
- Published
- 2023
- Full Text
- View/download PDF
38. A Novel Autoencoder for Task-Driven Object Segmentation
- Author
-
Jiang, Weijie, Cai, Yuxiang, Yu, Yuanlong, Chen, Rong, Zhang, Jianglong, Zheng, Weitao, Su, Renjie, Wu, Xi, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Sun, Fuchun, editor, Li, Jianmin, editor, Liu, Huaping, editor, and Chu, Zhongyi, editor
- Published
- 2023
- Full Text
- View/download PDF
39. DeepTemplates: Object Segmentation Using Shape Templates
- Author
-
Maheshwari, Nikhar, Ramola, Gaurav, Velusamy, Sudha, Kini, Raviprasad Mohan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gupta, Deep, editor, Bhurchandi, Kishor, editor, Murala, Subrahmanyam, editor, Raman, Balasubramanian, editor, and Kumar, Sanjeev, editor
- Published
- 2023
- Full Text
- View/download PDF
40. Sequential interactive image segmentation
- Author
-
Zheng Lin, Zhao Zhang, Zi-Yue Zhu, Deng-Ping Fan, and Xia-Lei Liu
- Subjects
interactive segmentation ,user interaction ,object segmentation ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Interactive image segmentation (IIS) is an important technique for obtaining pixel-level annotations. In many cases, target objects share similar semantics. However, IIS methods neglect this connection and in particular the cues provided by representations of previously segmented objects, previous user interaction, and previous prediction masks, which can all provide suitable priors for the current annotation. In this paper, we formulate a sequential interactive image segmentation (SIIS) task for minimizing user interaction when segmenting sequences of related images, and we provide a practical approach to this task using two pertinent designs. The first is a novel interaction mode. When annotating a new sample, our method can automatically propose an initial click proposal based on previous annotation. This dramatically helps to reduce the interaction burden on the user. The second is an online optimization strategy, with the goal of providing semantic information when annotating specific targets, optimizing the model with dense supervision from previously labeled samples. Experiments demonstrate the effectiveness of regarding SIIS as a particular task, and our methods for addressing it.
- Published
- 2023
- Full Text
- View/download PDF
41. Automated Infield Grapevine Inflorescence Segmentation Based on Deep Learning Models †.
- Author
-
Moreira, Germano, Magalhães, Sandro Augusto, dos Santos, Filipe Neves, and Cunha, Mário
- Subjects
- *
DEEP learning , *GRAPES , *INFLORESCENCES , *PRODUCTION scheduling , *VITICULTURE , *COMPUTER vision - Abstract
Yield forecasting is of immeasurable value in modern viticulture to optimize harvest scheduling and quality management. Traditionally, this task is carried out through manual and destructive sampling of production components and their accurate assessment is expensive, time-consuming, and error-prone, resulting in erroneous projections. The number of inflorescences and flowers per vine is one of the main components and serves as an early predictor. The adoption of new non-invasive technologies can automate this task and drive viticulture yield forecasting to higher levels of accuracy. In this study, different Single Stage Instance Segmentation models from the state-of-the-art You Only Look Once (YOLO) family, such as YOLOv5 and YOLOv8, were benchmarked on a dataset of RGB images for grapevine inflorescence detection and segmentation, with the aim of validating and subsequently implementing the solution for counting the number of inflorescences and flowers. All models obtained promising results, with the YOLOv8s and the YOLOv5s models standing out with an F1-Score of 95.1% and 97.7% for the detection and segmentation tasks, respectively. Moreover, the low inference times obtained demonstrate the models' ability to be deployed in real-time applications, allowing for non-destructive predictions in uncontrolled environments. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. Multi-Scale Indoor Scene Geometry Modeling Algorithm Based on Segmentation Results.
- Author
-
Wang, Changfa, Yao, Tuo, and Yang, Qinghua
- Subjects
POINT cloud ,ALGORITHMS ,GEOMETRIC analysis ,SURFACE structure ,CLASSIFICATION algorithms ,GEOMETRIC modeling - Abstract
Due to the numerous objects with regular structures in indoor environments, identifying and modeling the regular objects in scenes aids indoor robots in sensing unknown environments. Typically, point cloud preprocessing can produce highly complete object segmentation results, which can serve as the objects for geometric analysis and modeling, thus ensuring modeling accuracy and speed. However, in the absence of a complete object model, segmented objects cannot be recognized and modeled through matching methods. To achieve a greater understanding of scene point clouds, this paper proposes a direct geometric modeling algorithm based on segmentation results, which focuses on extracting regular geometries in the scene rather than objects with geometric details or combinations of multiple primitives, using simpler geometric models to describe the corresponding point cloud data. By fully utilizing the surface structure information of segmented objects, the paper analyzes the types of faces and their relationships to classify regular geometric objects into two categories: planar and curved. Different types of geometric objects are fitted using random sample consensus (RANSAC) algorithms with the type classification results as prior knowledge, and segmented results are modeled by combining size information from oriented bounding boxes. For indoor scenes with occlusion and stacking, such a higher-level semantic expression can effectively simplify the scene, complete scene abstraction and structural modeling, and aid indoor robots' understanding and further operation in unknown environments. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
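The abstract above fits geometric primitives to segmented point clouds with RANSAC. A minimal RANSAC plane fit, as a sketch rather than the paper's full type-aware pipeline:

```python
import numpy as np

def ransac_plane(points, n_iters=200, inlier_tol=0.05, rng=None):
    """Minimal RANSAC plane fit: repeatedly fit a plane to 3 random points and
    keep the candidate with the most inliers. Returns ((unit normal, d), mask)
    with the plane defined by n . p + d = 0."""
    rng = np.random.default_rng(rng)
    best_mask, best_model = None, None
    for _ in range(n_iters):
        a, b, c = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(b - a, c - a)
        norm = np.linalg.norm(n)
        if norm < 1e-12:          # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n.dot(a)
        dist = np.abs(points @ n + d)
        mask = dist < inlier_tol
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask, best_model = mask, (n, d)
    return best_model, best_mask

# Synthetic "segmented object": a z = 0 plane plus a few stray points above it
gen = np.random.default_rng(0)
plane = np.column_stack([gen.uniform(-1, 1, (100, 2)), np.zeros(100)])
outliers = gen.uniform(-1, 1, (10, 3)) + [0, 0, 2]
model, mask = ransac_plane(np.vstack([plane, outliers]), rng=0)
print(int(mask.sum()))  # 100: all plane points accepted, all strays rejected
```

In the paper's setting the face-type classification would decide whether to run a planar or a curved-primitive fit; only the planar case is shown here.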
43. An Image Quality Improvement Method in Side-Scan Sonar Based on Deconvolution.
- Author
-
Liu, Jia, Pang, Yan, Yan, Lengleng, and Zhu, Hanhao
- Subjects
- *
DECONVOLUTION (Mathematics) , *SONAR , *ACOUSTIC imaging , *SIGNAL-to-noise ratio , *PROBLEM solving , *IMAGE compression , *OCEAN bottom - Abstract
Side-scan sonar (SSS) is an important underwater imaging method that offers high resolution and convenient use. However, owing to the restrictions of conventional pulse compression technology, the sidelobe of the side-scan sonar beam in the range direction is relatively high, which degrades image definition and contrast. When working in a shallow-water environment, image quality is especially affected by strong bottom reverberation or other targets on the seabed. To solve this problem, a method for image-quality improvement based on deconvolution is proposed herein. In this method, a deconvolution algorithm is employed to improve conventional pulse compression, increasing the range resolution and lowering the sidelobe. In simulation, we analyzed the algorithm's tolerance to different signal-to-noise ratios (SNRs) and its ability to resolve multiple targets. Furthermore, the proposed method was applied to real underwater data. The experimental results showed that the quality of underwater acoustic imaging could be effectively improved: the SNR and contrast ratio (CR) improved by 32% and 12.5%, respectively. Target segmentation results based on this method are also shown, and segmentation accuracy was effectively improved. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
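The method above improves on conventional pulse compression with a deconvolution algorithm to raise range resolution and lower sidelobes. The specific algorithm is not given in the abstract; a generic frequency-domain Wiener deconvolution illustrates the idea:

```python
import numpy as np

def wiener_deconvolve(received, pulse, snr=100.0):
    """Frequency-domain Wiener deconvolution of a received trace by the
    transmitted pulse replica (a generic stand-in for the paper's method).
    Larger `snr` approaches inverse filtering: lower sidelobes, but more
    noise amplification."""
    n = len(received)
    R = np.fft.fft(received, n)
    P = np.fft.fft(pulse, n)       # zero-padded replica spectrum
    H = np.conj(P) / (np.abs(P) ** 2 + 1.0 / snr)   # Wiener filter
    return np.real(np.fft.ifft(R * H))

# A chirp-like pulse and an echo delayed by 40 samples
t = np.arange(64)
pulse = np.sin(0.02 * t * t)
received = np.roll(np.pad(pulse, (0, 192)), 40)
est = wiener_deconvolve(received, pulse)
print(int(np.argmax(np.abs(est))))  # 40: the echo collapses to a sharp peak
```

Compared with matched-filter pulse compression (correlating with the replica), the Wiener filter divides out the pulse spectrum where it is strong, which is what compresses the range sidelobes.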
44. Fine segmentation and difference-aware shape adjustment for category-level 6DoF object pose estimation.
- Author
-
Liu, Chongpei, Sun, Wei, Liu, Jian, Zhang, Xing, Fan, Shimeng, and Fu, Qiang
- Subjects
SINGLE-degree-of-freedom systems ,AUGMENTED reality ,SUBTRACTION (Mathematics) - Abstract
Six-degree-of-freedom (6DoF) object pose estimation is a critical task for robot manipulation, autonomous vehicles, and augmented reality. Category-level 6DoF object pose estimation is trending because it can generalize to unknown objects of the same category. However, existing mean-shape-based methods do not account for the fact that predicting the shape adjustment requires modeling shape differences, so they still suffer from shape variations among same-category objects, limiting their accuracy. Moreover, existing methods overlook the importance of object segmentation to 6DoF pose estimation and use an RGB-based object segmentation method with low accuracy. To address these problems, we propose difference-aware shape adjustment and RGB-D feature fusion-based object segmentation for category-level 6DoF object pose estimation. The proposed method encodes shape differences, improving mean shape adjustment and alleviating same-category shape variations. Specifically, a difference-aware shape adjustment network (DASAN) is proposed to model shape differences between the object instance and the mean shape by feature subtraction with an attention mechanism. We also propose an RGB-D feature fusion-based object segmentation method that uses a coarse-to-fine framework: a 2D detector and a novel RGB-D feature fusion-based binary classification network for coarse and fine segmentation, respectively. Experiments on two well-known datasets demonstrate the proposed method's state-of-the-art (SOTA) pose estimation accuracy. In addition, we conduct comparative experiments on the latest dataset (Wild6D) and a self-collected dataset (OBJECTS) and achieve high accuracies, demonstrating the strong generalizability of the proposed method. Finally, we apply the proposed method to unknown object grasping, demonstrating its practicability. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
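The abstract does not detail DASAN's internals; as a rough illustration of the feature-subtraction-with-attention idea it names, here is a minimal NumPy sketch. The function names and the softmax-based attention over feature dimensions are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adjust_mean_shape(mean_shape_feat, instance_feat):
    """Adjust mean-shape features toward the instance using an
    attention-weighted feature difference (illustrative sketch)."""
    diff = instance_feat - mean_shape_feat   # explicit shape difference
    attn = softmax(diff, axis=-1)            # attention from the difference
    return mean_shape_feat + attn * diff     # difference-aware adjustment
```

When the instance features equal the mean-shape features, the difference is zero and the mean shape passes through unchanged, which matches the intuition that a perfectly average instance needs no adjustment.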
45. Self-distillation object segmentation via pyramid knowledge representation and transfer.
- Author
-
Zheng, Yunfei, Sun, Meng, Wang, Xiaobing, Cao, Tieyong, Zhang, Xiongwei, Xing, Lixing, and Fang, Zheng
- Subjects
- *
KNOWLEDGE representation (Information theory) , *KNOWLEDGE transfer , *PYRAMIDS , *SOURCE code , *NETWORK performance - Abstract
Self-distillation methods transfer knowledge within the network itself to enhance its generalization ability. However, owing to the lack of spatially refined knowledge representations, current self-distillation methods can hardly be applied directly to object segmentation tasks. In this paper, we propose a novel self-distillation framework via pyramid knowledge representation and transfer for the object segmentation task. Firstly, a lightweight inference network is built to perform pixel-wise prediction rapidly. Secondly, a novel self-distillation method is proposed. To derive refined pixel-wise knowledge representations, an auxiliary self-distillation network with multi-level pyramid representation branches is built and appended to the inference network. A synergy distillation loss, which utilizes top-down and consistency knowledge transfer paths, forces more discriminative knowledge to be distilled into the inference network. Consequently, the performance of the inference network is improved. Experimental results on five object segmentation datasets demonstrate that the proposed self-distillation method helps our inference network achieve better segmentation effectiveness and efficiency than nine recent object segmentation networks. Furthermore, the proposed self-distillation method outperforms typical self-distillation methods. The source code is publicly available at https://github.com/xfflyer/SKDforSegmentation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
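As a toy sketch of a consistency-style distillation term across pyramid levels, the following compares the inference network's map, downsampled to each level, with the auxiliary branch maps. The average pooling, MSE loss choice, and names are assumptions for illustration; the paper's synergy distillation loss also includes a top-down transfer path not shown here.

```python
import numpy as np

def avg_pool(x, k):
    """Average-pool a square map by factor k (naive reshape trick)."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def pyramid_consistency_loss(student_map, branch_maps):
    """Mean squared error between the inference-network prediction
    (downsampled to each pyramid level) and each auxiliary branch map.
    branch_maps[i] is assumed to have stride 2**i."""
    total = 0.0
    for i, branch in enumerate(branch_maps):
        s = avg_pool(student_map, 2 ** i) if i > 0 else student_map
        total += np.mean((s - branch) ** 2)
    return total / len(branch_maps)
```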
46. Automatic robot Manoeuvres detection using computer vision and deep learning techniques: a perspective of internet of robotics things (IoRT).
- Author
-
Mahajan, Hemant B., Uke, Nilesh, Pise, Priya, Shahade, Makarand, Dixit, Vandana G., Bhavsar, Swapna, and Deshpande, Sarita D.
- Subjects
ROBOTIC path planning ,INTERNET of things ,DEEP learning ,COMPUTER vision ,ROBOT motion ,ROBOTS ,LONG short-term memory - Abstract
To minimize impediments in real-time Internet of Things (IoT)-enabled robotics applications, this study demonstrates how to build and deploy a novel framework using computer vision and deep learning. In contrast to robotic path planning algorithms based on geolocation, we focus on sensor-captured streams/images and geographical information to enable the Internet of Robotic Things (IoRT) to evolve. The application collects real-time data from moving robots in various situations and at various intervals and uses it for research projects. The data, collected as videos/images, are delivered to the robotics application via visual sensor nodes. Anticipating a moving robot's actions early can aid in issuing commands to monitor and regulate the robot's future activities before they occur. To do so, we propose a framework using efficient computer vision techniques and a deep learning classifier. The computer vision methods handle frame quality improvement, object segmentation, and feature estimation. A Long Short-Term Memory (LSTM) classifier detects robot motions automatically from initial sequential features. We designed the proposed model around an LSTM classifier to perform early prediction from the initial sequential features of partial video frames and to overcome the problems of exploding and vanishing gradients. The LSTM helps reduce the prediction duration while maintaining higher accuracy. It also enables the central system of a robotic application to prevent collisions caused by impediments in indoor or outdoor settings. Simulation results on publicly available research datasets demonstrate the proposed model's efficiency and robustness compared to state-of-the-art approaches. The overall accuracy of the proposed model improved by approximately 5%, and computational complexity was reduced by approximately 84%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
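The early-prediction idea, deciding from a prefix of the frame sequence instead of waiting for the whole clip, can be sketched as a loop with a confidence threshold. Here `step_fn` is a hypothetical stand-in for the recurrent (LSTM) cell and is not the paper's classifier; the threshold value is likewise an assumption.

```python
import numpy as np

def early_predict(frame_feats, step_fn, threshold=0.9):
    """Feed per-frame features one at a time and stop as soon as the
    running classifier is confident enough. `step_fn` takes a feature
    and a carried state and returns (new_state, class_probabilities)."""
    state, probs = None, None
    for t, feat in enumerate(frame_feats, start=1):
        state, probs = step_fn(feat, state)
        if probs.max() >= threshold:
            return int(probs.argmax()), t   # early decision at frame t
    return int(probs.argmax()), t           # fell through: full sequence
```

Returning the frame index alongside the label makes it easy to measure how much of the sequence was actually needed, which is the quantity behind the reported reduction in prediction duration.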
47. Object detection and segmentation by composition of fast fuzzy C-mean clustering based maps.
- Author
-
Nawaz, Mehmood, Qureshi, Rizwan, Teevno, Mansoor Ali, and Shahid, Ali Raza
- Abstract
The extraction of salient objects from a cluttered background without any prior knowledge is a challenging task in salient object detection and segmentation. A salient object can be detected from the uniqueness, rarity, or unpredictability of the salient regions in an image. However, an object with a similar color appearance may exhibit a marginal visual divergence that is difficult even for the human eye to recognize. In this paper, we propose a technique that composes and fuses fast fuzzy c-mean (FFCM) clustering saliency maps to separate the salient object from the background. Specifically, we first generate maps using FFCM clustering, each containing specific parts of the salient region, and then compose them using the Porter–Duff composition method. Outliers in the extracted salient regions are removed with a morphological technique in a post-processing step. To extract the final map from the initially constructed blended maps, we use a fused mask, which is the composite of a color prior, a location prior, and a frequency prior. Experimental results on six public datasets (MSRA, THUR-15000, MSRA-10K, HKU-IS, DUT-OMRON, and SED) clearly show the efficiency of the proposed method for images with noisy backgrounds. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
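The Porter–Duff composition the abstract mentions is a standard family of alpha-compositing operators; a minimal sketch of the "over" operator on grayscale maps follows. Treating each saliency map as a value layer with its own alpha is an assumption for illustration, not necessarily how the paper weights its maps.

```python
import numpy as np

def porter_duff_over(src, src_a, dst, dst_a):
    """Porter-Duff 'over': composite src on top of dst.
    All inputs are same-shaped arrays in [0, 1]; *_a are alphas."""
    out_a = src_a + dst_a * (1.0 - src_a)
    num = src * src_a + dst * dst_a * (1.0 - src_a)
    # Avoid division by zero where the composite is fully transparent.
    out = np.divide(num, out_a, out=np.zeros_like(num), where=out_a > 0)
    return out, out_a
```

A fully opaque source layer (alpha 1 everywhere) completely hides the destination, which is the expected "over" behavior.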
48. Recent Advances in Intelligent Processing of Satellite Video: Challenges, Methods, and Applications
- Author
-
Shengyang Li, Xian Sun, Yanfeng Gu, Yixuan Lv, Manqi Zhao, Zhuang Zhou, Weilong Guo, Yuhan Sun, Han Wang, and Jian Yang
- Subjects
Deep learning (DL) ,object detection ,object segmentation ,object tracking ,scene classification ,super-resolution ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Intelligent processing of satellite video focuses on extracting specific information about ground objects and scenes from earth observation videos through intelligent image/video processing technology, with important applications in fields such as traffic monitoring, resource monitoring, and environmental monitoring. The integration of deep learning technology into satellite video processing has led to significant advances in tasks such as object detection and object tracking, and has expanded into emerging research areas such as satellite video scene classification and object segmentation. However, there has been no comprehensive review of intelligent satellite video processing. This article presents a systematic review and quantitative analysis of the results published over the last decade, intending to further promote the development of intelligent processing tasks for satellite video. It analyzes the current difficulties, challenges, and methodological system for each task. In addition, it provides an in-depth analysis and summary of publicly available datasets and evaluation benchmarks for each task, as well as classic algorithm performance and application scenarios. Finally, this article summarizes the current research status and looks forward to future development trends, hoping to inspire researchers in related fields and jointly promote the development of intelligent processing of satellite video.
- Published
- 2023
- Full Text
- View/download PDF
49. Tools development to optimize the use of micro-drones for architectural cultural heritage survey
- Author
-
Andrea Tomalini, Edoardo Pristeri, and Jacopo Bono
- Subjects
uas ,vpl ,autopilot photogrammetric survey ,mask r-cnn ,object segmentation ,Architecture ,NA1-9428 ,Architectural drawing and design ,NA2695-2793 - Abstract
In view of the increasingly widespread use of inoffensive UAS for photogrammetric acquisitions in the architectural and infrastructural spheres, there is a need to program flight missions suited to the operator's needs. This contribution presents the results of two experiments conducted by the research group. The first proposed procedure, based on low-cost instrumentation and algorithms in a VPL environment, fills the gap left by proprietary applications and allows the coding and customisation of flight missions for photogrammetry. Clean photogrammetric data are not always easy to obtain: immovable or unforeseen obstacles lead to lengthy post-production of the photogrammetric cloud to remove them. The second procedure fills this gap by constructing an object segmentation framework that automatically processes photogrammetric images, creating masks that remove unwanted objects from the dense cloud calculation. Despite some shortcomings, the results are promising and compensate for them, at least in part. DOI: https://doi.org/10.20365/disegnarecon.29.2022.16
- Published
- 2022
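As a rough illustration of how per-image segmentation masks can exclude unwanted objects before a dense-cloud computation, the sketch below zeroes out masked pixels. The function name and the zero-fill convention are assumptions; a real photogrammetry pipeline would more likely pass the mask to the reconstruction software rather than overwrite pixel values.

```python
import numpy as np

def mask_out_objects(image, masks):
    """Zero out pixels covered by any unwanted-object mask so they can
    be ignored downstream. `image` is (H, W, C); each mask is a boolean
    (H, W) array marking pixels to remove."""
    keep = np.ones(image.shape[:2], dtype=bool)
    for m in masks:
        keep &= ~m          # drop pixels flagged by this mask
    return image * keep[..., None]
```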
50. Border Ownership, Category Selectivity and Beyond
- Author
-
Chen, Tianlong, Cheng, Xuemei, Tsao, Thomas, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bebis, George, editor, Li, Bo, editor, Yao, Angela, editor, Liu, Yang, editor, Duan, Ye, editor, Lau, Manfred, editor, Khadka, Rajiv, editor, Crisan, Ana, editor, and Chang, Remco, editor
- Published
- 2022
- Full Text
- View/download PDF