1,206 results for "Aerial images"
Search Results
2. Estimation of corn crop damage caused by wildlife in UAV images.
- Author
-
Aszkowski, Przemysław, Kraft, Marek, Drapikowski, Pawel, and Pieczyński, Dominik
- Subjects
- CONVOLUTIONAL neural networks, TRANSFORMER models, DRONE aircraft, IMAGE segmentation, CORN
- Abstract
Purpose: This paper proposes a low-cost and low-effort solution for determining the area of corn crops damaged by wildlife, utilising field images collected by an unmanned aerial vehicle (UAV). The proposed solution allows for the determination of the percentage of damaged crops and their location. Methods: The method utilises image segmentation models based on deep convolutional neural networks (e.g., the UNet family) and transformers (SegFormer) trained on over 300 hectares of diverse corn fields in western Poland. A range of neural network architectures was tested to select the most accurate final solution. Results: The tests show that despite using only easily accessible RGB data available from inexpensive, consumer-grade UAVs, the method achieves sufficient accuracy to be applied in practical solutions for agriculture-related tasks, as the IoU (Intersection over Union) metric for segmentation of healthy and damaged crops reaches 0.88. Conclusion: The proposed method allows for easy calculation of the total percentage and visualisation of the corn crop damage. The processing code and trained model are shared publicly. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
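The IoU figure quoted in this abstract can be made concrete. A minimal sketch of how per-class IoU is typically computed over segmentation masks (this is the standard metric, not the authors' released code; the array shapes and class labels are illustrative):

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray, cls: int) -> float:
    """Intersection over Union for one class over integer label masks."""
    p = pred == cls
    t = target == cls
    inter = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    return float(inter / union) if union else float("nan")

# Toy 2x2 masks: class 1 = damaged crop, class 0 = healthy crop
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(iou(pred, target, cls=1))  # 1 px overlap / 2 px union = 0.5
```

A reported IoU of 0.88 means predicted and reference damage regions overlap in 88% of their combined area.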
3. An enhanced and expanded Toolbox for River Velocimetry using Images from Aircraft (TRiVIA).
- Author
-
Legleiter, Carl J. and Kinzel, Paul J.
- Subjects
- PARTICLE image velocimetry, FLOW velocity, STREAMFLOW, VECTOR spaces, WATER management, RIVER channels
- Abstract
Detailed, accurate information on flow patterns in river channels can improve understanding of habitat conditions, geomorphic processes, and potential hazards to help inform water management. Data describing flow patterns in river channels can be obtained efficiently via image‐based techniques that have become more widely used in recent years as the number of platforms for acquiring images has expanded and the number of algorithms for inferring velocities has grown. Image‐based techniques have been incorporated into various software packages, including the Toolbox for River Velocimetry using Images from Aircraft (TRiVIA). TRiVIA is a freely available, standalone computer program that provides a comprehensive workflow for performing particle image velocimetry (PIV)‐based analyses within a graphical interface. This paper summarizes major enhancements incorporated into the latest release of TRiVIA, version 2.1. For example, a new Tool for Input Parameter Selection (TIPS) provides guidance for specifying key inputs to the PIV algorithm by allowing users to explore relationships between flow velocity, pixel size, output vector spacing, and frame interval. Improved visualization capabilities include the ability to create streamlines and display PIV output on an interactive web map. The program now provides greater flexibility for importing field data in various formats and selecting which observations to use for accuracy assessment. The most substantial additions to TRiVIA 2.1 are the ability to integrate bathymetric information with image‐derived velocity estimates to calculate river discharge and to use images acquired from moving aircraft to efficiently map long segments of large rivers to support habitat assessment, contaminant transport studies, and a range of other applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
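The TIPS tool described above guides input selection by relating flow velocity, pixel size, and frame interval. The underlying relation is the standard PIV one (a generic sketch, not TRiVIA-specific code; the numbers are illustrative):

```python
def piv_velocity(displacement_px: float, pixel_size_m: float,
                 frame_interval_s: float) -> float:
    """Surface velocity implied by a PIV pixel displacement, in m/s:
    ground distance travelled between frames divided by the frame interval."""
    return displacement_px * pixel_size_m / frame_interval_s

# A feature shifting 8 px between frames, 0.05 m ground pixels, frames 0.5 s apart:
print(piv_velocity(8, 0.05, 0.5))  # 0.8 m/s
```

Inverting this relation is what lets a user pick a frame interval that keeps expected displacements within the PIV interrogation window for a given pixel size and flow speed.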
4. Building rooftop extraction from aerial imagery using low complexity UNet variant models.
- Author
-
Ramalingam, Avudaiammal, Srivastava, Vandita, George, Sam V, Alagala, Swarnalatha, and Manickam, Martin Leo
- Subjects
- ROOFTOP construction, REMOTE-sensing images, FEATURE selection, COMPUTATIONAL complexity, SPINE
- Abstract
Rooftops retrieved from satellite images have numerous applications, but the diversity and complexity of building structures make their extraction challenging. This work proposes to extract building rooftops using two low-complexity DL models: UNet-AstPPD and UNetVasyPPD. The UNet-AstPPD model enhances feature selection by incorporating Atrous Spatial Pyramid Pooling into the UNet's decoder. The UNetVasyPPD integrates a VGG-based backbone in the encoder and Asymmetrical Pyramidal Pooling into the decoder section of the UNet architecture, exhibiting lower computational complexity. The outcomes demonstrate that the Accuracy and Dice Loss of UNet-AstPPD are better. The proposed models' training times are just 25.44 minutes and 29.23 minutes, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. AID-YOLO: An Efficient and Lightweight Network Method for Small Target Detector in Aerial Images.
- Author
-
Li, Yuwen, Zheng, Jiashuo, Li, Shaokun, Wang, Chunxi, Zhang, Zimu, and Zhang, Xiujian
- Subjects
- OBJECT recognition (Computer vision), COST functions, FEATURE extraction, COMPUTER vision, INFRARED imaging
- Abstract
The progress of object detection technology is crucial for obtaining extensive scene information from aerial perspectives based on computer vision. However, aerial image detection presents many challenges, such as large image background sizes, small object sizes, and dense distributions. This research addresses the specific challenges of small object detection in aerial images and proposes an improved YOLOv8s-based detector named Aerial Images Detector-YOLO (AID-YOLO). Specifically, this study adopts the Generalized Efficient Layer Aggregation Network (GELAN) from YOLOv9 as a reference and designs a four-branch skip-layer connection and split operation module, Re-parameterization Net with Cross-Stage Partial (CSP) and Efficient Layer Aggregation Networks (RepNCSPELAN4), to achieve a lightweight network while capturing richer feature information. To fuse multi-scale features and focus more on the target detection regions, a new multi-channel feature extraction module named Convolutional Block Attention Module with Two Convolutions Efficient Layer Aggregation Networks (C2FCBAM) is designed in the neck of the network. In addition, to reduce sensitivity to the position bias of small objects, a new weight-adaptive loss function, the Normalized Weighted Distance Complete Intersection over Union loss (NWD-CIoU_Loss), was designed in this study. We evaluate the proposed AID-YOLO method through ablation experiments and comparisons with other advanced models on the VEDAI (512, 1024) and DOTAv1.0 datasets. The results show that, compared to the YOLOv8s baseline model, AID-YOLO improves the mAP@0.5 metric by 7.36% on the VEDAI dataset. Simultaneously, the parameters are reduced by 31.7%, achieving a good balance between accuracy and parameter count. The Average Precision (AP) for small objects improved by 8.9% compared to the baseline model (YOLOv8s), making it one of the top performers among all compared models. Furthermore, the method's frame rate (FPS) is well-suited for real-time detection in aerial image scenarios. The AID-YOLO method also demonstrates excellent performance on infrared images in the VEDAI1024 (IR) dataset, with a 2.9% improvement in the mAP@0.5 metric. We further validate the superior detection and generalization performance of AID-YOLO in multi-modal and multi-task scenarios through comparisons with other methods on different resolution images and the SODA-A and DOTAv1.0 datasets. In summary, the results of this study confirm that the AID-YOLO method significantly improves detection performance while maintaining a reduced number of parameters, making it applicable to practical engineering tasks in aerial image object detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
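The NWD term in this loss is, by its naming, related to the normalized (Gaussian) Wasserstein distance commonly used for tiny-object detection, where each box is modeled as a 2-D Gaussian so that the similarity degrades smoothly rather than dropping to zero when small boxes barely overlap. A hedged sketch of that distance as usually defined (an assumption about the general technique, not this paper's exact formulation; the constant `C` is a dataset-dependent normalizer):

```python
import math

def nwd(box1, box2, C: float = 12.8) -> float:
    """Normalized Wasserstein similarity between two (cx, cy, w, h) boxes,
    each modeled as a 2-D Gaussian N([cx, cy], diag((w/2)^2, (h/2)^2)).
    Returns 1.0 for identical boxes, decaying toward 0 as they diverge."""
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    # Squared 2nd-order Wasserstein distance between the two Gaussians
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / C)

# Identical boxes give similarity 1.0; small shifts decay smoothly
print(nwd((10, 10, 4, 4), (10, 10, 4, 4)))  # 1.0
```

Unlike IoU, this stays informative for tiny boxes whose pixel overlap is zero, which is the "position bias" sensitivity the abstract refers to.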
6. DFS-DETR: Detailed-Feature-Sensitive Detector for Small Object Detection in Aerial Images Using Transformer.
- Author
-
Cao, Xinyu, Wang, Hanwei, Wang, Xiong, and Hu, Bin
- Subjects
- OBJECT recognition (Computer vision), FEATURE extraction, DEEP learning, TRANSFORMER models, ENVIRONMENTAL monitoring
- Abstract
Object detection in aerial images plays a crucial role across diverse domains such as agriculture, environmental monitoring, and security. Aerial images present several challenges, including dense small objects, intricate backgrounds, and occlusions, necessitating robust detection algorithms. This paper addresses the critical need for accurate and efficient object detection in aerial images using a Transformer-based approach enhanced with specialized methodologies, termed DFS-DETR. The core framework leverages RT-DETR-R18, integrating the Cross Stage Partial Reparam Dilation-wise Residual Module (CSP-RDRM) to optimize feature extraction. Additionally, the introduction of the Detail-Sensitive Pyramid Network (DSPN) enhances sensitivity to local features, complemented by the Dynamic Scale Sequence Feature-Fusion Module (DSSFFM) for comprehensive multi-scale information integration. Moreover, Multi-Attention Add (MAA) is utilized to refine feature processing, which enhances the model's capacity for understanding and representation by integrating various attention mechanisms. To improve bounding box regression, the model employs MPDIoU with normalized Wasserstein distance, which accelerates convergence. Evaluation across the VisDrone2019, AI-TOD, and NWPU VHR-10 datasets demonstrates significant improvements in the mean average precision (mAP) values: 24.1%, 24.0%, and 65.0%, respectively, surpassing RT-DETR-R18 by 2.3%, 4.8%, and 7.0%, respectively. Furthermore, the proposed method achieves real-time inference speeds. This approach can be deployed on drones to perform real-time ground detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Robust and Discriminative Feature Learning via Mutual Information Maximization for Object Detection in Aerial Images.
- Author
-
Sun, Xu, Yu, Yinhui, and Cheng, Qing
- Subjects
- DRONE aircraft, LEARNING strategies, WEATHER, DETECTORS, NUISANCES
- Abstract
Object detection in unmanned aerial vehicle (UAV) aerial images has become increasingly important in military and civil applications. General object detection models are not robust enough against interclass similarity and intraclass variability of small objects, or against UAV-specific nuisances such as uncontrolled weather conditions. Unlike previous approaches focusing on high-level semantic information, we report the importance of underlying features for improving detection accuracy and robustness from an information-theoretic perspective. Specifically, we propose a robust and discriminative feature learning approach through mutual information maximization (RD-MIM), which can be integrated into numerous object detection methods for aerial images. Firstly, we present a rank sample mining method to reduce underlying feature differences between the natural image domain and the aerial image domain. Then, we design a momentum contrast learning strategy to pull object features of the same category together and push those of different categories apart. Finally, we construct a transformer-based global attention mechanism to boost object location semantics by leveraging the high interrelation of different receptive fields. We conduct extensive experiments on the VisDrone and Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) datasets to prove the effectiveness of the proposed method. The experimental results show that our approach brings considerable robustness gains to basic detectors and advanced detection methods, achieving relative growth rates of 51.0% and 39.4% in corruption robustness, respectively. Our code is available at (accessed on 2 August 2024). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. A direct geolocation method for aerial imaging surveys of invasive plants.
- Author
-
Rodriguez III, R., Jenkins, D. M., Leary, J., and Perroy, R.
- Abstract
A software tool was developed to extract the positions of features identified in individual undistorted images acquired by unmanned aircraft systems (UAS), to support operations to locate and control invasive organisms in sensitive natural habitats. The tool determines a feature's position based on selected pixels and metadata in the image. Accuracy tests were performed using test imagery obtained from different camera altitudes (30, 40 and 50 m above ground level) and orientations (0°, 15° and 30° from the vertical) for direct geolocation of features of interest. As an additional case study, the tool was integrated with a deep neural network (DNN) for simultaneous detection and geolocation of the invasive tree Miconia calvescens in natural landscapes on Hawaii Island. For vertical camera orientations, median horizontal position errors were below 5 m. Images from oblique camera views resulted in larger median errors, approaching 10 m. While numerous approaches have been used to improve locational accuracy of objects identified in aerial imagery, the presented tool can be applied directly to individual images collected with commercial off-the-shelf UAS and flight planning software, making it accessible for applications in natural resource management. Furthermore, the tool does not require accurate ground control points, so it is especially suitable for locating invasive plants in dynamic natural landscapes with extensive forest canopy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
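The tool's core computation, turning a selected pixel plus image metadata into a ground position, can be sketched for the simplest case the abstract tests (a vertical, nadir-pointing camera). This is an illustrative pinhole model under flat-terrain, north-aligned-image assumptions, not the authors' implementation; the focal length and image size below are made up:

```python
def pixel_to_ground_offset(u: float, v: float, img_w: int, img_h: int,
                           altitude_m: float, focal_px: float):
    """Horizontal ground offset (east, north) in metres of pixel (u, v)
    from the point directly beneath a nadir-pointing camera.
    Assumes flat terrain, no lens distortion, image top edge facing north."""
    dx = (u - img_w / 2) / focal_px * altitude_m   # metres east of nadir
    dy = (img_h / 2 - v) / focal_px * altitude_m   # metres north of nadir
    return dx, dy

# 4000x3000 image, 3000 px focal length, 30 m altitude:
# a pixel 300 px right of centre maps 3 m east of the camera's ground point
print(pixel_to_ground_offset(2300, 1500, 4000, 3000, 30.0, 3000.0))  # (3.0, 0.0)
```

The model also hints at why the authors' oblique views (15° and 30°) show larger errors: tilting the camera stretches the ground footprint by a tangent term, so the same pixel error maps to a larger horizontal error.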
9. Automated detection of sugarcane crop lines from UAV images using deep learning
- Author
-
João Batista Ribeiro, Renato Rodrigues da Silva, Jocival Dantas Dias, Mauricio Cunha Escarpinati, and André Ricardo Backes
- Subjects
- Crop line, Aerial Images, CNN, UAV, Precision agriculture, Agriculture (General) S1-972, Information technology T58.5-58.64
- Abstract
UAVs (Unmanned Aerial Vehicles) have become increasingly popular in the agricultural sector, promoting and enabling the application of aerial image monitoring in both scientific and business contexts. Images captured by UAVs are fundamental for precision farming practices. They enable better crop planning, input estimation, early identification and correction of sowing failures, and more efficient irrigation systems, among other tasks. Since all these activities deal with low- or medium-altitude images, automated identification of crop lines plays a crucial role in improving these tasks. We address the problem of detecting and segmenting crop lines. We use a Convolutional Neural Network to segment the images, labeling their regions as crop lines or unplanted soil. We also evaluated three traditional semantic segmentation networks: U-Net, LinkNet, and PSPNet. We compared each network on four segmentation datasets provided by an expert. We also assessed whether the networks' output requires a post-processing step to improve the segmentation. Results demonstrate the efficiency and feasibility of these networks in the proposed task.
- Published
- 2024
- Full Text
- View/download PDF
10. Classification and Early Detection of Solar Panel Faults with Deep Neural Network Using Aerial and Electroluminescence Images.
- Author
-
Jaybhaye, Sangita, Sirvi, Vishal, Srivastava, Shreyansh, Loya, Vaishnav, Gujarathi, Varun, and Jaybhaye, M. D.
- Subjects
- ARTIFICIAL neural networks, IMAGE recognition (Computer vision), SOLAR panels, ELECTROLUMINESCENCE, IMAGE analysis, DEEP learning
- Abstract
This paper presents an innovative approach to detect solar panel defects early, leveraging distinct datasets comprising aerial and electroluminescence (EL) images. The decision to employ separate datasets with different models signifies a strategic choice to harness the unique strengths of each imaging modality. Aerial images provide comprehensive surface-level insights, while electroluminescence images offer valuable information on internal defects. By using these datasets with specialized models, the study aims to improve defect detection accuracy and reliability. The research explores the effectiveness of modified deep learning models, including DenseNet121 and MobileNetV3, for analyzing aerial images, and introduces a customized architecture and EfficientNetV2B2 models for electroluminescence image analysis. Results indicate promising accuracies for DenseNet121 (93.75%), MobileNetV3 (93.26%), ELFaultNet (customized architecture) (91.62%), and EfficientNetV2B2 (81.36%). This study's significance lies in its potential to transform solar panel maintenance practices, enabling early defect identification and subsequent optimization of energy production. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Enhancing the ability of convolutional neural networks for remote sensing image segmentation using transformers.
- Author
-
Barr, Mohammad
- Subjects
- CONVOLUTIONAL neural networks, TRANSFORMER models, COMPUTER vision, DEEP learning, REMOTE sensing
- Abstract
The segmentation of remote sensing images has emerged as a compelling undertaking in computer vision owing to its use in the development of several applications. The U-Net architecture has been extensively utilized in many image segmentation applications, yielding remarkable results. Nevertheless, U-Net has several constraints in the context of remote sensing image segmentation, mostly stemming from the limited receptive field of its convolution kernels. The transformer is a deep learning model originally developed for sequence-to-sequence translation. It incorporates a self-attention mechanism to efficiently process many inputs, selectively retaining the relevant information and discarding the irrelevant inputs by adjusting the weights. However, it has limited localization capability caused by the absence of low-level features. This work presents a novel approach called U-Net–transformer, which combines the U-Net and transformer models for remote sensing image segmentation. The suggested solution surpasses the individual models, U-Net and transformers, by combining and leveraging their respective strengths. Initially, the transformer captures the global context by encoding tokenized image patches derived from the feature maps of the convolutional neural network (CNN). Next, the encoded feature maps undergo upsampling through a decoder and are then merged with the high-resolution feature maps of the CNN model, enabling more accurate localization. The transformer thus serves as an unconventional encoder for segmenting remote sensing images. It enhances the U-Net model by capturing localized spatial data, hence improving the capacity to capture intricate details. The proposed U-Net–transformer has demonstrated exceptional performance in remote sensing image segmentation across many benchmark datasets. 
The given findings demonstrate the efficacy of integrating the U-Net and transformer models for segmenting remote sensing images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Geological Assessment of Faults in Digitally Processed Aerial Images within Karst Area.
- Author
-
Podolszki, Laszlo, Gizdavec, Nikola, Gašparović, Mateo, and Frangen, Tihomir
- Subjects
- GEOGRAPHIC information systems, THEMATIC maps, GEOLOGICAL mapping, TECHNOLOGICAL innovations, AERIAL photographs
- Abstract
The evolution of map development has been shaped by advancing techniques and technologies. Nevertheless, field and remote mapping with cabinet data analysis remains essential in this process. Geological maps are thematic maps that delineate diverse geological features. These maps undergo updates reflecting changes in the mapped area, technological advancements, and the availability of new data. Herein, a geological assessment example focused on enhancing mapped data using digitally processed historical (legacy) aerial images is presented for a case study in the Dinarides karst area in Croatia. The study area of Bribirske Mostine is covered by the Basic Geological Map of Yugoslavia (BGMY) at a 100,000 scale, which was developed during the 1960s. As the BGMY was developed 60+ years ago, one of its segments is further analyzed and discussed, namely, faults. Moreover, applying modern-day technologies and reinterpretation, its data, scale, presentation, and possible areas of improvement are presented. Georeferenced digital historical geological data (legacy), comprising BGMY, archive field maps, and aerial images from 1959 used in BGMY development, are reviewed. Original faults were digitalized and reinterpreted within the geographic information system with the following conclusions: (i) more accurate data (spatial positioning) on faults can be gained by digitally processing aerial photographs taken 60+ years ago with detailed review and analysis; (ii) simultaneously, new data were acquired (additional fault lines were interpreted); (iii) the map scale can be up-scaled to 1:25,000 for the investigated area of Bribirske Mostine; and (iv) a newly developed map for the Bribirske Mostine study area is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. MA-DBFAN: multiple-attention-based dual branch feature aggregation network for aerial image semantic segmentation.
- Author
-
Yue, Haoyu, Yue, Junhong, Guo, Xuejun, Wang, Yizhen, and Jiang, Liancheng
- Abstract
Aerial image semantic segmentation has extensive applications in the fields of land resource management, ecological monitoring, and traffic management. Currently, many convolutional neural networks have been developed, but they do not fully utilize the long-term dependence and multi-scale information in high-resolution images, making it difficult for these models to further enhance their segmentation performance. Therefore, a multiple-attention-based dual branch feature aggregation network is proposed to improve the segmentation accuracy of aerial images. This model includes a contextual feature extraction branch (CFEB), a spatial information extraction branch (SIEB), and a feature aggregation module (FAM). In the CFEB, we designed a SeMask-based dual category attention module to extract semantic category features and utilized the ASPP module to extract multi-scale features, effectively capturing global contextual information with categories and multi-scales. Meanwhile, in the SIEB, a shallow CNN is employed to retain the fine-grained features of images. In the FAM, a dual attention interaction module is designed that includes spatial attention and channel attention, effectively fusing the global contextual and spatial local information extracted by the two branches. Extensive experiments on three freely accessible datasets (the UAVid dataset, the Landcover.ai dataset and the Vaihingen dataset) demonstrate that our method outperforms other state-of-the-art models for aerial images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Soil Moisture Determination by Normalized Difference Index Based on Drone Images Analysis.
- Author
-
Khalkho, Dhiraj, Thakur, Sakshi, and Tripathi, M. P.
- Abstract
In the present study, high-resolution drone images were captured and analysed for the determination of soil moisture content by the generation of a Normalized Difference Soil Moisture Index (NDSMI). The experimental field comprised six small blocks of 20 × 20 m, and samples were collected twice a week from every block from a depth of 5 cm for soil moisture determination using the gravimetric method. Aerial images were taken from an altitude of 30 m above the ground to capture the full experimental field. The drone takes composite images comprising the blue, green and red bands of the optical region of the spectrum. There are three different types of soil in agricultural fields: bare soil, soil covered in vegetation, and soil covered in crop waste (Shankar et al., Journal of the Indian Society of Remote Sensing 50(3):435–450, 2022). The temporal variation in the moisture content of the study area, with its sandy clay loam soil texture, was found to be 16–25% during the experimental season based on the gravimetric moisture determination method. Band 1 (Blue) and Band 3 (Red) of the visual region were used to generate the Normalized Difference Blue-Red Soil Moisture Index (NDBRSMI) for representing the moisture content of the study area. The NDBRSMI values of the generated index were found to be between 0.3 and 0.6. A scaling factor of 45.56 was found to convert any pixel value of the generated NDBRSMI to soil moisture content. The generated index values displayed a strong correlation with soil moisture data after multiplication by the scaling factor, as shown by the values of the coefficient of determination (R²), Nash-Sutcliffe Efficiency (ENS), standard deviation ratio (RSR) and Percent Bias (PBIAS) between the observed and simulated moisture content. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
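The index-to-moisture conversion described in this abstract can be sketched as follows, assuming the standard normalized-difference form (B − R)/(B + R), which the abstract implies but does not state explicitly; the band values below are illustrative:

```python
import numpy as np

def ndbr_smi(blue: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized difference of blue and red bands, assumed (B - R)/(B + R)."""
    blue = blue.astype(float)
    red = red.astype(float)
    return (blue - red) / (blue + red)

SCALING_FACTOR = 45.56  # reported factor converting index values to moisture %

index = ndbr_smi(np.array([120.0]), np.array([60.0]))
moisture_pct = index * SCALING_FACTOR
print(index, moisture_pct)  # index ~0.333 -> ~15.2% moisture
```

As a sanity check, the reported index range of 0.3–0.6 times 45.56 spans roughly 14–27% moisture, bracketing the 16–25% gravimetric range the study reports.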
15. DMA-YOLO: multi-scale object detection method with attention mechanism for aerial images.
- Author
-
Li, Ya-ling, Feng, Yong, Zhou, Ming-liang, Xiong, Xian-cai, Wang, Yong-heng, and Qiang, Bao-hua
- Subjects
- SPINE, DRONE aircraft, COMPUTATIONAL complexity
- Abstract
Unmanned aerial vehicles are increasingly popular due to their ease of operation, low noise, and portability. However, existing object detection methods perform poorly in detecting small targets in densely arranged, sparsely distributed aerial images. To tackle this issue, we enhanced the general object detection method YOLOv5 and introduced a multi-scale detection method called Detach-Merge Attention YOLO (DMA-YOLO). Specifically, we proposed a Detach-Merge Convolution (DMC) module and embedded it into the backbone network to maximize feature retention. Furthermore, we embedded the Bottleneck Attention Module (BAM) into the detection head to suppress interference from complex background information without significantly increasing computational complexity. To represent and process multi-scale features more effectively, we have integrated an extra detection head and enhanced the neck network into the Bi-directional Feature Pyramid Network (BiFPN) structure. Finally, we adopted the SCYLLA-IoU (SIoU) as a loss function to expedite the convergence rate of our model and enhance the precision of detection results. A series of experiments on the VisDrone2019 and UAVDT datasets have illustrated the effectiveness of DMA-YOLO. Code is available at https://github.com/Yaling-Li/DMA-YOLO. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Multi-granularity attention in attention for person re-identification in aerial images.
- Author
-
Xu, Simin, Luo, Lingkun, Hong, Haichao, Hu, Jilin, Yang, Bin, and Hu, Shiqiang
- Subjects
- DRONE aircraft, ATTENTION, FEATURE extraction
- Abstract
When paired with Unmanned Aerial Vehicles (UAVs), person re-identification (re-ID) techniques are further strengthened in terms of mobility. However, this simple hybridization brings unavoidable scale diversity and occlusions caused by altitude and attitude variations during UAV flight. To harmoniously blend the two techniques, in this research we argue that the pedestrian should be globally perceived regardless of scale variation, and that internal occlusions should be well suppressed. For this purpose, we propose a novel Multi-granularity Attention in Attention (MGAiA) network to satisfy these demands for aerial-based re-ID. Specifically, a novel multi-granularity attention (MGA) module is designed to supply the feature extraction model with a global awareness to explore the discriminative knowledge within scale variations. Subsequently, an Attention in Attention (AiA) mechanism is proposed to generate attention scores measuring the importance of the different granularities, thereby proactively reducing the negative effects caused by occlusions. We carry out comprehensive experiments on two large-scale UAV-based datasets, PRAI-1581 and P-DESTRE, as well as transfer learning from three popular ground-based re-ID datasets, CUHK03, Market-1501, and CUHK-SYSU, to quantify the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Syn2Real Detection in the Sky: Generation and Adaptation of Synthetic Aerial Ship Images.
- Author
-
Wu, Yaoyuan, Guo, Weijie, Tan, Zhuoyue, Zhao, Yifei, Zhu, Quanxing, Wu, Liaoni, and Guo, Zhiming
- Subjects
- OBJECT recognition (Computer vision), IMAGE recognition (Computer vision), COMPUTER vision, REMOTE sensing, SHIPS
- Abstract
Object detection in computer vision requires a sufficient amount of training data to produce an accurate and general model. However, aerial images are difficult to acquire, so the collection of aerial image datasets is a priority issue. Building on existing research on image generation, the goal of this work is to create synthetic aerial image datasets that can be used to solve the problem of insufficient data. We generated three independent datasets for ship detection using an engine and a generative model. These synthetic datasets are rich in virtual scenes, ship categories, weather conditions, and other features. Moreover, we implemented domain-adaptive algorithms to address the issue of domain shift from synthetic data to real data. To investigate the application of synthetic datasets, we validated the synthetic data using six different object detection algorithms and three existing real-world ship detection datasets. The experimental results demonstrate that these methods for generating synthetic aerial image datasets can compensate for insufficient data in aerial remote sensing. Additionally, domain-adaptive algorithms can further mitigate the discrepancy between synthetic and real data, highlighting the potential and value of synthetic data in aerial image recognition and comprehension tasks in the real world. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. TFDNet: A triple focus diffusion network for object detection in urban congestion with accurate multi-scale feature fusion and real-time capability
- Author
-
Caoyu Gu, Xiaodong Miao, and Chaojie Zuo
- Subjects
- Small object detection, Multi-scale feature fusion, Aerial images, Electronic computers. Computer science QA75.5-76.95
- Abstract
Vehicle detection in congested urban scenes is essential for traffic control and safety management. However, the dense arrangement and occlusion of multi-scale vehicles in such environments present considerable challenges for detection systems. To tackle these challenges, this paper introduces a novel object detection method, dubbed the triple focus diffusion network (TFDNet). Firstly, the gradient convolution is introduced to construct the C2f-EIRM module, replacing the original C2f module, thereby enhancing the network’s capacity to extract edge information. Secondly, by leveraging the concept of the Asymptotic Feature Pyramid Network on the foundation of the Path Aggregation Network, the triple focus diffusion module structure is proposed to improve the network’s ability to fuse multi-scale features. Finally, the SPPF-ELA module employs an Efficient Local Attention mechanism to integrate multi-scale information, thereby significantly reducing the impact of background noise on detection accuracy. Experiments on the VisDrone 2021 dataset reveal that the average detection accuracy of the TFDNet algorithm reached 38.4%, which represents a 6.5% improvement over the original algorithm; similarly, its mAP50:90 performance has increased by 3.7%. Furthermore, on the UAVDT dataset, the TFDNet achieved a 3.3% enhancement in performance compared to the original algorithm. TFDNet, with a processing speed of 55.4 FPS, satisfies the real-time requirements for vehicle detection.
- Published
- 2024
- Full Text
- View/download PDF
19. Enhanced Tiny Object Detection in Aerial Images
- Author
-
Fu, Tianyi, Yang, Benyi, Dong, Hongbin, Deng, Baosong, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Zhang, Chuanlei, editor, and Chen, Wei, editor
- Published
- 2024
- Full Text
- View/download PDF
20. Super Resolution of Aerial Images of Intelligent Aircraft via Multi-scale Residual Attention and Distillation Network
- Author
-
Liu, Bingzan, Yang, Yizhen, Dang, Fangyuan, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Dong, Jian, editor, Zhang, Long, editor, and Cheng, Deqiang, editor
- Published
- 2024
- Full Text
- View/download PDF
21. Drone Journalism: Where the Human Eye Cannot Reach—Narratives and Journalistic Uses
- Author
-
Fernández-Barrero, Ángeles, Davim, J. Paulo, Series Editor, Carou, Diego, editor, and Sartal, Antonio, editor
- Published
- 2024
- Full Text
- View/download PDF
22. Deep Convolutional Encoder–Decoder Models for Road Extraction from Aerial Imagery
- Author
-
Kumar, Ashish, Izharul Hasan Ansari, M., Garg, Amit, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Joshi, Amit, editor, Mahmud, Mufti, editor, Ragel, Roshan G., editor, and Karthik, S., editor
- Published
- 2024
- Full Text
- View/download PDF
23. Enhancing CNN Architecture with Constrained NAS for Boat Detection in Aerial Images
- Author
-
Zerrouk, Ilham, Moumen, Younes, El Habchi, Ali, Khiati, Wassim, Berrich, Jamal, Bouchentouf, Toumi, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Tan, Kay Chen, Series Editor, El Fadil, Hassan, editor, and Zhang, Weicun, editor
- Published
- 2024
- Full Text
- View/download PDF
24. A Shape-Based Quadrangle Detector for Aerial Images
- Author
-
Rao, Chaofan, Li, Wenbo, Xie, Xingxing, Cheng, Gong, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Qingshan, editor, Wang, Hanzi, editor, Ma, Zhanyu, editor, Zheng, Weishi, editor, Zha, Hongbin, editor, Chen, Xilin, editor, Wang, Liang, editor, and Ji, Rongrong, editor
- Published
- 2024
- Full Text
- View/download PDF
25. Unveiling the Influence of Image Super-Resolution on Aerial Scene Classification
- Author
-
Ramzy Ibrahim, Mohamed, Benavente, Robert, Ponsa, Daniel, Lumbreras, Felipe, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Vasconcelos, Verónica, editor, Domingues, Inês, editor, and Paredes, Simão, editor
- Published
- 2024
- Full Text
- View/download PDF
26. Model for Fruit Tree Classification Through Aerial Images
- Author
-
Gómez, Valentina Escobar, Guevara Bernal, Diego Gustavo, López Parra, Javier Francisco, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Tabares, Marta, editor, Vallejo, Paola, editor, Suarez, Biviana, editor, Suarez, Marco, editor, Ruiz, Oscar, editor, and Aguilar, Jose, editor
- Published
- 2024
- Full Text
- View/download PDF
27. EnhancedNet, an End-to-End Network for Dense Disparity Estimation and its Application to Aerial Images
- Author
-
Kang, Junhua, Chen, Lin, and Heipke, Christian
- Published
- 2024
- Full Text
- View/download PDF
28. Analysing the use of OpenAerialMap images for OpenStreetMap edits
- Author
-
Ammar Mandourah and Hartwig H. Hochmair
- Subjects
Crowdsourcing, aerial images, open source, data quality, cross-linkage, Mathematical geography. Cartography, GA1-1776, Geodesy, QB275-343 - Abstract
OpenAerialMap (OAM) is a crowdsourcing platform for uploading, hosting, sharing, displaying, searching, and downloading openly licensed Earth image data from around the world. OAM was launched with the primary goal to facilitate rapid disaster response mapping after natural events, such as floods or hurricanes. Images contributed to OAM can be used as a background layer in OpenStreetMap (OSM) editors to edit map features, such as buildings or facilities, which may have been affected by such events. This study analyzes how the provision of OAM images is associated with changes in underlying editing patterns of OSM features by comparing the number of edited OSM objects between 2 weeks before and 2 weeks after an image has been shared through OAM. The comparison also involves other aspects of OSM editing patterns, including geometry types and OSM primary feature types of edited objects, type of feature editing operations, edits per user, and contribution distribution across continents. Results show that the number of point features added to OSM more than quadrupled within the first 2 weeks after OAM image upload compared to that of 2 weeks before, and that the number of ways added almost doubled. This suggests that the OSM community utilizes provided OAM images for OSM map updates in the respective geographic areas. The study provides a showcase which demonstrates how information from one crowdsourcing platform (OAM) can be used to enhance the data quality of another (OSM) with regards to completeness and timeliness.
- Published
- 2024
- Full Text
- View/download PDF
29. YOLOv5-based Dense Small Target Detection Algorithm for Aerial Images Using DIOU-NMS
- Author
-
Yu Wang, Xiang Zou, Jiantong Shi, and Minhua Liu
- Subjects
object detection, yolov5, diou-nms, aerial images, small object detection, complex backgrounds, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
With the advancement of various aerial platforms, there is an increasing abundance of aerial images captured in various environments. However, the detection of densely packed small objects within complex backgrounds remains a challenge. To address the task of detecting multiple small objects, a multi-object detection algorithm based on distance intersection over union loss non-maximum suppression (DIOU-NMS) integrated with you only look once version 5 (YOLOv5) is proposed. Leveraging the YOLOv5s model as the foundation, the algorithm specifically addresses the detection of abundantly and densely packed targets by incorporating a dedicated small object detection layer within the network architecture, thus effectively enhancing the detection capability for small targets using an additional upsampling operation. Moreover, conventional non-maximum suppression is replaced with DIOU-based non-maximum suppression to alleviate the issue of missed detections caused by target density. Experimental results demonstrate the effectiveness of the proposed method in significantly improving the detection performance of dense small targets in complex backgrounds.
- Published
- 2024
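The entry above replaces conventional non-maximum suppression with DIoU-based NMS to reduce missed detections among densely packed targets. As a minimal illustration of the underlying idea (not the authors' implementation), the sketch below suppresses boxes by Distance-IoU, which subtracts a normalized centre-distance penalty from plain IoU; all box coordinates, scores, and the threshold are hypothetical:

```python
def diou(a, b):
    """Distance-IoU between two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area_a + area_b - inter)
    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 + \
         ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + \
         (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2

def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of boxes that survive DIoU-based suppression."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(diou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep

# Two heavily overlapping detections and one distant one (hypothetical values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # → [0, 2]: the near-duplicate box is suppressed
```

Because the centre-distance term lowers the score for boxes whose centres nearly coincide, DIoU-NMS is less prone than plain IoU-NMS to merging two genuinely distinct but adjacent small objects.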
30. Predicting tree species composition using airborne laser scanning and multispectral data in boreal forests
- Author
-
Jaime Candelas Bielza, Lennart Noordermeer, Erik Næsset, Terje Gobakken, Johannes Breidenbach, and Hans Ole Ørka
- Subjects
Aerial images, Airborne laser scanning, Dirichlet regression, Dominant species, Species proportions, Sentinel-2, Physical geography, GB3-5030, Science - Abstract
Tree species composition is essential information for forest management and remotely sensed (RS) data have proven to be useful for its prediction. In forest management inventories, tree species are commonly interpreted manually from aerial images for each stand, which is time and resource consuming and entails substantial uncertainty. The objective of this study was to evaluate a range of RS data sources comprising airborne laser scanning (ALS) and airborne and satellite-borne multispectral data for model-based prediction of tree species composition. Total volume was predicted using non-linear regression and volume proportions of species were predicted using parametric Dirichlet models. Predicted dominant species was defined as the species with the greatest predicted volume proportion and predicted species-specific volumes were calculated as the product of predicted total volume multiplied by predicted volume proportions. Ground reference data obtained from 1184 sample plots of 250 m2 in eight districts in Norway were used. Combinations of ALS and two multispectral data sources, i.e. aerial images and Sentinel-2 satellite images from different seasons, were compared. The most accurate predictions of tree species composition were obtained by combining ALS and multi-season Sentinel-2 imagery, specifically from summer and fall. Independent validation of predicted species proportions yielded average root mean square differences (RMSD) of 0.15, 0.15 and 0.07 (relative RMSD of 30%, 68% and 128%) and squared Pearson's correlation coefficient (r2) of 0.74, 0.79 and 0.51 for Norway spruce (Picea abies (L.) Karst.), Scots pine (Pinus sylvestris L.) and deciduous species, respectively. The dominant species was predicted with median values of overall accuracy, quantity disagreement and allocation disagreement of 0.90, 0.07 and 0.00, respectively. 
Predicted species-specific volumes yielded average values of RMSD of 63, 48 and 23 m3/ha (relative RMSD of 39%, 94% and 158%) and r2 of 0.84, 0.60 and 0.53 for spruce, pine and deciduous species, respectively. In one of the districts with independent validation plots of mean size 3700 m2, predictions of the dominant species were compared to results obtained through manual photo-interpretation. The model predictions gave greater accuracy than manual photo-interpretation. This study highlights the utility of RS data for prediction of tree species composition in operational forest inventories, particularly indicating the utility of ALS and multi-season Sentinel-2 imagery.
- Published
- 2024
- Full Text
- View/download PDF
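In the study above, species-specific volumes are the product of predicted total volume and predicted Dirichlet volume proportions, and the dominant species is the one with the greatest predicted proportion. That final step reduces to simple arithmetic, sketched below with entirely hypothetical plot values (the Dirichlet regression itself is not reproduced here):

```python
def species_volumes(total_volume, proportions):
    """Species-specific volumes = predicted total volume x predicted proportions."""
    return {sp: total_volume * p for sp, p in proportions.items()}

def dominant_species(proportions):
    """Dominant species = species with the greatest predicted volume proportion."""
    return max(proportions, key=proportions.get)

# Hypothetical plot: predicted total volume (m3/ha) and Dirichlet proportions.
total = 250.0
props = {"spruce": 0.5, "pine": 0.375, "deciduous": 0.125}
print(species_volumes(total, props))  # → {'spruce': 125.0, 'pine': 93.75, 'deciduous': 31.25}
print(dominant_species(props))        # → spruce
```

Note that because the Dirichlet proportions sum to one, the species-specific volumes always sum back to the predicted total, keeping the two model components consistent.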
31. Modified YOLOv5 for small target detection in aerial images.
- Author
-
Singh, Inderpreet and Munjal, Geetika
- Subjects
OBJECT recognition (Computer vision), COMPUTER vision, VISUAL fields - Abstract
Object detection is an important field in computer vision. Detecting objects in aerial images is an extremely challenging task as the objects can be very small compared to the size of the image, the objects can have any orientation, and depending upon the altitude, the same object can appear in different sizes. YOLOv5 is a recent object detection algorithm that has a good balance of accuracy and speed. This work focuses on enhancing the YOLOv5 object detection algorithm specifically for small target detection. The accuracy on small objects has been improved by adding a new feature fusion layer in the feature pyramid part of YOLOv5 and using compound scaling to increase the input size. The modified YOLOv5 demonstrates a remarkable 11% improvement in mAP 0.5 on the small vehicle class of the DOTA dataset while being 25% smaller in terms of GFLOPS and achieving a 10.52% faster inference time, making it well-suited for real-time applications. Furthermore, the modified YOLOv5 achieves a notable 45.2% mAP 0.5 compared to 31.7% mAP 0.5 of YOLOv5 on the challenging VisDrone dataset. The modified YOLOv5 outperforms many state-of-the-art algorithms in small target detection in aerial images. In addition to performance evaluation, we also present an analysis of object sizes in pixel areas in the VisDrone and DOTA datasets. The proposed modifications demonstrate the potential for significant advancements in small target detection in aerial images and provide valuable insights for further research in this area. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
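The entry above includes an analysis of object sizes in pixel areas for the VisDrone and DOTA datasets. A common way to do such an analysis is to bin bounding boxes by pixel area; the sketch below uses the COCO-style small/medium/large cutoffs (32² and 96² pixels) as an assumed convention, with made-up box sizes:

```python
def area_bins(boxes, small=32 ** 2, medium=96 ** 2):
    """Bin bounding boxes (width, height in pixels) by area, COCO-style cutoffs."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for w, h in boxes:
        a = w * h
        if a < small:
            counts["small"] += 1
        elif a < medium:
            counts["medium"] += 1
        else:
            counts["large"] += 1
    return counts

# Hypothetical box sizes (pixels) read from an aerial dataset annotation file.
boxes = [(12, 20), (30, 30), (40, 50), (100, 120)]
print(area_bins(boxes))  # → {'small': 2, 'medium': 1, 'large': 1}
```

A histogram of these bins makes it easy to see why aerial benchmarks are dominated by the "small" bucket, which is the size regime the modified YOLOv5 targets.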
32. Improved TPH for object detection in aerial images.
- Author
-
Wang, Xiaobin, Zhu, Dekang, Yan, Ye, and Sun, Haohui
- Subjects
- *
OBJECT recognition (Computer vision), *SPINE, *DEEP learning - Abstract
Deep learning has greatly enhanced general object detection capability. However, when detectors are applied directly to aerial images, performance drops significantly because: (1) most objects in aerial images are dense and small; and (2) UAV altitude variations cause diverse object scales. In this paper, we improve the TPH algorithm to achieve strong aerial object detection performance. Specifically, we introduce the SPD module to replace the strided convolution and pooling layers, and we improve the TPH backbone and neck networks so that both large and small objects can be detected accurately. Experiments on the VisDrone2019 and DOTA datasets validate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
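The SPD (space-to-depth) module mentioned in the entry above downsamples a feature map by rearranging each 2×2 spatial block into separate channels instead of discarding pixels the way a strided convolution does. A minimal single-channel sketch of that rearrangement on nested lists (the real module operates on multi-channel tensors and is followed by a non-strided convolution):

```python
def space_to_depth(x, block=2):
    """Rearrange each (block x block) patch of a 2-D map into separate channels.

    Input: H x W nested list; output: block*block maps of size
    (H/block) x (W/block). Spatial resolution halves, but no pixel is lost.
    """
    h, w = len(x), len(x[0])
    out = []
    for dy in range(block):
        for dx in range(block):
            out.append([[x[i + dy][j + dx]
                         for j in range(0, w, block)]
                        for i in range(0, h, block)])
    return out

x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
channels = space_to_depth(x)
print(len(channels))   # → 4 channels from one 4x4 map
print(channels[0])     # → [[1, 3], [9, 11]]: top-left pixel of every 2x2 block
```

Preserving every pixel in the channel dimension is what makes this substitution attractive for small objects, whose few pixels are easily wiped out by strided downsampling.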
33. AFDet: alignment and focusing for aerial object detection.
- Author
-
Liang, Yecheng, Wang, Shigang, Chen, Meimei, Wei, Jian, and Zhao, Yan
- Subjects
- *
OBJECT recognition (Computer vision), *CONVOLUTIONAL neural networks, *CHARACTERISTIC functions - Abstract
Object detection in aerial images has become a focus in recent years due to the expansion of its application fields. Aerial objects are multi-scale, arbitrarily angled, and densely arranged, which brings considerable challenges to the task. Building on efficient one-stage detectors, most methods design modules that adaptively align the feature receptive field with the anchors to improve the accuracy of bounding box regression, and calculate loss according to parameter bias. Current methods are not sufficiently data-driven in feature optimization, and angle parameter regression faces boundary problems. To handle these problems, in this letter we propose an Align-Focus detector (AFDet) for aerial image object detection. We introduce the idea of deformable multi-head self-attention into feature optimization and design a new Align-Focus Module (AFM) that focuses the feature encoding response on the real texture areas of the objects. In addition, we apply the KF-IOU Loss to solve the boundary problem. We conduct the necessary experiments on the DOTA and HRSC2016 datasets, and AFDet achieves the highest mAP compared to existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. An adaptive multichannel DeepLabv3 + for semantic segmentation of aerial images using improved Beluga Whale Optimization Algorithm.
- Author
-
Anilkumar, P. and Venugopal, P.
- Subjects
METAHEURISTIC algorithms, IMAGE segmentation, DEEP learning, LAND cover, ELECTRONIC data processing - Abstract
Semantic segmentation of aerial images plays a pivotal role in extracting detailed information about land cover, infrastructure, and natural features. Traditional single-channel segmentation models struggle to harness the rich information present in multi-channel aerial images, such as multi-spectral or hyperspectral data. While the DeepLabv3+ architecture has shown remarkable success in semantic segmentation tasks by exploiting multi-scale context and atrous convolutions, its performance on aerial images remains suboptimal due to the unique challenges of this domain. Motivated by the shortcomings of existing semantic segmentation techniques for aerial images, this paper adopts deep learning and proposes a multi-objective-derived Adaptive Multichannel DeepLabv3+ (AMC-DeepLabv3+) with a new meta-heuristic called the Improved Beluga Whale Optimization (IBWO) algorithm. The hyperparameters of the multichannel DeepLabv3+ are optimized by the IBWO algorithm, and the model aims to set new benchmarks in the accuracy and contextual understanding of aerial image segmentation by integrating multi-channel data processing techniques and preserving spatial context. The proposed model attains improved accuracies of 98.65% and 98.72% on datasets 1 and 2, respectively, and achieves Dice coefficients of 98.73% and 98.85%, respectively, with a computation time of 113.0123 s. The evaluation outcomes of the proposed model are significantly better than those of state-of-the-art techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. An Aerial Image Detection Algorithm Based on Improved YOLOv5.
- Author
-
Shan, Dan, Yang, Zhi, Wang, Xiaofeng, Meng, Xiangdong, and Zhang, Guangwei
- Subjects
- *
ALGORITHMS, *SPINE, *SAMPLE size (Statistics) - Abstract
To enhance aerial image detection in complex environments characterized by multiple small targets and mutual occlusion, we propose an aerial target detection algorithm based on an improved version of YOLOv5 in this paper. Firstly, we employ an improved Mosaic algorithm to address redundant boundaries arising from varying image scales and to augment the training sample size, thereby enhancing detection accuracy. Secondly, we integrate the constructed hybrid attention module into the backbone network to enhance the model's capability in extracting pertinent feature information. Subsequently, we incorporate feature fusion layer 7 and P2 fusion into the neck network, leading to a notable enhancement in the model's capability to detect small targets. Finally, we replace the original PAN + FPN network structure with the optimized BiFPN (Bidirectional Feature Pyramid Network) to enable the model to preserve deeper semantic information, thereby enhancing detection capabilities for dense objects. Experimental results indicate a substantial improvement in both the detection accuracy and speed of the enhanced algorithm compared to its original version. It is noteworthy that the enhanced algorithm exhibits a markedly improved detection performance for aerial images, particularly under real-time conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. A deep neural network for vehicle detection in aerial images.
- Author
-
Du, Rong and Cheng, Yan
- Abstract
This research paper highlights the significance of vehicle detection in aerial images for surveillance systems, focusing on deep learning methods that outperform traditional approaches. However, the challenge of high computational complexity due to diverse vehicle appearances persists. To address this, a lightweight deep neural network-based model is developed, striking a balance between accuracy and efficiency that enables real-time operation. The model is trained and evaluated on a standardized dataset, with extensive experiments demonstrating its ability to achieve accurate vehicle detection at significantly reduced computation cost, offering a practical solution for real-world aerial surveillance scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Evaluation of afforestations for avalanche protection with orthoimages using the random forest algorithm.
- Author
-
Grätz, Tina, Vospernik, Sonja, and Scheidl, Christian
- Subjects
- *
AFFORESTATION, *RANDOM forest algorithms, *AVALANCHES, *LAND cover, *GRID cells - Abstract
Compared to technical measures, afforestations provide cost-effective and environmentally friendly protection against natural hazards. In Austria, more than 3000 afforestation sites for hazard protection, covering 9000 ha, were established between 1906 and 2017, mainly for snow avalanche protection. The actual protective effect depends on avalanche-predisposing factors and land cover, i.e. whether forest is present. In this study, predisposing factors and land cover classes were identified and analysed in selected afforestation sites. The protective effect of forest was attributed to the presence of forest cover and tree species. Using RGB images with a ground resolution of 20 × 20 cm, nine land cover categories were distinguished by supervised classification with the random forest algorithm. These land cover categories were classified with an overall accuracy of 0.87–0.98 and Kappa values ranging between 0.81 and 0.93. Images were filtered using a 3 pixel by 3 pixel majority filter, which assigns each cell in the output grid the most commonly occurring value within a moving window centred on that cell. This filter further increased the overall accuracy by removing noise pixels while preserving the fine elements of the classified grid. Our results indicate a protective effect for about half of the analysed afforestation sites. The dominance of the land cover class "Meadow" at most sites with little avalanche protection effect suggests grazing as a limiting factor. The spatial information provided by the described method makes it possible to identify critical areas in terms of avalanche protection even years after the initial afforestation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
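The 3×3 majority filter described in the entry above can be sketched in a few lines: each output cell takes the most common class value in the window centred on it. This toy version uses only in-bounds neighbours at the borders, which is an assumption; the study does not state its border handling:

```python
from collections import Counter

def majority_filter(grid):
    """3x3 majority filter over a classified grid of integer class labels.

    Each output cell receives the most frequent value in the 3x3 window
    centred on it (clipped to the grid at the borders).
    """
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [grid[y][x]
                      for y in range(max(0, i - 1), min(h, i + 2))
                      for x in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = Counter(window).most_common(1)[0][0]
    return out

# A 'forest' patch (class 1) with a single noise pixel (class 2).
grid = [[1, 1, 1],
        [1, 2, 1],
        [1, 1, 1]]
print(majority_filter(grid))  # → [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

This mirrors the effect reported in the study: isolated noise pixels are removed while contiguous class regions are preserved, which is why the filter raised the overall classification accuracy.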
38. Deep Learning Models for Waterfowl Detection and Classification in Aerial Images †.
- Author
-
Zhang, Yang, Feng, Yuan, Wang, Shiqi, Tang, Zhicheng, Zhai, Zhenduo, Viegut, Reid, Webb, Lisa, Raedeke, Andrew, and Shang, Yi
- Subjects
- *
SUPERVISED learning, *IMAGE recognition (Computer vision), *DEEP learning, *WETLAND conservation, *WATERFOWL, *COMPUTER vision - Abstract
Waterfowl population monitoring is essential for wetland conservation. Lately, deep learning techniques have shown promising advancements in detecting waterfowl in aerial images. In this paper, we present a performance evaluation of several popular supervised and semi-supervised deep learning models for waterfowl detection in aerial images using four new image datasets containing 197,642 annotations. The best-performing model, Faster R-CNN, achieved 95.38% accuracy in terms of mAP. Semi-supervised learning models outperformed supervised models when the same amount of labeled data was used for training. Additionally, we present a performance evaluation of several deep learning models for waterfowl classification on aerial images using a new real-bird classification dataset consisting of 6,986 examples and a new decoy classification dataset consisting of about 10,000 examples per category across 20 categories. The best model achieved an accuracy of 91.58% on the decoy dataset and 82.88% on the real-bird dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. OrientedDiffDet: Diffusion Model for Oriented Object Detection in Aerial Images.
- Author
-
Wang, Li, Jia, Jiale, and Dai, Hualin
- Subjects
IMAGE denoising, DISTRIBUTION (Probability theory), REMOTE-sensing images, IMAGE processing, RANDOM sets, DETECTORS - Abstract
Object detection is a fundamental task in remote-sensing image processing. Most existing detectors handle the regression and classification tasks by learning from a fixed set of learnable anchors or queries. To simplify object candidates, we propose a denoising diffusion process for remote-sensing object detection that detects objects directly from a set of random boxes. During training, horizontal detection boxes are first transformed into oriented detection boxes. The model then learns to reverse this transformation process by diffusing from the ground-truth oriented box to a random distribution. During inference, the model incrementally refines a set of randomly generated boxes to produce the final output. Remarkable results have been achieved with the proposed method. For instance, on commonly used object detection datasets such as DOTA, our approach achieves a mean average precision (mAP) of 76.59%. Similarly, on the HRSC2016 dataset, our method achieves 72.4% mAP. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Road Traffic Monitoring from Aerial Images Using Template Matching and Invariant Features.
- Author
-
Qureshi, Asifa Mehmood, Al Mudawi, Naif, Alonazi, Mohammed, Chelloug, Samia Allaoua, and Park, Jeongmin
- Subjects
TRACKING algorithms, HOUGH transforms, AUTOMOBILE size, MOBILE operating systems, NOISE control, GRAYSCALE model, IMAGE registration, TRAFFIC monitoring - Abstract
Road traffic monitoring is an imperative topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm was used to detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was performed on the first image of every burst. Then, to track vehicles, a template of each vehicle was matched in the succeeding images using a template matching algorithm. To further improve tracking accuracy by incorporating motion information, Scale-Invariant Feature Transform (SIFT) features were used to find the best match among multiple candidates. An accuracy rate of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherlands dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy rate of 86% for detection and 78% for tracking was achieved. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
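The tracking step in the entry above slides each vehicle's template over the next frame and keeps the best match. A minimal zero-mean normalized cross-correlation matcher on toy grayscale arrays illustrates the core operation (the values are hypothetical, and the actual pipeline adds SIFT features to disambiguate multiple matches):

```python
import math

def ncc(patch, templ):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    flat_p = [v for row in patch for v in row]
    flat_t = [v for row in templ for v in row]
    mp = sum(flat_p) / len(flat_p)
    mt = sum(flat_t) / len(flat_t)
    num = sum((p - mp) * (t - mt) for p, t in zip(flat_p, flat_t))
    den = math.sqrt(sum((p - mp) ** 2 for p in flat_p) *
                    sum((t - mt) ** 2 for t in flat_t))
    return num / den if den else 0.0  # flat patches correlate with nothing

def match_template(image, templ):
    """Return the (row, col) of the template's best correlation in the image."""
    th, tw = len(templ), len(templ[0])
    best, best_pos = -2.0, (0, 0)
    for i in range(len(image) - th + 1):
        for j in range(len(image[0]) - tw + 1):
            patch = [row[j:j + tw] for row in image[i:i + th]]
            score = ncc(patch, templ)
            if score > best:
                best, best_pos = score, (i, j)
    return best_pos

# Toy grayscale frame with a bright 2x2 'vehicle' at row 1, column 2.
image = [[0, 0, 0, 0, 0],
         [0, 0, 9, 8, 0],
         [0, 0, 8, 9, 0],
         [0, 0, 0, 0, 0]]
templ = [[9, 8],
         [8, 9]]
print(match_template(image, templ))  # → (1, 2)
```

Normalizing by mean and variance makes the score insensitive to uniform brightness and contrast changes between bursts, which matters when lighting shifts between aerial frames.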
41. Using GAN Methods for Aerial Images Segmentation.
- Author
-
ALTUN GÜVEN, Sara and TOPTAŞ, Buket
- Subjects
IMAGE segmentation, DEEP learning, ARTIFICIAL neural networks, MACHINE learning, SEMANTICS - Abstract
Object detection and segmentation in aerial images is currently a vibrant and significant field of research. The iSAID dataset was created for object detection in images captured by aerial vehicles. In this study, semantic segmentation was performed on the iSAID dataset using Generative Adversarial Networks (GANs). The compared GAN methods are CycleGAN, DCLGAN, SimDCL, and SSimDCL; all operate on unpaired images. DCLGAN and SimDCL are inspired by and derived from CycleGAN, with different cost functions and network structures. This study examines the methods thoroughly and notes their similarities and differences. After semantic segmentation, the results are presented both visually and with measurement metrics such as FID, KID, SCOOT, PSNR, FSIM, and SSIM. On these metrics, the SSimDCL method ranks first with 132.62071 FID, 0.07825 KID, 0.6406 SCOOT, 0.85973 PSNR, 37.862 FSIM, and 0.82725 SSIM; the SimDCL method shows the second-best performance with 149.82306 FID, 0.10215 KID, 0.60142 SCOOT, 0.85224 PSNR, 37.4747 FSIM, and 0.82429 SSIM. The CycleGAN method ranks last among the applied methods with 202.33857 FID, 0.16795 KID, 0.53218 SCOOT, 0.83408 PSNR, 35.7062 FSIM, and 0.7751 SSIM. Experimental studies show that the SSimDCL and SimDCL methods outperform the other methods in iSAID image semantic segmentation, whereas the CycleGAN method is less successful. The aim of this study is to perform automatic semantic segmentation of aerial images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
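Among the metrics reported in the entry above, PSNR has the simplest closed form: peak signal power over mean squared error, in decibels. A minimal sketch over tiny 8-bit grayscale arrays (the pixel values are hypothetical, and the study's FID/KID/SCOOT metrics need learned features and are not reproducible in a few lines):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-size 8-bit images."""
    flat_a = [v for row in img_a for v in row]
    flat_b = [v for row in img_b for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return float("inf")  # identical images: no noise at all
    return 10 * math.log10(max_val ** 2 / mse)

a = [[52, 55], [61, 59]]
b = [[54, 55], [61, 63]]  # small hypothetical reconstruction error
print(round(psnr(a, b), 2))  # → 41.14
```

Higher PSNR means the generated segmentation rendering is closer to the reference pixel-for-pixel, which is the direction in which SSimDCL beats CycleGAN in the table of results above.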
42. Improved YOLOv8 algorithms for small object detection in aerial imagery
- Author
-
Fei Feng, Yu Hu, Weipeng Li, and Feiyan Yang
- Subjects
Aerial images, YOLOv8, Small target detection, Attention mechanism, Multiscale feature fusion, Electronic computers. Computer science, QA75.5-76.95 - Abstract
In drone aerial target detection tasks, a high proportion of small targets and complex backgrounds often lead to false positives and missed detections, resulting in low detection accuracy. To improve small-target detection accuracy, this study proposes two improved models based on YOLOv8s, named IMCMD_YOLOv8_small and IMCMD_YOLOv8_large, each suited to different application scenarios. First, the network structure was optimized by removing the backbone P5 layer, which is used to detect large targets, and merging the P4, P3, and P2 layers, which are better suited to detecting medium and small targets; P3 and P2 serve as detection heads to focus more on small targets. Next, the coordinate attention mechanism was integrated into the backbone's C2f to create a C2f_CA module that enhances the model's focus on key information and secures a richer flow of gradient information. A multiscale attention feature fusion module was then designed to merge shallow and deep features. Finally, a Dynamic Head was introduced to unify the perception of scale, space, and tasks, further enhancing the detection capability for small targets. Experimental results on the VisDrone2019 dataset demonstrate that, compared with YOLOv8s, IMCMD_YOLOv8_small improved mAP@0.5 and mAP@0.5:0.95 by 7.7% and 5.1%, respectively, with a 73.0% reduction in parameter count. IMCMD_YOLOv8_large showed even greater improvements of 10.8% and 7.3%, respectively, with a 47.7% reduction in parameter count, displaying superior performance in small-target detection tasks. The improved models not only enhanced detection accuracy but also achieved model lightweighting, proving the effectiveness of the improvement strategies and showing superior performance compared with other classic models.
- Published
- 2024
- Full Text
- View/download PDF
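The IoU (Intersection over Union) metric cited in the abstract above is a standard overlap measure for segmentation masks. A minimal sketch of the per-class computation (generic NumPy code, not the authors' published implementation):

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union for two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, target).sum() / union

# toy 4x4 masks: predicted damage vs. ground truth
pred = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
gt   = np.array([[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
print(round(iou(pred, gt), 3))  # intersection 3, union 5 -> 0.6
```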
43. Efficient radiometric triangulation for aerial image consistency across inter and intra variances
- Author
-
Kunbo Liu, Yifan Liao, Kaijun Yang, Ke Xi, Qi Chen, Pengjie Tao, and Tao Ke
- Subjects
Aerial images ,Radiometric normalization ,BRDF effect ,Vignetting effect ,Color inconsistencies ,Physical geography ,GB3-5030 ,Environmental sciences ,GE1-350 - Abstract
In this paper, we present a novel approach to radiometric normalization for aerial images that effectively addresses both intra- and inter-image color and illumination inconsistencies caused by varying sensor technologies, atmospheric conditions, and ground reflection. We introduce a comprehensive mathematical model of radiance transfer that accounts for brightness inhomogeneities arising from the Bidirectional Reflectance Distribution Function (BRDF) effect and the vignetting effect, alongside color inconsistencies due to atmospheric and illumination variations. To enhance the model's efficiency and solution stability, we approximate it with multiple polynomial models, significantly reducing the complexity of rigorous solutions. Our efficient and robust solution builds on advanced radiometric triangulation theories, enabling effective radiometric normalization of aerial images. We validate the efficacy of our method with extensive testing on three large datasets, each containing over two hundred aerial images. The experimental results demonstrate our method's superiority over state-of-the-art methods, as evidenced by enhanced visual quality and improved quantitative performance in statistical assessments.
- Published
- 2024
- Full Text
- View/download PDF
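The polynomial approximation strategy mentioned above can be illustrated for the vignetting term alone. The sketch below assumes a simple even-order radial polynomial gain; the coefficients `a` and `b` are made-up values for illustration, not taken from the paper:

```python
import numpy as np

def radial_vignetting_gain(h, w, a=-0.3, b=-0.1):
    """Even-order radial polynomial gain g(r) = 1 + a*r^2 + b*r^4,
    with r the distance from the image centre, normalised to [0, 1]."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r2 = ((ys - cy) ** 2 + (xs - cx) ** 2) / (cy ** 2 + cx ** 2)
    return 1.0 + a * r2 + b * r2 ** 2

def correct_vignetting(img, a=-0.3, b=-0.1):
    """Undo the modelled brightness fall-off by dividing out the gain."""
    gain = radial_vignetting_gain(*img.shape[:2], a, b)
    return img / gain

flat = np.full((65, 65), 100.0)                    # ideal flat-field target
observed = flat * radial_vignetting_gain(65, 65)   # simulate corner fall-off
restored = correct_vignetting(observed)
print(np.allclose(restored, flat))  # True: fall-off fully removed
```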
44. A Multi-Objective Derived Adaptive TransDeepLabv3 Using Electric Fish Optimization Algorithm for Aerial Image Semantic Segmentation
- Author
-
P. Anilkumar, Dimitar Tokmakov, P. Venugopal, Srinivas Koppu, Nevena Mileva, and Anna Bekyarova-Tokmakova
- Subjects
Aerial images ,semantic segmentation ,DeepLabv3 ,transformer network ,multi-objective ,electric fish optimization ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
A multi-objective derived Adaptive TransDeepLabv3 (ATransDeepLabv3) combined with the meta-heuristic Electric Fish Optimization (EFO) algorithm is proposed in this paper. The DeepLabv3 network's segmentation results are unsatisfactory for unequal sample categories and data shortages, although it achieves higher segmentation accuracy for categories with adequate instances. To address the issue of uneven sample categories, we designed a DeepLabv3 network model that adds a Transformer encoder based on a multihead self-attention mechanism, together with the EFO algorithm. The EFO algorithm is used to improve the effectiveness of the proposed model, primarily to maximize the accuracy and Dice coefficient of ATransDeepLabv3 by tuning hyperparameters such as the number of epochs, the number of steps per epoch, and the hidden neuron count. The proposed Adaptive TransDeepLabv3 is tested on the public-domain Semantic Segmentation of Aerial Imagery dataset and obtains improved accuracy, Dice coefficient, Jaccard index, and F1-score values of 97.82%, 97.95%, 98.24%, and 97.95%, respectively, with a computational time of 119.2891 seconds. The suggested model's evolutionary outputs therefore outperform state-of-the-art approaches.
- Published
- 2024
- Full Text
- View/download PDF
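The Dice coefficient that the EFO algorithm maximizes is a standard pixel-overlap measure (equivalent to a pixel-level F1-score). A minimal sketch (generic code, not the authors' implementation):

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice coefficient for binary masks: 2|P∩T| / (|P| + |T|).
    eps guards against division by zero when both masks are empty."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0]])
gt   = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
print(round(dice(pred, gt), 3))  # intersection 2, |P| = |T| = 3 -> 0.667
```

Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), so the two objectives reward the same ranking of segmentations.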
45. MLFMNet: A Multilevel Feature Mining Network for Semantic Segmentation on Aerial Images
- Author
-
Xinyu Wei, Lei Rao, Guangyu Fan, and Niansheng Chen
- Subjects
Aerial images ,convolutional neural networks (CNNs) ,semantic segmentation ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Semantic segmentation of aerial images is crucial in various practical applications, encompassing traffic management, search tasks, urban planning, and more. However, due to the unique shooting angles of aerial images, accurately segmenting objects poses significant challenges, including large variations in object scale, deformations, and unclear features of small targets. To address this, we propose a multilevel feature mining network based on an encoder–decoder architecture, called MLFMNet, aimed at mining and integrating multilevel feature information in aerial images to enhance segmentation accuracy and robustness. MLFMNet leverages skip connections to obtain hierarchical feature representations from the encoder. Subsequently, through a learnable fusion module and a feature reconstruction module in the proposed decoder, it progressively fuses and reconstructs these features, thereby achieving accurate semantic segmentation. To tackle large size variations and deformations in objects, we design an irregular pyramid receptive field module embedded at the bottom of the encoder to capture receptive fields from multiple feature vectors, further mining abstract features. Moreover, to address the low segmentation and detection accuracy for small targets, a fine-grained feature mining module is embedded at the bottom of the decoder to capture spatial detail features. Notably, MLFMNet-B achieves an mIoU of 70.8%, ranking fourth on the official leaderboard of the UAVid test set.
- Published
- 2024
- Full Text
- View/download PDF
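The mIoU figure reported for MLFMNet-B averages per-class IoU over the label set. A minimal sketch of that computation from integer label maps (generic code, not from the paper):

```python
import numpy as np

def miou(pred, target, n_classes):
    """Mean IoU over classes; classes absent from both the prediction
    and the ground truth are skipped rather than scored."""
    ious = []
    for c in range(n_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# toy 3x3 label maps with three classes
pred = np.array([[0, 0, 1], [0, 1, 1], [2, 2, 2]])
gt   = np.array([[0, 0, 1], [0, 0, 1], [2, 2, 2]])
print(round(miou(pred, gt, 3), 3))  # (0.75 + 0.667 + 1.0) / 3 -> 0.806
```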
46. An Intelligent System for Outfall Detection in UAV Images Using Lightweight Convolutional Vision Transformer Network
- Author
-
Mingxin Yu, Ji Zhang, Lianqing Zhu, Shengjun Liang, Wenshuai Lu, and Xinglong Ji
- Subjects
Aerial images ,deep learning ,outfall detection ,vision transformer ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Unmanned aerial vehicle aerial photography technology has become a crucial tool for detecting outfalls that discharge into rivers and oceans. However, the current retrieval process in aerial images relies heavily on visual interpretation by skilled experts, which is time-consuming and inefficient. To address this issue, we propose a lightweight deep-learning model for detecting outfall objects in aerial images. Specifically, the backbone of our proposed model is a lightweight convolutional vision transformer network, which consists of two novel blocks: separated downsampled self-attention and convolutional feedforward network with a shortcut. These blocks are designed to capture information at different granularities in the feature map and build both local and global representations. The model utilizes a path aggregation feature pyramid network as the neck and a lightweight decoupled network as the head. The experiments demonstrate that our model achieves the highest accuracy of 81.5% while utilizing only 2.47 M parameters and 3.95 GFLOPs. Visualization analysis shows that our model pays more attention to true outfall objects. Additionally, we have developed an intelligent outfall detection system based on the proposed model, and the experimental results show that it performs well in the task of outfall detection.
- Published
- 2024
- Full Text
- View/download PDF
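The parameter budget quoted above (2.47 M parameters, 3.95 GFLOPs) reflects the lightweight backbone design. As a generic illustration of why lightweight blocks shrink models (this is not the paper's architecture), compare the learnable-parameter count of a standard convolution with a depthwise-separable one:

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Learnable parameters of a standard k x k 2D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)

def separable_conv2d_params(c_in, c_out, k, bias=True):
    """Depthwise (k x k per input channel) + pointwise (1 x 1) convolution,
    a common substitution in lightweight backbones."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise

print(conv2d_params(64, 128, 3))            # 73856
print(separable_conv2d_params(64, 128, 3))  # 8960 -> roughly 8x fewer
```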
47. Effective Anchor Adaptation and Feature Enhancement Strategies for Tiny Object Detection in Aerial Images
- Author
-
Haoguang Liu, Qiang Tong, Lin Miao, and Xiulei Liu
- Subjects
Deep learning ,aerial images ,tiny object detection ,anchor adaptation ,feature enhancement ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
In recent years, research based on anchor-based two-stage detectors has achieved great performance improvements in aerial object detection tasks. However, these detectors still have two significant problems with tiny objects: 1) The preset fixed anchor is not conducive to assigning positive and negative samples in the RPN when dealing with tiny objects, resulting in low-quality samples. 2) When the detector encounters tiny objects lacking structural details, it fails to accurately represent features, causing divergence in object features and hindering network learning. In this work, we propose the Anchor Adaptation and Feature Enhancement Strategies (AFS) to alleviate these two problems. AFS contains two optimized modules: the Anchor Adaptation RPN Head (A2RH) and the Feature Enhanced Attention Module (FEAM). Specifically, A2RH performs anchor-adaptive learning by establishing a new anchor bias learning branch from the feature map, enabling higher-quality positive and negative sample assignments in the RPN. FEAM introduces global features and mask attention based on the FPN, and presents Gaussian mask supervision for attention to obtain stronger feature representations. Experiments show that our method improves average precision by 1.8% over the baseline model and achieves state-of-the-art results on the AI-TOD dataset. Moreover, validation on the AI-TOD-v2 and VisDrone2019 datasets also confirms the effectiveness of our method. The code will soon be available at https://github.com/gravity-lhg/AFS.
- Published
- 2024
- Full Text
- View/download PDF
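The Gaussian mask supervision used by FEAM steers attention toward object regions. A hedged sketch of how such a Gaussian target could be generated from a bounding box (illustrative only; `sigma_scale` and the exact box-to-sigma mapping are assumptions, not details from the paper):

```python
import numpy as np

def gaussian_mask(h, w, box, sigma_scale=0.5):
    """Gaussian attention target centred on a box (x1, y1, x2, y2);
    sigma is tied to the box size so tiny objects get tight peaks."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    sx = max((x2 - x1) * sigma_scale, 1e-6)
    sy = max((y2 - y1) * sigma_scale, 1e-6)
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-(((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2) / 2)

# tiny object occupying a 4x4 box on a 16x16 feature map
m = gaussian_mask(16, 16, (6, 6, 10, 10))
print(m[8, 8])  # 1.0: peak at the box centre, decaying outward
```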
48. An Attentive Hough Transform Module for Building Extraction From High Resolution Aerial Imagery
- Author
-
Souad Yahia Berrouiguet, Ehlem Zigh, and Mohammed Djebouri
- Subjects
Aerial images ,AttHT-IHT ,building extraction ,deep learning ,U-Net ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
In the era of abundant high-resolution aerial imagery, the automatic extraction of buildings is indispensable for applications like disaster response, environmental monitoring, and urban growth analysis. Deep learning approaches, particularly fully convolutional networks, have exhibited remarkable performance in this challenging task. Nevertheless, the accurate identification and delineation of building boundaries pose persistent challenges that hinder further improvements in building extraction precision. To tackle these challenges, we introduce a novel deep learning architecture explicitly designed for building extraction in high-resolution aerial images. Our method addresses the precise identification of building borders by combining local and global contextual information. We efficiently preserve object boundaries and optimize the representation of straight lines within buildings by integrating the Attentive Hough Transform and Inverse Hough Transform (AttHT-IHT) module into the U-Net architecture. Extensive experiments on the Potsdam dataset show substantial gains in building extraction accuracy, with 97.73% accuracy and a 96.42% recall rate. Generalization was assessed on the WHU satellite dataset I to validate the adaptability of our proposed method.
- Published
- 2024
- Full Text
- View/download PDF
49. Scene-level buildings damage recognition based on Cross Conv-Transformer
- Author
-
Lingfei Shi, Feng Zhang, Junshi Xia, and Jibo Xie
- Subjects
scene recognition ,damaged buildings ,aerial images ,transformer ,Mathematical geography. Cartography ,GA1-1776 - Abstract
Unlike pixel-based and object-based image recognition, a broader scene-level perspective can improve the efficiency of assessing large-scale building damage. However, the complexity of disaster scenes and the scarcity of datasets are major challenges in identifying building damage. To address these challenges, the Cross Conv-Transformer model is proposed to classify and evaluate the degree of damage to buildings using aerial images taken after earthquakes. We employ Conv-Embedding and Conv-Projection to extract features from the images. The integration of convolution and Transformer reduces the computational burden of the model while enhancing its feature extraction capabilities. Furthermore, a two-branch Conv-Transformer architecture with global and local attention is designed, allowing each branch to focus on global and local features, respectively. A cross-attention fusion module merges feature information from the two branches to enrich the classification features. Finally, we use aerial images captured after the Beichuan and Yushu earthquakes as training and test sets to assess the model. The proposed Cross Conv-Transformer improves classification accuracy by 4.7% and 2.1% compared with ViT and EfficientNet, respectively. The results show that the Cross Conv-Transformer significantly reduces misclassification between the severely and moderately damaged categories.
- Published
- 2023
- Full Text
- View/download PDF
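The cross-attention fusion described above can be sketched as scaled dot-product attention in which queries come from one branch and keys/values from the other, so each query token is re-expressed as a weighted mixture of the opposite branch's features (a generic single-head illustration, not the authors' module):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v):
    """Single-head cross-attention: weights = softmax(Q K^T / sqrt(d)),
    output = weights V. Q is one branch's tokens; K, V the other's."""
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (Nq, Nk)
    return weights @ v                                # (Nq, d_v)

rng = np.random.default_rng(0)
global_tokens = rng.standard_normal((4, 8))  # hypothetical global-branch tokens
local_tokens = rng.standard_normal((6, 8))   # hypothetical local-branch tokens
fused = cross_attention(global_tokens, local_tokens, local_tokens)
print(fused.shape)  # (4, 8): global tokens enriched with local features
```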
50. MS-YOLO: integration-based multi-subnets neural network for object detection in aerial images
- Author
-
Cao, Xinyu, Duan, Minglei, Ding, Hongwei, and Yang, Zhijun
- Published
- 2024
- Full Text
- View/download PDF