Back to Search
Start Over
Object Detection in Multispectral Remote Sensing Images Based on Cross-Modal Cross-Attention.
- Source :
-
Sensors (14248220) . Jul2024, Vol. 24 Issue 13, p4098. 20p. - Publication Year :
- 2024
-
Abstract
- In complex environments a single visible image is not good enough to perceive the environment, this paper proposes a novel dual-stream real-time detector designed for target detection in extreme environments such as nighttime and fog, which is able to efficiently utilise both visible and infrared images to achieve Fast All-Weatherenvironment sensing (FAWDet). Firstly, in order to allow the network to process information from different modalities simultaneously, this paper expands the state-of-the-art end-to-end detector YOLOv8, the backbone is expanded in parallel as a dual stream. Then, for purpose of avoid information loss in the process of network deepening, a cross-modal feature enhancement module is designed in this study, which enhances each modal feature by cross-modal attention mechanisms, thus effectively avoiding information loss and improving the detection capability of small targets. In addition, for the significant differences between modal features, this paper proposes a three-stage fusion strategy to optimise the feature integration through the fusion of spatial, channel and overall dimensions. It is worth mentioning that the cross-modal feature fusion module adopts an end-to-end training approach. Extensive experiments on two datasets validate that the proposed method achieves state-of-the-art performance in detecting small targets. The cross-modal real-time detector in this study not only demonstrates excellent stability and robust detection performance, but also provides a new solution for target detection techniques in extreme environments. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 14248220
- Volume :
- 24
- Issue :
- 13
- Database :
- Academic Search Index
- Journal :
- Sensors (14248220)
- Publication Type :
- Academic Journal
- Accession number :
- 178413296
- Full Text :
- https://doi.org/10.3390/s24134098