A CNN-Transformer Hybrid Model Based on CSWin Transformer for UAV Image Object Detection

Authors :
Wanjie Lu
Chaozhen Lan
Chaoyang Niu
Wei Liu
Liang Lyu
Qunshan Shi
Shiju Wang
Source :
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol 16, Pp 1211-1231 (2023)
Publication Year :
2023
Publisher :
IEEE, 2023.

Abstract

Object detection in unmanned aerial vehicle (UAV) images has widespread applications in numerous fields; however, the complex backgrounds, diverse scales, and uneven distribution of objects in UAV images make it a challenging task. This study proposes a convolutional neural network (CNN)-transformer hybrid model for efficient object detection in UAV images. The model has three features that contribute to improved detection performance. First, the efficient and effective cross-shaped window (CSWin) transformer is used as the backbone to obtain image features at different levels, and these features are fed into a feature pyramid network to achieve a multiscale representation, which benefits multiscale object detection. Second, a hybrid patch embedding module is constructed to extract and utilize low-level information such as the edges and corners of the image. Finally, a slicing-based inference method is constructed to fuse the inference results of the original image and of sliced images, which improves small object detection accuracy without modifying the original network. Experimental results on public datasets show that the proposed method achieves better detection performance than several popular and state-of-the-art object detection methods.
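
The slicing-based inference step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the results from the original image and the slices are fused, so class-wise non-maximum suppression (NMS) is used here purely as an assumed stand-in. The names `make_slices`, `nms`, and `sliced_inference`, the detector callable signature, and the default slice size, overlap, and IoU threshold are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

# (x1, y1, x2, y2, score, label); this box layout is an assumption for the sketch.
Box = Tuple[float, float, float, float, float, int]
Window = Tuple[int, int, int, int]  # crop bounds (x0, y0, x1, y1)
# Hypothetical detector: takes a crop window, returns boxes relative to that crop.
Detector = Callable[[Window], List[Box]]


def make_slices(width: int, height: int, size: int, overlap: float) -> List[Window]:
    """Return overlapping crop windows that cover the whole image."""
    step = max(1, int(size * (1.0 - overlap)))
    xs = list(range(0, max(width - size, 0) + 1, step))
    ys = list(range(0, max(height - size, 0) + 1, step))
    # Add a final window flush with the right/bottom edge if it is not covered.
    if xs[-1] + size < width:
        xs.append(width - size)
    if ys[-1] + size < height:
        ys.append(height - size)
    return [(x, y, min(x + size, width), min(y + size, height)) for y in ys for x in xs]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_thr: float) -> List[Box]:
    """Greedy class-wise NMS (assumed fusion rule, not confirmed by the source)."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(box[5] != k[5] or iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


def sliced_inference(detect: Detector, width: int, height: int,
                     size: int = 512, overlap: float = 0.2,
                     iou_thr: float = 0.5) -> List[Box]:
    """Fuse detections from the full image and from each slice."""
    detections = list(detect((0, 0, width, height)))  # pass over the original image
    for x0, y0, x1, y1 in make_slices(width, height, size, overlap):
        for bx1, by1, bx2, by2, score, label in detect((x0, y0, x1, y1)):
            # Shift slice-local coordinates back into the full-image frame.
            detections.append((bx1 + x0, by1 + y0, bx2 + x0, by2 + y0, score, label))
    return nms(detections, iou_thr)
```

Because the detector is passed in as a callable, the sketch reflects the abstract's point that slicing happens around the network and leaves the network itself unchanged.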

Details

Language :
English
ISSN :
2151-1535
Volume :
16
Database :
Directory of Open Access Journals
Journal :
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Publication Type :
Academic Journal
Accession number :
edsdoj.fdbb51c46f824e46a17513bbed48f89e
Document Type :
article
Full Text :
https://doi.org/10.1109/JSTARS.2023.3234161