ESRTMDet: An End-to-End Super-Resolution Enhanced Real-Time Rotated Object Detector for Degraded Aerial Images

Authors :
Fei Liu
Renwen Chen
Junyi Zhang
Shanshan Ding
Hao Liu
Shaofei Ma
Kailing Xing
Source :
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol 16, Pp 4983-4998 (2023)
Publication Year :
2023
Publisher :
IEEE, 2023.

Abstract

Degraded image resolution reduces detection performance in aerial imagery because it produces a large number of small objects, and accurately detecting these small objects remains a challenge. Most existing methods address this by first using a superresolution (SR) model to obtain an SR image from the low-resolution degraded image ($I^{\text{LR}}$) and then feeding that image to the object detection (OD) network. However, running a complex SR network before the detector is time-consuming and makes real-time inference difficult. To address this challenge, we propose a simple and effective rotated small OD method, named end-to-end superresolution enhanced real-time rotated object detector (ESRTMDet). First, we design a lightweight embedded feature map superresolution module (ESRM) inside the detection model to enhance and amplify the backbone output features, making it easier for the detection heads to detect small objects. Furthermore, we simultaneously train a parallel SR network branch (PSRB) that uses the backbone features to restore a high-resolution image. Through our proposed feature alignment loss and feature affinity layer, the PSRB effectively guides the feature map enhancement of the ESRM. Finally, through end-to-end joint optimization of the detector and the PSRB, the detection performance on $I^{\text{LR}}$ is significantly improved. Extensive experiments on DOTA and UCAS-AOD demonstrate that our method achieves state-of-the-art results. In addition, we discard the PSRB and use $I^{\text{LR}}$ as the input during inference, which reduces the inference time of our model. As a result, our ESRTMDet-X achieves 77.11% mean average precision on the degraded DOTA dataset at an inference speed of 337 FPS, obtaining the best speed–accuracy tradeoff.
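
The training-versus-inference split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyDetector`, the function `joint_loss`, the weights `w_sr` and `w_align`, and the use of a mean squared error as the alignment term are all assumptions made for illustration, and plain Python lists stand in for feature maps. The feature affinity layer is not modelled. The only behaviour taken from the abstract is that an auxiliary SR branch contributes to the loss during training and is skipped at inference.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Features = List[float]  # toy stand-in for a feature map


@dataclass
class ToyDetector:
    """Hypothetical detector with an auxiliary SR branch used only in training."""
    backbone: Callable[[List[float]], Features]
    enhance: Callable[[Features], Features]    # plays the role of the ESRM
    head: Callable[[Features], List[float]]    # detection head
    sr_branch: Callable[[Features], Features]  # plays the role of the PSRB

    def forward(
        self, lr_image: List[float], training: bool
    ) -> Tuple[List[float], Features, Optional[Features]]:
        feats = self.backbone(lr_image)
        enhanced = self.enhance(feats)
        preds = self.head(enhanced)
        # The auxiliary branch runs only during training; at inference it is
        # skipped entirely, so it adds no cost.
        sr_feats = self.sr_branch(feats) if training else None
        return preds, enhanced, sr_feats


def mse(a: Features, b: Features) -> float:
    """Mean squared difference, used here as an assumed alignment measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(
    det_loss: float,
    sr_loss: float,
    enhanced: Features,
    sr_feats: Features,
    w_sr: float = 1.0,
    w_align: float = 1.0,
) -> float:
    """Weighted sum of detection, SR, and alignment terms (weights are assumptions)."""
    return det_loss + w_sr * sr_loss + w_align * mse(enhanced, sr_feats)


# Toy components so the sketch runs end to end.
model = ToyDetector(
    backbone=lambda img: list(img),
    enhance=lambda f: [2 * x for x in f],
    head=lambda f: [sum(f)],
    sr_branch=lambda f: [2 * x + 1 for x in f],
)

train_out = model.forward([1.0, 2.0], training=True)
infer_out = model.forward([1.0, 2.0], training=False)
```

In this sketch, calling `forward` with `training=False` returns `None` in place of the SR-branch features, mirroring the idea that only the detector path is executed on $I^{\text{LR}}$ at inference.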

Details

Language :
English
ISSN :
2151-1535
Volume :
16
Database :
Directory of Open Access Journals
Journal :
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Publication Type :
Academic Journal
Accession number :
edsdoj.9b472233ac3941a6969b2ae4bfeb83ae
Document Type :
article
Full Text :
https://doi.org/10.1109/JSTARS.2023.3278295