AerialFormer: Multi-Resolution Transformer for Aerial Image Segmentation.
- Source :
- Remote Sensing; Aug 2024, Vol. 16 Issue 16, p2930, 27p
- Publication Year :
- 2024
Abstract
- Remote sensing image segmentation poses several challenges: a strong foreground–background imbalance, tiny objects, high object density, intra-class heterogeneity, and inter-class homogeneity. To address them, this paper introduces AerialFormer, a hybrid model that combines the strengths of Transformers and Convolutional Neural Networks (CNNs). AerialFormer includes a CNN Stem module that preserves low-level, high-resolution features, improving the model's ability to capture fine details in aerial imagery. The model has a hierarchical structure in which a Transformer encoder generates multi-scale features and a multi-dilated CNN (MDC) decoder aggregates information from these multi-scale inputs. Information is thus taken into account in both local and global contexts, yielding powerful representations and high-resolution segmentation. AerialFormer was evaluated on three benchmark datasets: iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that it outperforms state-of-the-art methods on these datasets. [ABSTRACT FROM AUTHOR]
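The abstract names a multi-dilated CNN (MDC) decoder but does not describe its internals. As a rough illustration of the general idea behind dilated convolutions at several rates, here is a minimal pure-Python sketch. It is not the authors' MDC block: the function names `dilated_conv2d`, `multi_dilated_features`, and `receptive_field`, the single-channel input, and the dilation rates (1, 2, 3) are all assumptions made for this example.

```python
# Hypothetical sketch of parallel dilated convolutions on one channel.
# NOT the AerialFormer MDC block; names and dilation rates are assumptions.

def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D cross-correlation with zero padding and a dilation rate."""
    h, w = len(image), len(image[0])
    k = len(kernel)          # square kernel with odd side length
    r = k // 2               # kernel radius in taps
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Taps are spaced `dilation` pixels apart.
                    yy = y + (i - r) * dilation
                    xx = x + (j - r) * dilation
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_dilated_features(image, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates, one map per rate."""
    return [dilated_conv2d(image, kernel, d) for d in dilations]


def receptive_field(kernel_size, dilation):
    """Side length, in pixels, covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


if __name__ == "__main__":
    # A 7x7 image with a single bright pixel in the center.
    image = [[0.0] * 7 for _ in range(7)]
    image[3][3] = 1.0
    kernel = [[1.0] * 3 for _ in range(3)]
    for d, fmap in zip((1, 2, 3), multi_dilated_features(image, kernel)):
        print("dilation", d, "receptive field", receptive_field(3, d))
        for row in fmap:
            print(" ".join(str(int(v)) for v in row))
```

With the same 3x3 kernel, a larger dilation rate spreads the taps farther apart, so each output pixel draws on a wider neighborhood without adding parameters. Running several rates in parallel is one common way to mix local and wider context, which is the property the abstract attributes to the decoder.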
Details
- Language :
- English
- ISSN :
- 2072-4292
- Volume :
- 16
- Issue :
- 16
- Database :
- Complementary Index
- Journal :
- Remote Sensing
- Publication Type :
- Academic Journal
- Accession number :
- 179355243
- Full Text :
- https://doi.org/10.3390/rs16162930