
AerialFormer: Multi-Resolution Transformer for Aerial Image Segmentation.

Authors :
Hanyu, Taisei
Yamazaki, Kashu
Tran, Minh
McCann, Roy A.
Liao, Haitao
Rainwater, Chase
Adkins, Meredith
Cothren, Jackson
Le, Ngan
Source :
Remote Sensing; Aug 2024, Vol. 16 Issue 16, p2930, 27p
Publication Year :
2024

Abstract

Remote sensing image segmentation poses several challenges: strong foreground–background imbalance, tiny objects, high object density, intra-class heterogeneity, and inter-class homogeneity. To address them, this paper introduces AerialFormer, a hybrid model that combines the strengths of Transformers and Convolutional Neural Networks (CNNs). AerialFormer includes a CNN Stem module that preserves low-level, high-resolution features, improving the model's ability to process fine details in aerial imagery. The model has a hierarchical structure in which a Transformer encoder generates multi-scale features and a multi-dilated CNN (MDC) decoder aggregates information from those multi-scale inputs. Information is therefore taken into account in both local and global contexts, which yields powerful representations and high-resolution segmentation. AerialFormer was evaluated on three benchmark datasets: iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that it outperforms state-of-the-art methods on these datasets. [ABSTRACT FROM AUTHOR]
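
The abstract does not specify how the multi-dilated CNN (MDC) decoder is built, so the following is only a minimal sketch of the general idea behind multi-dilated convolution: the same input is filtered in parallel at several dilation rates, so each branch sees a different amount of context. It is written in plain Python on a 1D signal for readability. The function names, the 3-tap kernel, and the dilation rates (1, 2, 3) are illustrative assumptions, not values taken from the paper.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated cross-correlation.

    Assumes an odd-length kernel so that it can be centred on each position.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Taps are spaced `dilation` positions apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def multi_dilated_block(signal, kernel, dilations=(1, 2, 3)):
    """Run the same kernel at several dilation rates (hypothetical rates).

    Returns one output per branch; a real decoder would typically
    concatenate or fuse these branches along the channel dimension.
    """
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d, branch in zip((1, 2, 3), multi_dilated_block(impulse, [1, 1, 1])):
        print(f"dilation={d} receptive_field={receptive_field(3, d)} -> {branch}")
```

Feeding an impulse through the block shows how a larger dilation spreads the response over a wider span without adding kernel weights, which is the property that lets small kernels capture both local and wider context.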

Details

Language :
English
ISSN :
2072-4292
Volume :
16
Issue :
16
Database :
Complementary Index
Journal :
Remote Sensing
Publication Type :
Academic Journal
Accession number :
179355243
Full Text :
https://doi.org/10.3390/rs16162930