1. HeightFormer: A Multilevel Interaction and Image-Adaptive Classification–Regression Network for Monocular Height Estimation with Aerial Images.
- Author
- Chen, Zhan; Zhang, Yidan; Qi, Xiyu; Mao, Yongqiang; Zhou, Xin; Wang, Lei; and Ge, Yunping
- Subjects
- *MONOCULARS, *IMAGE segmentation, *PIXELS, *COMPUTATIONAL complexity
- Abstract
- Height estimation has long been a pivotal topic in measurement and remote sensing, and monocular height estimation offers wide-ranging data sources and convenient deployment. This paper addresses two challenges in existing monocular height estimation methods: the difficulty of simultaneously achieving high-quality instance-level height and edge reconstruction, and high computational complexity. It presents a comprehensive solution for monocular height estimation in remote sensing, termed HeightFormer, which combines multilevel interactions with image-adaptive classification–regression. HeightFormer features the Multilevel Interaction Backbone (MIB) and the Image-adaptive Classification–regression Height Generator (ICG). MIB supplements the fixed sample grid in the CNN of the conventional backbone network with tokens of different interaction ranges. It is complemented by a pixel-, patch-, and feature map-level hierarchical interaction mechanism that relays spatial geometry information across scales and introduces a global receptive field to enhance the quality of instance-level height estimation. The ICG dynamically generates a height partition for each image and reframes the traditional regression task as a coarse-to-fine classification–regression refinement, which significantly mitigates the innate ill-posedness of the problem and drastically improves edge sharpness. Finally, the study conducts experimental validations on the Vaihingen and Potsdam datasets, with results demonstrating that the proposed method surpasses existing techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024