Self-Supervised Pretraining With Monocular Height Estimation for Semantic Segmentation
- Source :
- IEEE Transactions on Geoscience and Remote Sensing; 2024, Vol. 62, Issue 1, pp. 1-12 (12 pages)
- Publication Year :
- 2024
Abstract
- Monocular height estimation (MHE) is key for generating 3-D city models, which are essential for swift disaster response. Moving beyond the traditional focus on performance enhancement, our study probes the interpretability of MHE networks. We find that neurons within MHE models are selective for both height and semantic classes. This insight sheds light on the inner workings of MHE models and suggests strategies for leveraging elevation data more effectively. Building on it, we propose a framework that employs MHE as a self-supervised pretraining method for remote sensing (RS) imagery, which significantly improves performance on semantic segmentation tasks. Furthermore, we develop a disentangled latent transformer (DLT) module that leverages explainable deep representations from pretrained MHE networks for unsupervised semantic segmentation. Our method demonstrates the potential of MHE tasks for building foundation models for pixel-level semantic analyses. Additionally, we present a new dataset designed to benchmark both semantic segmentation and height estimation. The dataset and code will be publicly available at https://github.com/zhu-xlab/DLT-MHE.pytorch.
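- The sketch below illustrates, in PyTorch, the pretraining idea the abstract describes: an encoder is first trained to regress per-pixel height from a single image (MHE), then reused as initialization for a semantic segmentation head. This is not the authors' implementation (see their repository linked above); the network shapes, function names, and training loop are illustrative assumptions only.

```python
# Minimal sketch of "MHE pretraining -> segmentation fine-tuning".
# All module/function names and shapes here are hypothetical, not from the paper's code.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Toy convolutional backbone standing in for the paper's MHE encoder."""
    def __init__(self, in_ch: int = 3, width: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.features(x)


class PixelHead(nn.Module):
    """1x1 conv head: out_ch=1 for height regression, out_ch=K for K semantic classes."""
    def __init__(self, width: int, out_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(width, out_ch, 1)

    def forward(self, feats):
        return self.proj(feats)


def pretrain_mhe(encoder, images, heights, epochs: int = 1, lr: float = 1e-3):
    """Pretraining stage: regress the height map from a single image (L1 loss)."""
    head = PixelHead(32, 1)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=lr)
    loss_fn = nn.L1Loss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(head(encoder(images)), heights)
        loss.backward()
        opt.step()
    return encoder


def finetune_segmentation(encoder, images, masks, num_classes: int,
                          epochs: int = 1, lr: float = 1e-3):
    """Fine-tuning stage: reuse the MHE-pretrained encoder with a segmentation head."""
    head = PixelHead(32, num_classes)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(head(encoder(images)), masks)
        loss.backward()
        opt.step()
    return encoder, head


if __name__ == "__main__":
    # Tiny random tensors stand in for RS imagery, height maps, and label masks.
    imgs = torch.randn(2, 3, 64, 64)
    hts = torch.randn(2, 1, 64, 64)
    lbls = torch.randint(0, 5, (2, 64, 64))
    enc = pretrain_mhe(Encoder(), imgs, hts)
    finetune_segmentation(enc, imgs, lbls, num_classes=5)
```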
Details
- Language :
- English
- ISSN :
- 0196-2892 (print) and 1558-0644 (electronic)
- Volume :
- 62
- Issue :
- 1
- Database :
- Supplemental Index
- Journal :
- IEEE Transactions on Geoscience and Remote Sensing
- Publication Type :
- Periodical
- Accession number :
- ejs66997285
- Full Text :
- https://doi.org/10.1109/TGRS.2024.3412629