1. HTCViT: an effective network for image classification and segmentation based on natural disaster datasets.
- Author
- Ma, Zhihao; Li, Wei; Zhang, Muyang; Meng, Weiliang; Xu, Shibiao; Zhang, Xiaopeng
- Subjects
IMAGE recognition (Computer vision), IMAGE segmentation, NATURAL disasters, STRENGTH training
- Abstract
Classifying and segmenting natural disaster images is crucial for predicting and responding to disasters. However, current convolutional networks perform poorly in processing natural disaster images, and there are few proprietary networks for this task. To address the varying scales of the region of interest (ROI) in these images, we propose the Hierarchical TSAM-CB-ViT (HTCViT) network, which builds on the ViT network's attention mechanism to better process natural disaster images. Considering that ViT excels at extracting global context but struggles with local features, our method combines the strengths of ViT and convolution, and can capture overall contextual information within each patch using the Triple-Strip Attention Mechanism (TSAM) structure. Experiments validate that our HTCViT improves the classification task by 3-4% and the segmentation task by 1-2% on natural disaster datasets compared to the vanilla ViT network. [ABSTRACT FROM AUTHOR]
- Published
- 2023