
Transformer-based cross-modality interaction guidance network for RGB-T salient object detection.

Authors :
Luo, Jincheng
Li, Yongjun
Li, Bo
Zhang, Xinru
Li, Chaoyue
Chenjin, Zhimin
He, Jingyi
Liang, Yifei
Source :
Neurocomputing. Oct2024, Vol. 600, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Finding effective multimodal fusion strategies remains a challenge for RGB-T salient object detection (SOD). Most RGB-T SOD methods acquire complementary modal features from foreground information while ignoring the importance of background information for localizing salient objects. In addition, fusing features without filtering information may introduce noise. To address these problems, this paper proposes a new cross-modal interaction guidance network (CIGNet) for RGB-T salient object detection. Specifically, we construct a transformer-based dual-stream encoder to extract multimodal features. In the decoder, we propose an attention-based modal information complementary module (MICM) that captures cross-modal complementary information for global comparison and salient object localization. Based on the MICM features, we design a multi-scale adaptive fusion module (MAFM) to find the optimal salient region during multi-scale fusion and to reduce redundant features. To enhance the completeness of salient features after multi-scale feature fusion, this paper proposes a saliency region mining module (SRMM), which corrects the features in the boundary neighborhood by exploiting the differences between foreground and background pixels and the boundary. Comparisons with other state-of-the-art methods on three RGB-T datasets and five RGB-D datasets demonstrate the superiority and broad applicability of the proposed CIGNet.

• The significance of thermal images in salient object detection is investigated.
• Attention mechanisms are used to capture multimodal complementary features.
• Background pixels are important for correcting salient object boundary features.

[ABSTRACT FROM AUTHOR]
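The abstract does not give the internals of MICM, so the following is only a generic illustration of attention-based cross-modal fusion, not the authors' module. It is a minimal sketch, assuming that RGB tokens act as queries over thermal tokens (standard scaled dot-product attention) and that the attended thermal context is added back to the RGB tokens. The function names `cross_attention` and `fuse`, the residual addition, and the toy inputs are all hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dimensional tokens from one modality.
    keys, values: tokens from the other modality (same count).
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def fuse(rgb_tokens, thermal_tokens):
    """Hypothetical fusion: add thermal context to each RGB token (residual)."""
    attended = cross_attention(rgb_tokens, thermal_tokens, thermal_tokens)
    return [[r + a for r, a in zip(rt, at)] for rt, at in zip(rgb_tokens, attended)]


# Toy inputs (illustrative values only): 2 RGB tokens, 3 thermal tokens, d = 2.
rgb = [[1.0, 0.0], [0.0, 1.0]]
thermal = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
fused = fuse(rgb, thermal)
```

In this sketch the fused output keeps one token per RGB token, so it could be applied symmetrically (thermal tokens querying RGB tokens) to obtain a second, thermal-centred stream; whether CIGNet does this is not stated in the abstract.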

Details

Language :
English
ISSN :
0925-2312
Volume :
600
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
178941727
Full Text :
https://doi.org/10.1016/j.neucom.2024.128149