Attention-guided cross-modal multiple feature aggregation network for RGB-D salient object detection.
- Source :
- Electronic Research Archive. 2024, Vol. 32 Issue 1, p1-27. 27p.
- Publication Year :
- 2024
Abstract
- The goal of RGB-D salient object detection (SOD) is to aggregate information from the RGB and depth modalities to accurately detect and segment salient objects. Existing RGB-D SOD models can extract the multilevel features of a single modality well and can also integrate cross-modal features, but they can rarely handle both at the same time. To exploit the correlations of intra- and inter-modality information, in this paper we proposed an attention-guided cross-modal multi-feature aggregation network for RGB-D SOD. Our motivation was that both cross-modal feature fusion and multilevel feature fusion are crucial for the RGB-D SOD task. The main innovation of this work lies in two points: one is the cross-modal pyramid feature interaction (CPFI) module, which integrates multilevel features from both the RGB and depth modalities in a bottom-up manner, and the other is the cross-modal feature decoder (CMFD), which aggregates the fused features to generate the final saliency map. Extensive experiments on six benchmark datasets showed that the proposed attention-guided cross-modal multiple feature aggregation network (ACFPA-Net) achieved competitive performance against 15 state-of-the-art (SOTA) RGB-D SOD methods, both qualitatively and quantitatively. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 2688-1594
- Volume :
- 32
- Issue :
- 1
- Database :
- Academic Search Index
- Journal :
- Electronic Research Archive
- Publication Type :
- Academic Journal
- Accession number :
- 178380251
- Full Text :
- https://doi.org/10.3934/era.2024031