CSNet: a ConvNeXt-based Siamese network for RGB-D salient object detection.

Authors :: Zhang, Yunhua
Wang, Hangxu
Yang, Gang
Zhang, Jianhao
Gong, Congjin
Wang, Yutao
Source :: Visual Computer; Mar2024, Vol. 40 Issue 3, p1805-1823, 19p
Publication Year :: 2024
Abstract: Global context is critical to locating salient objects in salient object detection (SOD). However, the convolution operation in CNNs has a local receptive field and therefore cannot capture long-range global information. Recent studies have shown that modernized CNN models with large-kernel convolutions, such as ConvNeXt, can effectively extend the receptive field. Building on this finding, this paper explores the potential of large-kernel CNNs for the SOD task. Inspired by the information that RGB and depth images share about salient objects, we propose a ConvNeXt-based Siamese network with shared weight parameters. This structural design effectively reduces the number of parameters without sacrificing performance. Furthermore, a depth information preprocessing module is proposed to minimize the impact of low-quality depth images on the predicted saliency maps. For cross-modal feature interaction, a dynamic fusion module is designed to dynamically enhance cross-modal complementarity. Extensive experiments and evaluation results on six benchmark datasets demonstrate the outstanding performance of the proposed method against 14 state-of-the-art RGB-D methods. Our code will be released at https://github.com/zyh5119232/CSNet. [ABSTRACT FROM AUTHOR]