
SSTNet: Saliency sparse transformers network with tokenized dilation for salient object detection.

Authors :
Yang, Mo
Liu, Ziyan
Dong, Wen
Wu, Ying
Source :
IET Image Processing (Wiley-Blackwell). 11/13/2023, Vol. 17 Issue 13, p3759-3776. 18p.
Publication Year :
2023

Abstract

The vision Transformer architecture performs better in salient object detection than convolutional neural network (CNN)‐based approaches. A vision Transformer predicts saliency by modelling long‐range dependencies from sequence to sequence without convolution. However, irrelevant contextual information makes it challenging to localize salient objects and recover structural details. A novel saliency sparse Transformer network is proposed that exploits sparse attention to guide saliency prediction. The convolution‐like operation in the token‐to‐token (T2T) module is replaced with a dilated variant to capture relationships over larger regions and improve contextual information fusion. An adaptive position bias module is designed so that the position bias of the vision Transformer suits variable‐sized RGB images. A saliency sparse Transformer module is designed to concentrate attention on the global context by selecting the Top‐k most relevant segments, further improving the detection results. In addition, a cross‐modality fusion module (CMF) exploits the complementary RGB and depth modalities, taking advantage of RGB image features and spatial depth information to enhance feature fusion. Extensive experiments on multiple benchmark datasets demonstrate the method's effectiveness and superiority, showing that it is comparable to state‐of‐the‐art RGB and RGB‐D saliency methods. [ABSTRACT FROM AUTHOR]
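For illustration only, the following is a minimal sketch of Top‐k sparse attention as described in the abstract (keeping only the k most relevant keys per query and masking the rest before the softmax). It is not the authors' implementation; the function name, tensor shapes, and the choice of top_k are assumptions.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=8):
    """Hypothetical Top-k sparse attention sketch.

    q, k, v: tensors of shape (batch, heads, seq_len, dim).
    Only the top_k highest-scoring keys per query contribute to the output.
    """
    # Scaled dot-product attention scores: (batch, heads, seq_len, seq_len)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)

    # Find the k-th largest score per query and mask everything below it.
    topk_vals, _ = scores.topk(top_k, dim=-1)
    threshold = topk_vals[..., -1:].expand_as(scores)
    scores = scores.masked_fill(scores < threshold, float('-inf'))

    # Softmax over the surviving (top-k) scores, then weight the values.
    attn = F.softmax(scores, dim=-1)
    return attn @ v
```

The masking step is what concentrates attention on the most relevant segments; entries set to negative infinity receive zero weight after the softmax, so the remaining probability mass is redistributed over the Top‐k keys.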

Details

Language :
English
ISSN :
1751-9659
Volume :
17
Issue :
13
Database :
Academic Search Index
Journal :
IET Image Processing (Wiley-Blackwell)
Publication Type :
Academic Journal
Accession number :
173439854
Full Text :
https://doi.org/10.1049/ipr2.12895