Sparse Attention Module for optimizing semantic segmentation performance combined with a multi-task feature extraction network.

Authors :: Jiang, Min
Zhai, Fuhao
Kong, Jun
Source :: Visual Computer. Jul 2022, Vol. 38, Issue 7, pp. 2473-2488. 16 pp.
Publication Year :: 2022
Abstract: In semantic segmentation, researchers often use a self-attention module to capture long-range contextual information. These methods are often effective, but self-attention consumes a large amount of computing resources. Reducing that consumption while preserving performance is therefore a meaningful research topic. In this paper, we propose a Sparse Attention Model combined with a powerful multi-task feature extraction network for semantic segmentation. Unlike the classic self-attention model, our Sparse Attention Model does not calculate the inner product between all pairs of vectors. Instead, we first sparsify the Query and Key feature blocks defined in the self-attention module using the credit matrix generated from the pre-output. Then, we perform similarity modeling on the two sparse feature blocks. Meanwhile, to ensure that the vectors in Query can capture dense contextual information, we design a Class Attention Module and embed it into the Sparse Attention Module. Compared with the Dual Attention Network for scene segmentation, our attention module greatly reduces the consumption of computing resources while maintaining accuracy. Furthermore, downsampling during feature extraction causes a serious loss of detailed information and degrades the segmentation performance of the network, so we adopt a multi-task feature extraction network. It learns semantic features and edge features in parallel, and we feed the learned edge features into the deep layers of the network to help restore detailed information for capturing high-quality semantic features. Rather than using pure concatenation, we extract the edge features related to each channel by element-wise multiplication before concatenation. Finally, we conduct experiments on three datasets, Cityscapes, PASCAL VOC2012 and ADE20K, and obtain competitive results. [ABSTRACT FROM AUTHOR]
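
The sparsification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the credit matrix is computed or applied, so the sketch assumes a hypothetical per-position credit score and keeps only the top-k positions of both Query and Key before computing inner products. The names `sparse_attention`, `top_k_indices`, `credit`, and `k` are all illustrative assumptions, and the Class Attention Module is not modeled.

```python
import math


def top_k_indices(credit, k):
    """Return the indices of the k largest credit scores, in ascending order."""
    ranked = sorted(range(len(credit)), key=lambda i: credit[i], reverse=True)
    return sorted(ranked[:k])


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparse_attention(query, key, value, credit, k):
    """Attend only among the k positions with the highest credit.

    Assumption (not from the paper): the same top-k rule selects both the
    kept Query rows and the kept Key rows.

    Returns (outputs, n_products), where outputs maps each kept query index
    to its attended vector and n_products counts the inner products computed.
    """
    keep = top_k_indices(credit, k)
    dim = len(value[0])
    outputs = {}
    n_products = 0
    for i in keep:
        # Similarity is modeled only between the kept Query and Key vectors.
        scores = [dot(query[i], key[j]) for j in keep]
        n_products += len(keep)
        weights = softmax(scores)
        outputs[i] = [
            sum(w * value[j][d] for w, j in zip(weights, keep))
            for d in range(dim)
        ]
    return outputs, n_products
```

With N positions, dense self-attention computes N * N inner products, while this sketch computes k * k. For example, N = 4 and k = 2 gives 4 products instead of 16.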

Subjects :: *FEATURE extraction
*NETWORK performance
*MACHINE learning
*DEEP learning

Details

Language :: English
ISSN :: 01782789
Volume :: 38
Issue :: 7
Database :: Academic Search Index
Journal :: Visual Computer
Publication Type :: Academic Journal
Accession number :: 157319294
Full Text :: https://doi.org/10.1007/s00371-021-02124-3
