
Attention-guided image captioning with adaptive global and local feature fusion.

Authors :
Zhong, Xian
Nie, Guozhang
Huang, Wenxin
Liu, Wenxuan
Ma, Bo
Lin, Chia-Wen
Source :
Journal of Visual Communication & Image Representation. Jul 2021, Vol. 78.
Publication Year :
2021

Abstract

Highlights:
• A fusion mechanism for global features and local object-level features is proposed.
• The mechanism relies on the spatial information of the objects in the image.
• Global and local features are adaptively fused by attention.
• The fused features are decoded into captions during caption generation.

Although attention mechanisms are widely exploited in encoder-decoder neural-network-based image captioning frameworks, the relation between the selection of salient image regions and the supervision of spatial information on local and global representation learning has been overlooked, degrading captioning performance. Consequently, we propose an image captioning scheme based on adaptive spatial information attention (ASIA), which extracts a sequence of spatial information of salient objects in a local image region or in the entire image. Specifically, in the encoding stage, we extract the object-level visual features of salient objects together with their spatial bounding boxes, and we obtain the global feature maps of the entire image; the global and local features are fused, and the fused features are fed into the LSTM-based language decoder. In the decoding stage, our adaptive attention mechanism dynamically selects the image regions specified by the image description. Extensive experiments conducted on two datasets demonstrate the effectiveness of the proposed method.
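
To make the fusion idea concrete, the following is a minimal PyTorch sketch of one way attention can adaptively balance a global image feature against local object-level features when producing a context vector for an LSTM decoder step. The module name, feature dimensions, and sigmoid-gate formulation are illustrative assumptions, not the authors' ASIA implementation, which additionally uses the objects' bounding-box spatial information.

# Minimal sketch of attention-based fusion of a global image feature with
# local object-level features; all names, dimensions, and the gating scheme
# are assumptions for illustration, not the paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveGlobalLocalFusion(nn.Module):
    def __init__(self, feat_dim=2048, hidden_dim=512):
        super().__init__()
        self.proj_local = nn.Linear(feat_dim, hidden_dim)    # object-level features
        self.proj_global = nn.Linear(feat_dim, hidden_dim)   # whole-image feature
        self.proj_hidden = nn.Linear(hidden_dim, hidden_dim) # decoder state as query
        self.att_score = nn.Linear(hidden_dim, 1)
        self.gate = nn.Linear(hidden_dim * 2, 1)             # global/local balance

    def forward(self, local_feats, global_feat, dec_hidden):
        # local_feats: (B, N, feat_dim)  features of N detected salient objects
        # global_feat: (B, feat_dim)     feature of the entire image
        # dec_hidden:  (B, hidden_dim)   current LSTM decoder hidden state
        L = self.proj_local(local_feats)                         # (B, N, H)
        q = self.proj_hidden(dec_hidden).unsqueeze(1)            # (B, 1, H)
        scores = self.att_score(torch.tanh(L + q)).squeeze(-1)   # (B, N)
        alpha = F.softmax(scores, dim=-1)                        # attention over objects
        local_ctx = (alpha.unsqueeze(-1) * L).sum(dim=1)         # (B, H)
        g = self.proj_global(global_feat)                        # (B, H)
        beta = torch.sigmoid(self.gate(torch.cat([local_ctx, dec_hidden], dim=-1)))
        fused = beta * local_ctx + (1.0 - beta) * g              # adaptive fusion
        return fused, alpha

At each decoding step, such a fused context vector would be concatenated with the word embedding and fed to the LSTM, so the gate can shift between object-centric and scene-level evidence as the description progresses.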

Details

Language :
English
ISSN :
1047-3203
Volume :
78
Database :
Academic Search Index
Journal :
Journal of Visual Communication & Image Representation
Publication Type :
Academic Journal
Accession number :
151308213
Full Text :
https://doi.org/10.1016/j.jvcir.2021.103138