Cross-Modal Attention With Semantic Consistence for Image–Text Matching.

Authors :
Xu, Xing
Wang, Tan
Yang, Yang
Zuo, Lin
Shen, Fumin
Shen, Heng Tao
Source :
IEEE Transactions on Neural Networks & Learning Systems; Dec2020, Vol. 31 Issue 12, p5412-5425, 14p
Publication Year :
2020

Abstract

The task of image–text matching refers to measuring the visual-semantic similarity between an image and a sentence. Recently, fine-grained matching methods that explore the local alignment between image regions and sentence words have shown advances in inferring image–text correspondence by aggregating pairwise region-word similarities. However, the local alignment is hard to achieve, as some important image regions may be inaccurately detected or even missing. Meanwhile, some words with high-level semantics cannot strictly correspond to a single image region. To tackle these problems, we emphasize the importance of exploiting the global semantic consistence between image regions and sentence words as a complement to the local alignment. In this article, we propose a novel hybrid matching approach named Cross-modal Attention with Semantic Consistency (CASC) for image–text matching. The proposed CASC is a joint framework that performs cross-modal attention for local alignment and multilabel prediction for global semantic consistence. It directly extracts semantic labels from the available sentence corpus without additional labor cost, which further provides a global similarity constraint for the aggregated region-word similarity obtained by the local alignment. Extensive experiments on the Flickr30k and Microsoft COCO (MSCOCO) data sets demonstrate the effectiveness of the proposed CASC in preserving global semantic consistence along with the local alignment, and further show its superior image–text matching performance compared with more than 15 state-of-the-art methods. [ABSTRACT FROM AUTHOR]
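
The local-alignment idea mentioned in the abstract, aggregating pairwise region-word similarity through cross-modal attention, can be illustrated with a minimal sketch. This is not the CASC implementation: the function names, the cosine similarity measure, the softmax `temperature`, and the mean aggregation are all assumptions chosen for illustration, and the multilabel global-consistence branch is not covered.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def softmax(scores, temperature=1.0):
    """Numerically stable softmax; a larger temperature sharpens the weights."""
    peak = max(scores)
    exps = [math.exp((s - peak) * temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attended_region(word, regions, temperature):
    """Attention-weighted mix of region vectors for one word (assumed scheme)."""
    weights = softmax([cosine(word, r) for r in regions], temperature)
    dim = len(regions[0])
    return [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]

def local_similarity(words, regions, temperature=9.0):
    """Mean over words of the similarity between each word and its attended region."""
    scores = [cosine(w, attended_region(w, regions, temperature)) for w in words]
    return sum(scores) / len(scores)

# Toy, hand-made 3-d features (hypothetical data, not from the paper).
regions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
matching_words = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
unrelated_words = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

matched_score = local_similarity(matching_words, regions)
unrelated_score = local_similarity(unrelated_words, regions)
```

In this toy setting, `matched_score` comes out higher than `unrelated_score`, which is the behavior a local-alignment score is expected to show. A sketch like this also makes the abstract's concern concrete: if a relevant region is missing from `regions`, no attention weighting can recover it, which is the gap a global constraint is meant to address.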

Details

Language :
English
ISSN :
2162237X
Volume :
31
Issue :
12
Database :
Complementary Index
Journal :
IEEE Transactions on Neural Networks & Learning Systems
Publication Type :
Periodical
Accession number :
147401158
Full Text :
https://doi.org/10.1109/TNNLS.2020.2967597