Entity-Graph Enhanced Cross-Modal Pretraining for Instance-Level Product Retrieval

Authors :
Dong, Xiao
Zhan, Xunlin
Wei, Yunchao
Wei, Xiaoyong
Wang, Yaowei
Lu, Minlong
Cao, Xiaochun
Liang, Xiaodan
Source :
IEEE Transactions on Pattern Analysis and Machine Intelligence; November 2023, Vol. 45 Issue: 11 p13117-13133, 17p
Publication Year :
2023

Abstract

Our goal in this research is to study a more realistic setting for weakly-supervised, multi-modal, instance-level product retrieval across fine-grained product categories. We first contribute the Product1M datasets and define two practical instance-level retrieval tasks that enable evaluation on price comparison and personalized recommendation. For both tasks, accurately identifying the intended product target mentioned in visual-linguistic data and mitigating the impact of irrelevant content are quite challenging. To address this, we devise a more effective cross-modal pretraining model capable of adaptively incorporating key concept information from multi-modal data. This is accomplished by utilizing an entity graph, in which nodes represent entities and edges denote the similarity relations between them. Specifically, a novel Entity-Graph Enhanced Cross-Modal Pretraining (EGE-CMP) model is proposed for instance-level commodity retrieval, which explicitly injects entity knowledge into the multi-modal networks in both node-based and subgraph-based ways via a self-supervised hybrid-stream transformer. This can reduce the confusion between different object contents, thereby effectively guiding the network to focus on entities with real semantics. Experimental results verify the efficacy and generalizability of EGE-CMP, which outperforms several SOTA cross-modal baselines such as CLIP (Radford et al., 2021), UNITER (Chen et al., 2020), and CAPTURE (Zhan et al., 2021).
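
The entity graph mentioned in the abstract (nodes as entities, edges as similarity relations) can be illustrated with a minimal sketch. This is not the authors' construction, which the record does not describe: the function names `cosine` and `build_entity_graph`, the toy embedding vectors, the use of cosine similarity, and the 0.8 threshold are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_entity_graph(embeddings, threshold=0.8):
    """Build an undirected graph as a dict of dicts.

    Nodes are entity names; an edge links two entities whose embedding
    similarity meets the (hypothetical) threshold, weighted by that similarity.
    """
    graph = {name: {} for name in embeddings}
    for a, b in combinations(embeddings, 2):
        sim = cosine(embeddings[a], embeddings[b])
        if sim >= threshold:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


# Toy, made-up embeddings; real entity features would come from a trained encoder.
toy_embeddings = {
    "lipstick": [0.9, 0.1, 0.0],
    "lip gloss": [0.8, 0.2, 0.1],
    "shampoo": [0.0, 0.1, 0.9],
}

graph = build_entity_graph(toy_embeddings)
for entity, neighbours in graph.items():
    print(entity, "->", sorted(neighbours))
```

In this toy example the two lip products end up connected while the unrelated entity stays isolated, which is the kind of structure a node-based or subgraph-based injection step could then draw on.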

Details

Language :
English
ISSN :
01628828
Volume :
45
Issue :
11
Database :
Supplemental Index
Journal :
IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Type :
Periodical
Accession number :
ejs64146968
Full Text :
https://doi.org/10.1109/TPAMI.2023.3291237