Deep Pairwise Ranking with Multi-label Information for Cross-Modal Retrieval

Authors :: Jia Zhu
Yang Cao
Jing Xiao
Jian Yangwo
Asad Khan
Source :: ICME
Publication Year :: 2019
Publisher :: IEEE, 2019.
Abstract: Cross-modal retrieval, such as image-to-text or text-to-image retrieval, has gained much attention in recent years due to the growing demand for handling enormous multi-modal data. To alleviate the problem that irrelevant information between images and texts is often ignored, this paper proposes a Deep Pairwise Ranking model with multi-label information for Cross-Modal retrieval (DPRCM). DPRCM directly learns a mapping from images and texts to a compact Euclidean space in which distances correspond to the similarity between images and texts. The bi-triplet loss function in DPRCM reduces the distance between associated images and texts in the common subspace and enlarges the margin separating unrelated samples. The classification loss function makes better use of the multi-label information to reduce the semantic gap between image features and text descriptions. Experiments on three widely used datasets show that DPRCM achieves competitive performance compared with state-of-the-art methods.
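
The two loss terms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the hinge form of the bidirectional triplet loss, the squared Euclidean distance, the margin value, the assumption that image i and text i form the associated pair, and the function names (`bi_triplet_loss`, `multilabel_bce`) are all assumptions made for illustration.

```python
import numpy as np

def squared_distances(a, b):
    """Pairwise squared Euclidean distances; result[i, j] = ||a[i] - b[j]||^2."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def bi_triplet_loss(img_emb, txt_emb, margin=0.2):
    """Hypothetical bidirectional triplet loss over a batch.

    Assumes img_emb[i] and txt_emb[i] are an associated pair and every
    other combination in the batch is treated as unrelated.
    """
    d = squared_distances(img_emb, txt_emb)   # d[i, j]: image i vs. text j
    pos = np.diag(d)                          # distances of associated pairs
    off_diag = ~np.eye(d.shape[0], dtype=bool)
    # Image as anchor: associated text should be closer than unrelated texts.
    img_to_txt = np.maximum(0.0, margin + pos[:, None] - d)
    # Text as anchor: associated image should be closer than unrelated images.
    txt_to_img = np.maximum(0.0, margin + pos[None, :] - d)
    total = img_to_txt[off_diag].sum() + txt_to_img[off_diag].sum()
    return total / d.shape[0]

def multilabel_bce(logits, labels):
    """Mean sigmoid cross-entropy over all labels (numerically stable form).

    A common choice for multi-label classification; assumed here, since the
    abstract does not specify the classification loss.
    """
    per_label = np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    return per_label.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(size=(4, 8))          # stand-in image embeddings
    texts = rng.normal(size=(4, 8))           # stand-in text embeddings
    labels = rng.integers(0, 2, size=(4, 5))  # stand-in multi-hot labels
    logits = rng.normal(size=(4, 5))
    print(bi_triplet_loss(images, texts), multilabel_bce(logits, labels))
```

In a sketch like this, the two terms would typically be combined as a weighted sum and minimized jointly; the weighting is likewise not stated in the abstract.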
