Incorporating semantic consistency for improved semi-supervised image captioning.

Authors :
Wu, Bicheng
Wo, Yan
Source :
Multimedia Tools & Applications; May 2024, Vol. 83 Issue 17, p52931-52955, 25p
Publication Year :
2024

Abstract

The high labor cost of building image captioning datasets limits the application scenarios of image captioning methods. Consequently, semi-supervised image captioning, which uses partially labeled datasets together with a large amount of unlabeled data, has gained widespread attention in recent years. The key issue in current semi-supervised image captioning research is how to obtain pseudo-labels that match unlabeled images well, providing valuable training samples for semi-supervised model training. To this end, we propose a semi-supervised image captioning method improved by incorporating semantic consistency (Semi-SC), which adopts both self-training and adversarial training for the Teacher and Student models. Semi-SC constructs a semantic consistency discriminator that evaluates data of the two modalities using global and local semantic similarity, which helps to filter high-quality paired pseudo-samples from the Teacher model to optimize the training of the Student model. To improve the semantic consistency between the generated captions and the original images, a semantic confidence loss is designed to inject important semantic information from the images into the generated captions with the global semantic content. Extensive experiments on the MSCOCO dataset and the Unlabeled-COCO dataset verify the effectiveness of Semi-SC, which shows significant advantages on the CIDEr and SPICE metrics, achieving 78.1% and 15.8%, respectively, in the Scarcely-paired COCO setting and outperforming other existing semi-supervised image captioning methods. [ABSTRACT FROM AUTHOR]
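
The filtering step described in the abstract (scoring image-caption pairs by global and local semantic similarity, then keeping only the high-scoring pseudo-samples) can be illustrated with a minimal sketch. This is not the authors' discriminator, which is a trained model whose details the abstract does not give. Everything here is an assumption made for illustration: the function names, the use of cosine similarity, the word-to-best-region matching for the local score, the mixing weight `alpha`, and the `threshold` value.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def consistency_score(img_global, cap_global, img_regions, cap_words, alpha=0.5):
    """Hypothetical score mixing global and local similarity.

    Global: similarity of whole-image and whole-caption embeddings.
    Local: each word embedding is matched to its most similar region
    embedding, and the best matches are averaged (an assumed scheme).
    """
    global_sim = cosine(img_global, cap_global)
    if cap_words and img_regions:
        local_sim = sum(
            max(cosine(word, region) for region in img_regions)
            for word in cap_words
        ) / len(cap_words)
    else:
        local_sim = 0.0
    return alpha * global_sim + (1 - alpha) * local_sim


def filter_pseudo_pairs(candidates, threshold=0.7):
    """Keep only Teacher-generated pairs whose score reaches the threshold."""
    return [
        c for c in candidates
        if consistency_score(**c["features"]) >= threshold
    ]


# Toy embeddings, invented purely to exercise the functions.
candidates = [
    {"id": "well_matched", "features": {
        "img_global": [1.0, 0.0], "cap_global": [1.0, 0.0],
        "img_regions": [[1.0, 0.0], [0.0, 1.0]],
        "cap_words": [[1.0, 0.0], [0.0, 1.0]]}},
    {"id": "mismatched", "features": {
        "img_global": [1.0, 0.0], "cap_global": [0.0, 1.0],
        "img_regions": [[1.0, 0.0]],
        "cap_words": [[0.0, 1.0]]}},
]
kept = filter_pseudo_pairs(candidates)
print([c["id"] for c in kept])
```

In a self-training loop of the kind the abstract outlines, the pairs that survive such a filter would be the ones passed to the Student model as additional training samples. A fixed threshold is the simplest possible selection rule and is used here only to keep the sketch short.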

Details

Language :
English
ISSN :
13807501
Volume :
83
Issue :
17
Database :
Complementary Index
Journal :
Multimedia Tools & Applications
Publication Type :
Academic Journal
Accession number :
177251269
Full Text :
https://doi.org/10.1007/s11042-023-17577-y