
Visual preferences prediction for a photo gallery based on image captioning methods

Authors :
A.S. Kharchevnikova
A.V. Savchenko
Source :
Компьютерная оптика, Vol 44, Iss 4, Pp 618-626 (2020)
Publication Year :
2020
Publisher :
Samara National Research University, 2020.

Abstract

The paper considers the problem of extracting user preferences from a photo gallery. We propose a novel approach based on image captioning, i.e., automatic generation of textual descriptions of photos, followed by classification of those descriptions. Known image captioning methods based on convolutional and recurrent (long short-term memory, LSTM) neural networks are analyzed. We train several models that combine the visual features of a photograph with the outputs of an LSTM block, using Google's Conceptual Captions dataset. We examine the application of natural language processing algorithms to transform the obtained textual annotations into user preferences. Experimental studies are carried out using Microsoft COCO Captions, Flickr8k and a specially collected dataset reflecting the user's interests. It is demonstrated that the best quality of preference prediction is achieved using keyword search methods and text summarization from the Watson API, which are 8% more accurate than traditional latent Dirichlet allocation. Moreover, descriptions generated by the trained neural models are classified 1-7% more accurately than those produced by known image captioning models.
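
The step of turning captions into preferences by keyword search can be illustrated with a minimal sketch. It assumes the captions have already been generated by some captioning model; the category names, keyword lists and function names below are hypothetical and are not taken from the paper, and the sketch does not reproduce the authors' classifier or the Watson API.

```python
from collections import Counter

# Hypothetical category -> keyword map (an assumption, not from the paper).
CATEGORY_KEYWORDS = {
    "sports": {"ball", "tennis", "skateboard", "surfboard", "soccer"},
    "food": {"pizza", "plate", "sandwich", "cake", "food"},
    "pets": {"dog", "cat", "puppy", "kitten"},
}


def caption_categories(caption):
    """Return the set of categories whose keywords occur in one caption."""
    tokens = {word.strip(".,!?").lower() for word in caption.split()}
    return {cat for cat, kws in CATEGORY_KEYWORDS.items() if tokens & kws}


def gallery_preferences(captions):
    """Count how many captions mention each category.

    The result is sorted by descending count, then by category name,
    so that ties are ordered deterministically.
    """
    counts = Counter()
    for caption in captions:
        counts.update(caption_categories(caption))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Example captions standing in for the output of a captioning model.
captions = [
    "A dog catching a ball in a park.",
    "A plate of pizza on a table.",
    "A cat sleeping on a couch.",
    "A man riding a skateboard.",
]
prefs = gallery_preferences(captions)
```

Counting each category at most once per caption keeps a single keyword-heavy photo from dominating the gallery-level ranking; a real system would likely need stemming and a much larger vocabulary.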

Details

Language :
English, Russian
ISSN :
2412-6179 and 0134-2452
Volume :
44
Issue :
4
Database :
Directory of Open Access Journals
Journal :
Компьютерная оптика
Publication Type :
Academic Journal
Accession number :
edsdoj.2815784d4aec4c5eab702102229e5541
Document Type :
article
Full Text :
https://doi.org/10.18287/2412-6179-CO-678