Audio-visual saliency prediction for movie viewing in immersive environments: Dataset and benchmarks.

Authors :
Chen, Zhao
Zhang, Kao
Cai, Hao
Ding, Xiaoying
Jiang, Chenxi
Chen, Zhenzhong
Source :
Journal of Visual Communication & Image Representation. Apr2024, Vol. 100, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

In this paper, an eye-tracking dataset of movie viewing in an immersive environment is developed, which contains 256 movie clips at 2K QHD resolution with corresponding movie genre labels from IMDb (Internet Movie Database). The dataset provides audio-visual cues for studying human visual attention when watching movies with a VR headset, with eye movements recorded by an integrated eye tracker. To provide benchmarks for saliency prediction for movie viewing in immersive environments, fifteen computational models are evaluated on the dataset, including a newly developed multi-stream audio-visual saliency prediction model based on deep neural networks, named MSAV. Detailed quantitative and qualitative comparisons and analyses are also provided. The developed dataset and benchmarks could help facilitate studies of visual saliency prediction for movie viewing in immersive environments. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
10473203
Volume :
100
Database :
Academic Search Index
Journal :
Journal of Visual Communication & Image Representation
Publication Type :
Academic Journal
Accession number :
176784533
Full Text :
https://doi.org/10.1016/j.jvcir.2024.104095