Audio-visual saliency prediction for movie viewing in immersive environments: Dataset and benchmarks.
- Source :
- Journal of Visual Communication & Image Representation. Apr2024, Vol. 100, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
Abstract
- This paper presents an eye-tracking dataset of movie viewing in an immersive environment, containing 256 movie clips at 2K QHD resolution with corresponding movie genre labels from IMDb (Internet Movie Database). The dataset provides audio-visual cues for studying human visual attention when watching movies with a VR headset, with eye movements recorded using an integrated eye tracker. To provide benchmarks for saliency prediction for movie viewing in immersive environments, fifteen computational models are evaluated on the dataset, including a newly developed multi-stream audio-visual saliency prediction model based on deep neural networks, named MSAV. Detailed quantitative and qualitative comparisons and analyses are also provided. The dataset and benchmarks could help facilitate studies of visual saliency prediction for movie viewing in immersive environments. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 10473203
- Volume :
- 100
- Database :
- Academic Search Index
- Journal :
- Journal of Visual Communication & Image Representation
- Publication Type :
- Academic Journal
- Accession number :
- 176784533
- Full Text :
- https://doi.org/10.1016/j.jvcir.2024.104095