Foveated convolutional neural networks for video summarization
- Author
- Sheng-hua Zhong, Jianmin Jiang, Stephen J. Heinen, Zheng Ma, and Jiaxin Wu
- Subjects
- Computer Networks and Communications; Hardware and Architecture; Media Technology; Software; Computer Vision; Image Processing; Artificial Intelligence; Deep Learning; Convolutional Neural Network; Automatic Summarization; Eye Movement; Motion; Benchmark; Spatial Analysis
- Abstract
With the proliferation of video data, video summarization is an ideal tool for browsing video content rapidly. In this paper, we propose novel foveated convolutional neural networks for dynamic video summarization. We are the first to integrate gaze information into a deep learning network for video summarization. Foveated images are constructed from subjects' eye movements to represent the spatial information of the input video, and motion vectors are stacked across several adjacent frames to convey motion cues. To evaluate the proposed method, experiments are conducted on two video summarization benchmark datasets. The experimental results validate the effectiveness of gaze information for video summarization, even though the eye movements were collected from subjects different from those who generated the reference summaries. Empirical validations also demonstrate that the proposed foveated convolutional neural networks achieve state-of-the-art performance on these benchmark datasets.
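The spatial stream described above is built from foveated images: regions near the recorded gaze point stay sharp while the periphery is blurred, mimicking retinal acuity. Below is a minimal sketch of one plausible foveation operator; the Gaussian acuity falloff, the box-blur stand-in for peripheral blur, and all parameter values are illustrative assumptions, not the paper's exact retinal model.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_blur(img, k=9):
    """Separable-free k x k box blur; a crude stand-in for peripheral blur."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    # All k x k windows of the padded image, then average each window.
    win = sliding_window_view(p, (k, k))
    return win.mean(axis=(-1, -2))

def foveate(img, gaze, sigma=10.0, k=9):
    """Blend a sharp and a blurred copy of `img` (grayscale, 2-D array).

    The blend weight ("acuity") is 1 at the gaze point and decays with a
    Gaussian falloff of scale `sigma` pixels -- an assumed model, chosen
    only to illustrate the idea of gaze-contingent foveation.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (ys - gaze[0]) ** 2 + (xs - gaze[1]) ** 2
    acuity = np.exp(-d2 / (2 * sigma ** 2))  # 1 at fixation, ~0 in periphery
    return acuity * img + (1 - acuity) * box_blur(img, k)

# Demo: high-frequency horizontal stripes, fixation at the image center.
img = np.zeros((64, 64))
img[::2] = 1.0
fov = foveate(img, gaze=(32, 32))
```

At the fixation point the stripes survive untouched, while in the far periphery adjacent rows are averaged toward a mid gray, so their contrast is strongly reduced.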
- Published
- 2018