
Cross-modality motion parameterization for fine-grained video prediction.

Authors :
Yan, Yichao
Ni, Bingbing
Zhang, Wendong
Tang, Jun
Yang, Xiaokang
Source :
Computer Vision & Image Understanding; Jun 2019, Vol. 183, p11-19, 9p
Publication Year :
2019

Abstract

While predicting video content is challenging given the huge unconstrained search space, this work explores cross-modality constraints to safeguard the video generation process and seek improved content prediction. Observing the underlying correspondence between sound and object movement, we propose a novel cross-modality video generation network. Via adversarial training, this network directly links sound with the movement parameters of the manipulated object and automatically outputs the corresponding object motion according to the rhythm of the given audio signal. We experiment on both rigid-object and non-rigid-object motion prediction tasks and show that, guided by the associated audio information, our method significantly reduces motion uncertainty in the generated video content.
• We propose to predict motion parameters to improve video prediction quality.
• We employ audio information as cross-modality guidance for motion parameter prediction.
• Our framework can be applied to both rigid-object and non-rigid-object motion prediction tasks. [ABSTRACT FROM AUTHOR]
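As a rough illustration of the adversarial audio-to-motion mapping the abstract describes, the sketch below pairs a generator that maps an audio feature sequence to per-frame motion parameters with a discriminator over motion sequences, trained with a standard GAN objective plus a reconstruction term. The module names, feature dimensions, and training loop are illustrative assumptions, not the authors' actual architecture.

```python
# Minimal PyTorch sketch (assumed setup, not the paper's implementation):
# a generator maps audio features to object motion parameters; a discriminator
# is trained adversarially to separate real motion from generated motion.
import torch
import torch.nn as nn

AUDIO_DIM, MOTION_DIM, HIDDEN = 128, 6, 256  # assumed feature sizes

class AudioToMotion(nn.Module):
    """Generator: per-frame audio features -> per-frame motion parameters."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(AUDIO_DIM, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, MOTION_DIM)

    def forward(self, audio):            # audio: (B, T, AUDIO_DIM)
        h, _ = self.rnn(audio)
        return self.head(h)              # motion: (B, T, MOTION_DIM)

class MotionDiscriminator(nn.Module):
    """Discriminator: scores a motion-parameter sequence as real or generated."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(MOTION_DIM, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, motion):           # motion: (B, T, MOTION_DIM)
        h, _ = self.rnn(motion)
        return self.head(h[:, -1])       # one real/fake logit per sequence

gen, disc = AudioToMotion(), MotionDiscriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

# One adversarial training step on dummy data (B=4 clips, T=32 frames).
audio = torch.randn(4, 32, AUDIO_DIM)         # stand-in audio features
real_motion = torch.randn(4, 32, MOTION_DIM)  # stand-in ground-truth motion

# Discriminator step: push real motion toward 1, generated motion toward 0.
fake_motion = gen(audio).detach()
d_loss = bce(disc(real_motion), torch.ones(4, 1)) + \
         bce(disc(fake_motion), torch.zeros(4, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator and match the real motion parameters.
fake_motion = gen(audio)
g_loss = bce(disc(fake_motion), torch.ones(4, 1)) + \
         nn.functional.l1_loss(fake_motion, real_motion)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Predicting low-dimensional motion parameters rather than raw pixels is what keeps the search space constrained; the predicted parameters would then drive the rendering of the video frames.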

Details

Language :
English
ISSN :
1077-3142
Volume :
183
Database :
Supplemental Index
Journal :
Computer Vision & Image Understanding
Publication Type :
Academic Journal
Accession number :
136416149
Full Text :
https://doi.org/10.1016/j.cviu.2019.03.006