Sports Video Captioning via Attentive Motion Representation and Group Relationship Modeling.

Authors :
Qi, Mengshi
Wang, Yunhong
Li, Annan
Luo, Jiebo
Source :
IEEE Transactions on Circuits & Systems for Video Technology. Aug2020, Vol. 30 Issue 8, p2617-2633. 17p.
Publication Year :
2020

Abstract

Sports video captioning is the task of automatically generating a textual description of sports events (football, basketball, or volleyball games). Although a great deal of previous work has shown promising performance in producing coarse, general descriptions of a video, such descriptions lack professional sports knowledge, and it remains quite challenging to caption a sports video with multiple fine-grained player actions and complex group relationships between players. In this paper, we present a novel hierarchical recurrent neural network-based framework with an attention mechanism for sports video captioning, in which a motion representation module is proposed to capture individual pose attributes and dynamic trajectory cluster information with extra professional sports knowledge, and a group relationship module is employed to construct a scene graph for modeling players’ interactions with a gated graph convolutional network. Moreover, we introduce a new dataset called sports video captioning dataset-volleyball for evaluation. The proposed model is evaluated on three widely adopted public datasets and our newly collected dataset, on which the effectiveness of our method is demonstrated. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1051-8215
Volume :
30
Issue :
8
Database :
Academic Search Index
Journal :
IEEE Transactions on Circuits & Systems for Video Technology
Publication Type :
Academic Journal
Accession number :
145130455
Full Text :
https://doi.org/10.1109/TCSVT.2019.2921655