
Action class relation detection and classification across multiple video datasets.

Authors :
Yoshikawa, Yuya
Shigeto, Yutaro
Shimbo, Masashi
Takeuchi, Akikazu
Source :
Pattern Recognition Letters. Sep 2023, Vol. 173, p. 93-100. 8p.
Publication Year :
2023

Abstract

The Meta Video Dataset (MetaVD) provides annotated relations between action classes in major datasets for human action recognition in videos. Although these annotated relations enable dataset augmentation, the augmentation is applicable only to the datasets covered by MetaVD. For an external dataset to enjoy the same benefit, the relations between its action classes and those in MetaVD must be determined. To address this issue, we consider two new machine learning tasks: action class relation detection and classification. We propose a unified model that predicts relations between action classes using the language and visual information associated with the classes. Experimental results show that (i) recent pre-trained neural network models for texts and videos contribute to high predictive performance, (ii) relation prediction based on action label texts is more accurate than prediction based on videos, and (iii) a blending approach that combines the predictions of both modalities can further improve predictive performance in some cases.

• Two proposed tasks aim to predict relations between action classes.
• The ground-truth relations are provided by the Meta Video Dataset (MetaVD).
• Recent pre-trained models in NLP and CV are useful for the tasks.
• Action label texts contribute to higher predictive performance than videos.
• Using both action label texts and videos can improve the performance. [ABSTRACT FROM AUTHOR]
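The abstract describes the blending approach only at a high level. The minimal sketch below illustrates one common form of such late fusion, a weighted average of per-class probabilities from a text-based and a video-based relation classifier; the weight alpha, the array shapes, and the three example relation types are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: simple late fusion ("blending") of relation
# predictions from a text-based and a video-based classifier.
# The models themselves are not shown; we assume each already produced
# softmax probabilities over relation classes for every action-class pair.
import numpy as np

def blend_predictions(p_text: np.ndarray, p_video: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """Combine per-relation-class probabilities from two modalities.

    p_text, p_video: arrays of shape (n_pairs, n_relation_classes),
    each row summing to 1 (softmax outputs of the respective model).
    alpha: weight given to the text modality (hypothetical value; the
    abstract only reports that text is the stronger modality).
    """
    blended = alpha * p_text + (1.0 - alpha) * p_video
    return blended / blended.sum(axis=1, keepdims=True)  # renormalize rows

# Toy usage: 2 action-class pairs, 3 hypothetical relation types
# (e.g., "equal", "is-a", "no relation").
p_text = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])
p_video = np.array([[0.5, 0.3, 0.2],
                    [0.2, 0.5, 0.3]])
print(blend_predictions(p_text, p_video).argmax(axis=1))  # predicted relation per pair
```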

Details

Language :
English
ISSN :
0167-8655
Volume :
173
Database :
Academic Search Index
Journal :
Pattern Recognition Letters
Publication Type :
Academic Journal
Accession number :
171311689
Full Text :
https://doi.org/10.1016/j.patrec.2023.08.002