FactorNet: Holistic Actor, Object, and Scene Factorization for Action Recognition in Videos.

Authors :
Nigam, Nitika
Dutta, Tanima
Gupta, Hari Prabhat
Source :
IEEE Transactions on Circuits & Systems for Video Technology. Mar 2022, Vol. 32 Issue 3, p976-991. 16p.
Publication Year :
2022

Abstract

Recognizing human actions in a video is challenging due to the complex nature of video data and the subtlety of human actions. Human activities are often associated with surrounding objects and occur in specific scene contexts. Existing action recognition systems cannot separate human actions from representation biases, such as co-occurring objects and the underlying scene, which often dominate subtle human actions. In this paper, we address the factorization of human actions into the activity performed by the actor, the co-occurring objects, and the underlying context, to mitigate the influence of representation biases when they are irrelevant to the action under consideration. We propose a deep neural network architecture, denoted FactorNet, for efficient action recognition in videos of long temporal duration. We design an attention mechanism that separates an actor from the associated objects and co-occurring scene, and then captures long-range temporal context. We perform a comprehensive set of experiments on six benchmark datasets to show the efficacy of our architecture. Models trained on recent video-based action datasets capture and leverage such bias, and the resulting supervised representation may not generalize well to new action classes. We therefore design a new dataset, known as FactNet, which consists of activity-object-scene related actions that occur in day-to-day applications. Dataset Link: FactNet. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1051-8215
Volume :
32
Issue :
3
Database :
Academic Search Index
Journal :
IEEE Transactions on Circuits & Systems for Video Technology
Publication Type :
Academic Journal
Accession number :
155753967
Full Text :
https://doi.org/10.1109/TCSVT.2021.3070688