A Hierarchical Spatio-Temporal Model for Human Activity Recognition.

Authors :: Xu, Wanru; Miao, Zhenjiang; Zhang, Xiao-Ping; Tian, Yi
Source :: IEEE Transactions on Multimedia; Jul 2017, Vol. 19 Issue 7, p1494-1509, 16p
Publication Year :: 2017
Abstract: There are two key issues in human activity recognition: spatial dependencies and temporal dependencies. Most recent methods focus on only one of them, and thus do not have sufficient descriptive power to recognize complex activity. In this paper, we propose a hierarchical spatio-temporal model (HSTM) to solve the problem by modeling spatial and temporal constraints simultaneously. The new HSTM is a two-layer hidden conditional random field (HCRF), where the bottom-layer HCRF aims at describing spatial relations in each frame and learning more discriminative representations, and the top-layer HCRF utilizes these high-level features to characterize temporal relations in the whole video sequence. The new HSTM takes advantage of the bottom layer as the building blocks for the top layer and it aggregates evidence from local to global level. A novel learning algorithm is derived to train all model parameters efficiently and its effectiveness is validated theoretically. Experimental results show that the HSTM can successfully classify human activities with higher accuracies on single-person actions (UCF) than other existing methods. More importantly, the HSTM also achieves superior performance on more practical interactions, including human–human interactional activities (UT-Interaction, BIT-Interaction, and CASIA) and human–object interactional activities (Gupta video dataset). [ABSTRACT FROM PUBLISHER]