Back to Search Start Over

Two-stream spatiotemporal feature fusion for human action recognition.

Authors :
Abdelbaky, Amany
Aly, Saleh
Source :
Visual Computer. Jul2021, Vol. 37 Issue 7, p1821-1835. 15p.
Publication Year :
2021

Abstract

Human action recognition is still a challenging topic in the computer vision field that has attracted a large number of researchers. It has a significant importance in varieties of applications such as intelligent video surveillance, sports analysis, and human–computer interaction. Recent works attempt to exploit the progress in deep learning architecture to learn spatial and temporal features from action video. However, it remains unclear how to combine spatial and temporal information with convolutional neural network. In this paper, we propose a novel human action recognition method by fusing spatial and temporal features learned from a simple unsupervised convolutional neural network called principal component analysis network (PCANet) in combination with bag-of-features (BoF) and vector of locally aggregated descriptors (VLAD) encoding schemes. Firstly, both spatial and temporal features are learned via PCANet using a subset of frames and temporal templates for each video, while their dimensionality is reduced using whitening transformation (WT). The temporal templates are calculated using short-time motion energy images (ST-MEI) based on frame differencing. Then, the encoding scheme is applied to represent the final dual spatiotemporal PCANet features by feature fusion. Finally, the support vector machine (SVM) classifier is exploited for action recognition. Extensive experiments have been performed on two popular datasets, namely KTH and UCF sports, to evaluate the performance of proposed method. Our experimental results using leave-one-out evaluation strategy demonstrate that the proposed method presents satisfactory and comparable results on both datasets. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
01782789
Volume :
37
Issue :
7
Database :
Academic Search Index
Journal :
Visual Computer
Publication Type :
Academic Journal
Accession number :
151084646
Full Text :
https://doi.org/10.1007/s00371-020-01940-3