Multiple depth-levels features fusion enhanced network for action recognition.

Authors :
Wang, Shengquan
Kong, Jun
Jiang, Min
Liu, Tianshan
Source :
Journal of Visual Communication & Image Representation. Nov2020, Vol. 73, pN.PAG-N.PAG. 1p.
Publication Year :
2020

Abstract

As a challenging video classification task, action recognition has become a significant topic in the computer vision community. The most popular methods to date are based on the two-stream architecture and still simply fuse the prediction scores of each stream. As a result, the complementary characteristics of the two streams cannot be fully utilized, and the effect of shallower features is often overlooked. In addition, treating all features equally may weaken the role of the features that contribute most to classification. Accordingly, a novel network called Multiple Depth-levels Features Fusion Enhanced Network (MDFFEN) is proposed. It improves on two aspects of the two-stream architecture. For the two-stream interaction mechanism, multiple depth-levels features fusion (MDFF) is formed to aggregate the spatial–temporal features that are extracted from several sub-modules of the original two streams by spatial–temporal features fusion (STFF). To further refine the spatiotemporal features, we propose a group-wise spatial-channel enhance (GSCE) module that automatically highlights meaningful regions and expressive channels through priority assignment. MDFFEN achieves competitive results on three challenging public action recognition datasets: HMDB51, UCF101 and ChaLearn LAP IsoGD.

• The proposed Two-Stream Features Fusion extracts diverse hybrid features.
• The fusion of three types of features learns more discriminative information.
• The Group-wise Spatial-Channel Enhance mechanism reduces interference information.
• The proposed network achieves competitive performance.

[ABSTRACT FROM AUTHOR]
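For context, the baseline that the abstract criticizes (fusing only the prediction scores of the two streams) can be sketched as below. This is a minimal illustration of generic late score fusion, not the authors' MDFFEN or any part of it. The function names, the `temporal_weight` parameter, and the example logits are all assumptions introduced here for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_score_fusion(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Fuse two streams at the prediction-score level only.

    Hypothetical helper: each stream is assumed to output one logit per
    action class. The streams never exchange intermediate features, which
    is the limitation the abstract points out.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")

    spatial_probs = softmax(spatial_logits)
    temporal_probs = softmax(temporal_logits)

    # Weighted average of the per-class probabilities of the two streams.
    return [
        (1.0 - temporal_weight) * s + temporal_weight * t
        for s, t in zip(spatial_probs, temporal_probs)
    ]


def predict(fused_probs):
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused_probs)), key=fused_probs.__getitem__)


# Made-up logits for three action classes, for illustration only.
spatial = [2.0, 1.0, 0.0]    # appearance (RGB) stream
temporal = [0.0, 3.0, 0.0]   # motion (optical-flow) stream
fused = late_score_fusion(spatial, temporal)
label = predict(fused)
```

Because the two streams interact only through this final average, intermediate and shallower features play no part in the decision. According to the abstract, that is the gap MDFF and GSCE aim to address.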

Details

Language :
English
ISSN :
1047-3203
Volume :
73
Database :
Academic Search Index
Journal :
Journal of Visual Communication & Image Representation
Publication Type :
Academic Journal
Accession number :
147318736
Full Text :
https://doi.org/10.1016/j.jvcir.2020.102929