Swin-Fusion: Swin-Transformer with Feature Fusion for Human Action Recognition
- Author
- Chen, Tiansheng and Mo, Lingfei
- Subjects
Convolutional neural networks, Feature extraction, Recognition (psychology), Transformer models, Image recognition (computer vision), Human activity recognition, Computer vision
- Abstract
Human action recognition based on still images is one of the most challenging computer vision tasks. Over the past decade, convolutional neural networks (CNNs) have developed rapidly and achieved good performance on still-image-based human action recognition. However, because CNNs lack long-range perception, it is difficult for them to form a global structural understanding of human behavior and of the overall relationship between the behavior and its environment. Recently, transformer-based models have gained prominence in computer vision, reaching state-of-the-art results in several vision tasks. We explore the transformer's capability for still-image-based human action recognition and add a simple but effective feature fusion module to the Swin-Transformer model. More specifically, we propose a new transformer-based model for behavioral feature extraction that uses a pre-trained Swin-Transformer as the backbone network. Swin-Transformer's distinctive hierarchical structure, combined with the feature fusion module, is used to extract and fuse multi-scale behavioral information. Extensive experiments were conducted on five still-image-based human action recognition datasets: Li's action dataset, the Stanford-40 dataset, the PPMI-24 dataset, the AUC-V1 dataset, and the AUC-V2 dataset. The results indicate that the proposed Swin-Fusion model achieves better behavior recognition than previously improved CNN-based models by sharing and reusing feature maps of different scales across multiple stages, without modifying the original backbone training method and while increasing training resources by only 1.6%. The code and models will be available at https://github.com/cts4444/Swin-Fusion.
- Published
- 2023
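
The abstract describes attaching a feature fusion module to the hierarchical, multi-scale feature maps of a pre-trained Swin-Transformer backbone. The sketch below (in PyTorch, the framework implied by the linked repository) illustrates one plausible reading of that idea: per-stage feature maps are projected to a common width, upsampled to a shared resolution, fused, and classified. The stage channel widths, the fusion by summation, and the classification head are illustrative assumptions, not the authors' actual design; see the linked code for the real implementation.

```python
# Minimal sketch of multi-scale feature fusion over a Swin-style backbone.
# Assumptions (not from the paper): fusion by element-wise summation after
# 1x1 projection, Swin-Base-like stage widths, and a simple linear head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleFusionHead(nn.Module):
    """Fuses per-stage feature maps (N, C_i, H_i, W_i) into class logits."""

    def __init__(self, stage_channels, fused_dim=256, num_classes=40):
        super().__init__()
        # 1x1 convolutions bring every stage to the same channel width.
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, fused_dim, kernel_size=1) for c in stage_channels]
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(fused_dim, num_classes)

    def forward(self, feats):
        # Upsample coarser stages to the finest resolution and sum them.
        target = feats[0].shape[-2:]
        fused = sum(
            F.interpolate(p(f), size=target, mode="bilinear", align_corners=False)
            for p, f in zip(self.proj, feats)
        )
        return self.head(self.pool(fused).flatten(1))


if __name__ == "__main__":
    # Dummy feature maps with Swin-Base-like widths/strides for a 224x224 input,
    # standing in for the backbone's four hierarchical stages.
    feats = [
        torch.randn(1, 128, 56, 56),
        torch.randn(1, 256, 28, 28),
        torch.randn(1, 512, 14, 14),
        torch.randn(1, 1024, 7, 7),
    ]
    head = MultiScaleFusionHead([128, 256, 512, 1024], num_classes=40)
    print(head(feats).shape)  # torch.Size([1, 40]), e.g. Stanford-40 classes
```

Summation after projection keeps the added parameter count small, which is consistent with the abstract's claim that the fusion module increases training resources by only about 1.6%; the actual fusion operator used by the authors may differ.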