
Multi-View Fusion for Sign Language Recognition through Knowledge Transfer Learning

Authors :
Gao, Liqing
Zhu, Lei
Xue, Senhua
Wan, Liang
Li, Ping
Feng, Wei
Publication Year :
2022

Abstract

Word-level sign language recognition (WSLR), which aims to translate a sign video into a single word, is a fundamental task in visual sign language research. Existing WSLR methods rely on frontal-view hand images alone, which can hurt performance when the hands are occluded. Non-frontal views, by contrast, contain complementary information that can compensate for such occlusion and enhance the frontal view. Based on this observation, the paper presents an end-to-end Multi-View Knowledge Transfer (MVKT) network, which, to our knowledge, is the first sign language recognition work to learn visual features from multiple views simultaneously. The model consists of three components: 1) a 3D-ResNet backbone, which extracts view-common and view-specific representations; 2) a Knowledge Transfer module, which exchanges complementary information between views; and 3) a View Fusion module, which aggregates discriminative representations to obtain global clues. In addition, we construct a Multi-View Sign Language (MVSL) dataset, which contains 10,500 sign language videos collected synchronously from multiple views, with clear annotations and high video quality. Extensive experiments on the MVSL dataset show that the MVKT model trained with multiple views achieves significant improvement when tested with either multiple or single views, which makes it feasible and effective in real-world applications. © 2022 ACM.
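The abstract does not include code, but the three-component layout it describes (a shared 3D backbone, cross-view knowledge transfer, and view fusion) can be illustrated with a minimal PyTorch sketch for two synchronized views. Everything below is an illustrative assumption, not the authors' implementation: the module names, the tiny 3D convolutional stand-in for the 3D-ResNet backbone, the attention-based transfer mechanism, and all tensor shapes.

```python
# Minimal sketch of the MVKT-style layout described in the abstract.
# Assumptions: two synchronized views, a toy 3D conv backbone in place
# of 3D-ResNet, and cross-view attention as the transfer mechanism.
import torch
import torch.nn as nn

class TinyBackbone3D(nn.Module):
    """Stand-in for the 3D-ResNet backbone: maps a video clip
    (B, C, T, H, W) to a feature vector (B, D)."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
    def forward(self, x):
        return self.net(x).flatten(1)

class KnowledgeTransfer(nn.Module):
    """Exchanges complementary information between the two views;
    here each view attends to the other via self-attention over a
    two-token sequence, with a residual connection."""
    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
    def forward(self, f_front, f_side):
        tokens = torch.stack([f_front, f_side], dim=1)  # (B, 2, D)
        out, _ = self.attn(tokens, tokens, tokens)
        out = tokens + out
        return out[:, 0], out[:, 1]

class MVKTSketch(nn.Module):
    def __init__(self, num_words=500, dim=256):
        super().__init__()
        self.backbone = TinyBackbone3D(dim)    # shared across views
        self.transfer = KnowledgeTransfer(dim)
        self.fusion = nn.Linear(2 * dim, dim)  # view-fusion step
        self.classifier = nn.Linear(dim, num_words)
    def forward(self, clip_front, clip_side):
        f = self.backbone(clip_front)
        s = self.backbone(clip_side)
        f, s = self.transfer(f, s)
        fused = self.fusion(torch.cat([f, s], dim=1))
        return self.classifier(fused)  # per-word logits

model = MVKTSketch()
front = torch.randn(2, 3, 16, 112, 112)  # (batch, RGB, frames, H, W)
side = torch.randn(2, 3, 16, 112, 112)
logits = model(front, side)
print(logits.shape)  # torch.Size([2, 500])
```

For single-view testing, as the abstract describes, one could feed the available view into both inputs or bypass the transfer step; how the paper handles this is not specified in the abstract.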

Details

Database :
OAIster
Notes :
English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1383747198
Document Type :
Electronic Resource