Exploiting 3D Hand Pose Estimation in Deep Learning-Based Sign Language Recognition from RGB Videos

Authors :
Gerasimos Potamianos
Georgios Pavlakos
Petros Maragos
Maria Parelli
Katerina Papadimitriou
Source :
Computer Vision – ECCV 2020 Workshops ISBN: 9783030660956, ECCV Workshops (2)
Publication Year :
2020
Publisher :
Springer International Publishing, 2020.

Abstract

In this paper, we investigate the benefit of 3D hand skeletal information for sign language (SL) recognition from RGB videos, within a state-of-the-art, multiple-stream, deep-learning recognition system. As most SL datasets are available only as traditional RGB video lacking depth information, we propose to infer the 3D coordinates of the hand joints from RGB data via a powerful architecture that was introduced in the literature primarily for the task of 3D human pose estimation. We then fuse these estimates with additional SL-informative streams, namely 2D skeletal data, as well as convolutional neural network-based hand- and mouth-region representations, and employ an attention-based encoder-decoder for recognition. We evaluate our proposed approach on a corpus of isolated signs of Greek SL and a dataset of continuous finger-spelling in American SL, reporting significant gains from the inclusion of 3D hand pose information, while also outperforming the state-of-the-art on both databases. Further, we evaluate the 3D hand pose estimation technique on its own.
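
The abstract does not say how the streams are fused. As a rough illustration only, the sketch below assumes the simplest option, per-frame concatenation of the feature vectors from each stream before they reach the recognizer. The function name `fuse_streams`, the stream names, and all feature sizes are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Frame = List[float]  # one feature vector for one video frame


def fuse_streams(streams: Dict[str, List[Frame]]) -> List[Frame]:
    """Concatenate per-frame feature vectors from several streams.

    Assumption: feature-level fusion by concatenation. The paper may
    use a different fusion scheme.
    """
    if not streams:
        raise ValueError("at least one stream is required")

    # Every stream must cover the same number of video frames.
    lengths = {name: len(frames) for name, frames in streams.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"streams differ in frame count: {lengths}")

    num_frames = next(iter(lengths.values()))
    fused: List[Frame] = []
    for t in range(num_frames):
        frame: Frame = []
        for name in streams:  # dicts keep insertion order
            frame.extend(streams[name][t])
        fused.append(frame)
    return fused


# Toy input: 4 frames, with made-up feature sizes for each stream.
num_frames = 4
streams = {
    "hand_3d": [[0.0] * 63 for _ in range(num_frames)],      # e.g. 21 joints x 3 coordinates (assumed)
    "skeleton_2d": [[0.0] * 16 for _ in range(num_frames)],  # 2D skeletal data (size assumed)
    "hand_cnn": [[0.0] * 8 for _ in range(num_frames)],      # CNN hand-region features (size assumed)
    "mouth_cnn": [[0.0] * 8 for _ in range(num_frames)],     # CNN mouth-region features (size assumed)
}
fused = fuse_streams(streams)
print(len(fused), len(fused[0]))
```

Under this assumption, the fused sequence has one vector per frame whose length is the sum of the individual stream sizes, and it would serve as the input sequence to the attention-based encoder-decoder mentioned in the abstract.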

Details

ISBN :
978-3-030-66095-6
Database :
OpenAIRE
Accession number :
edsair.doi...........47b0a4f5f94c391e3071be6c943cfdb4
Full Text :
https://doi.org/10.1007/978-3-030-66096-3_18