Understanding the limits of 2D skeletons for action recognition
- Source :
- Multimedia Systems. 27:547-561
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
- Abstract :
- With the development of motion capture technologies, 3D action recognition has become a popular task with wide applicability in areas such as augmented reality, human–computer interaction, sports, and healthcare. However, acquiring 3D human skeleton data is expensive and time-consuming, mainly because of the high cost of capture technologies and the lack of suitable actors. We overcome these issues by focusing on the 2D skeleton modality, which can be easily extracted from ordinary videos. The objective of this work is to demonstrate the high descriptive power of the 2D skeleton modality by achieving accuracy on daily action recognition that is competitive with 3D skeleton data. More importantly, we thoroughly analyze the factors that significantly influence 2D recognition accuracy, such as sensitivity to data normalization, scaling, quantization, and 3D-to-2D distortions in skeleton orientations and sizes, which are caused by the loss of the depth dimension and a fixed-angle camera view. We also provide insights on how to mitigate these problems and significantly increase recognition accuracy. The experimental evaluation is conducted on three datasets that differ in nature. We also report which types of actions are learned better from 2D skeletons and which from 3D skeletons. Throughout the experiments, a generic lightweight LSTM network is used, whose architecture can be easily tuned to achieve the desired trade-off between accuracy and efficiency. We show that the proposed approach not only achieves state-of-the-art results in 2D skeleton action recognition but is also highly competitive with the best-performing methods that classify 3D skeleton sequences or the visual content extracted from ordinary videos.
- Subjects :
- Computer Networks and Communications
Computer science
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
02 engineering and technology
Skeleton (category theory)
Machine learning
computer.software_genre
Motion capture
Computer graphics
Database normalization
0202 electrical engineering, electronic engineering, information engineering
Media Technology
medicine
Quantization (image processing)
Modality (human–computer interaction)
business.industry
020207 software engineering
Human skeleton
medicine.anatomical_structure
Hardware and Architecture
020201 artificial intelligence & image processing
Augmented reality
Artificial intelligence
business
computer
Software
Information Systems
Details
- ISSN :
- 1432-1882 and 0942-4962
- Volume :
- 27
- Database :
- OpenAIRE
- Journal :
- Multimedia Systems
- Accession number :
- edsair.doi...........09a469b255658b57b98f2242f1496b17
- Full Text :
- https://doi.org/10.1007/s00530-021-00754-0