Collaborative multimodal feature learning for RGB-D action recognition.
- Author
- Kong, Jun, Liu, Tianshan, and Jiang, Min
- Subjects
- *HUMAN activity recognition, *MACHINE learning, *HUMAN behavior
- Abstract
Highlights:
• Our CMFL model jointly learns shared-specific features and action classifiers.
• The proposed RSTPF features extract dynamic local patterns around each human joint.
• The CMFL model enables the features to be optimized for classification.
• The CMFL performs well even if one or two modalities are missing in the testing stage.
• A max-margin framework is introduced to fuse skeleton, depth and RGB data.

Abstract: The emergence of cost-effective depth sensors opens up a new dimension for RGB-D based human action recognition. In this paper, we propose a collaborative multimodal feature learning (CMFL) model for human action recognition from RGB-D sequences. Specifically, we propose a robust spatio-temporal pyramid feature (RSTPF) to capture dynamic local patterns around each human joint. The proposed CMFL model fuses multimodal data (skeleton, depth and RGB) and learns action classifiers using the fused features. The original low-level feature matrices are factorized to learn shared features and modality-specific features in a supervised fashion. The shared features describe the common structures among the three modalities, while the modality-specific features capture the intrinsic information of each modality. We formulate shared-specific feature mining and action-classifier learning in a unified max-margin framework, and solve the formulation using an iterative optimization algorithm. Experimental results on four action datasets demonstrate the efficacy of the proposed method. [ABSTRACT FROM AUTHOR]
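The shared-specific factorization described in the abstract can be illustrated with a simplified squared-loss variant: each modality's feature matrix is approximated by a dictionary times a code split into a block shared across skeleton, depth and RGB and a block specific to that modality, updated by alternating least squares. This is only a minimal sketch under assumed dimensions and a plain reconstruction objective, not the paper's max-margin formulation with joint classifier learning:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical low-level feature matrices (features x samples) for the
# three modalities; sizes here are illustrative only.
n_samples = 50
X = {
    "skeleton": rng.standard_normal((40, n_samples)),
    "depth":    rng.standard_normal((60, n_samples)),
    "rgb":      rng.standard_normal((80, n_samples)),
}

k_shared, k_spec = 5, 3   # assumed latent dimensions
lam = 1e-3                # small ridge term for stable updates

# Model: X_m ≈ D_m @ vstack([S, P_m]); S is common to all modalities,
# P_m is specific to modality m.
S = rng.standard_normal((k_shared, n_samples))
P = {m: rng.standard_normal((k_spec, n_samples)) for m in X}
D = {m: rng.standard_normal((X[m].shape[0], k_shared + k_spec)) for m in X}

def recon_loss():
    return sum(np.linalg.norm(X[m] - D[m] @ np.vstack([S, P[m]]))**2
               for m in X)

loss0 = recon_loss()
for _ in range(50):
    # Update each dictionary D_m by ridge regression on the current codes.
    for m in X:
        Z = np.vstack([S, P[m]])
        D[m] = np.linalg.solve(Z @ Z.T + lam * np.eye(Z.shape[0]),
                               Z @ X[m].T).T
    # Update modality-specific codes P_m, holding the shared part fixed.
    for m in X:
        Dp = D[m][:, k_shared:]
        R = X[m] - D[m][:, :k_shared] @ S
        P[m] = np.linalg.solve(Dp.T @ Dp + lam * np.eye(k_spec), Dp.T @ R)
    # Update the shared code S jointly from all three modalities.
    A = sum(D[m][:, :k_shared].T @ D[m][:, :k_shared] for m in X) \
        + lam * np.eye(k_shared)
    B = sum(D[m][:, :k_shared].T @ (X[m] - D[m][:, k_shared:] @ P[m])
            for m in X)
    S = np.linalg.solve(A, B)

loss1 = recon_loss()
print(f"reconstruction loss: {loss0:.1f} -> {loss1:.1f}")
```

Because the shared block S must reconstruct all three modalities at once, it is pushed toward their common structure, while each P_m absorbs what is unique to its modality; the paper additionally couples these codes to max-margin action classifiers in the same objective.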
- Published
- 2019