1. MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation
- Author
- Yizhou Wang, Chunyu Wang, and Rongchang Xie
- Subjects
FOS: Computer and information sciences; Fusion; Scale (ratio); Computer science; business.industry; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; 02 engineering and technology; 010501 environmental sciences; 01 natural sciences; Margin (machine learning); 0202 electrical engineering, electronic engineering, information engineering; Key (cryptography); 020201 artificial intelligence & image processing; Computer vision; Artificial intelligence; business; Adaptation (computer science); Pose; 0105 earth and related environmental sciences
- Abstract
Cross-view feature fusion is the key to addressing the occlusion problem in human pose estimation. Current fusion methods need to train a separate model for every pair of cameras, making them difficult to scale. In this work, we introduce MetaFuse, a pre-trained fusion model learned from a large number of cameras in the Panoptic dataset. The model can be efficiently adapted or finetuned for a new pair of cameras using a small number of labeled images. The strong adaptation power of MetaFuse is due in large part to the proposed factorization of the original fusion model into two parts: (1) a generic fusion model shared by all cameras, and (2) lightweight camera-dependent transformations. Furthermore, the generic model is learned from many cameras by a meta-learning style algorithm to maximize its adaptation capability to various camera poses. We observe in experiments that MetaFuse finetuned on the public datasets outperforms the state of the art by a large margin, which validates its value in practice.
- Comment
- Accepted to CVPR 2020
- Published
- 2020