InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges

Authors :
Chen, Guo
Xing, Sen
Chen, Zhe
Wang, Yi
Li, Kunchang
Li, Yizhuo
Liu, Yi
Wang, Jiahao
Zheng, Yin-Dong
Huang, Bingkun
Zhao, Zhiyu
Pan, Junting
Huang, Yifei
Wang, Zun
Yu, Jiashuo
He, Yinan
Zhang, Hongjie
Lu, Tong
Wang, Yali
Wang, Limin
Qiao, Yu
Publication Year :
2022

Abstract

In this report, we present our champion solutions to five tracks of the Ego4D challenge. We leverage InternVideo, our video foundation model, for five Ego4D tasks: Moment Queries, Natural Language Queries, Future Hand Prediction, State Change Object Detection, and Short-term Object Interaction Anticipation. InternVideo-Ego4D is an effective paradigm for adapting a strong foundation model to downstream ego-centric video understanding tasks with simple head designs. On all five tasks, InternVideo-Ego4D surpasses both the baseline methods and the CVPR 2022 champions, demonstrating the powerful representation ability of InternVideo as a video foundation model. Our code will be released at https://github.com/OpenGVLab/ego4d-eccv2022-solutions

Comment: Technical report in 2nd International Ego4D Workshop@ECCV 2022.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2211.09529
Document Type :
Working Paper