MLS-Track: Multilevel Semantic Interaction in RMOT

Authors :
Ma, Zeliang
Yang, Song
Cui, Zhe
Zhao, Zhicheng
Su, Fei
Liu, Delong
Wang, Jingyu
Publication Year :
2024

Abstract

The new trend in multi-object tracking is to track objects of interest using natural language. However, the scarcity of paired prompt-instance data hinders its progress. To address this challenge, we propose a high-quality yet low-cost data generation method based on Unreal Engine 5 and construct a brand-new benchmark dataset, named Refer-UE-City, which primarily includes scenes from intersection surveillance videos, detailing the appearance and actions of people and vehicles. Specifically, it provides 14 videos with a total of 714 expressions, and is comparable in scale to the Refer-KITTI dataset. Additionally, we propose a multi-level semantic-guided multi-object tracking framework called MLS-Track, where the interaction between the model and text is enhanced layer by layer through the introduction of a Semantic Guidance Module (SGM) and a Semantic Correlation Branch (SCB). Extensive experiments on the Refer-UE-City and Refer-KITTI datasets demonstrate the effectiveness of our proposed framework, which achieves state-of-the-art performance. Code and datasets will be made available.

Comment: 17 pages, 8 figures

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2404.12031
Document Type :
Working Paper