Back to Search Start Over

Bilateral Temporal Re-Aggregation for Weakly-Supervised Video Object Segmentation.

Authors :
Lin, Fanchao
Xie, Hongtao
Liu, Chuanbin
Zhang, Yongdong
Source :
IEEE Transactions on Circuits & Systems for Video Technology. Jul2022, Vol. 32 Issue 7, p4498-4512. 15p.
Publication Year :
2022

Abstract

Weakly-supervised video object segmentation is an emerging video task to track and segment the target given a simple bounding box label, which requires the method to fully catch and utilize the target information. Most existing approaches only rely on the guidance of a single frame and ignore the interaction between different frames when gathering information, making them hard to achieve reliable target representation. In this paper, we propose to capture the temporal dependencies and gather information from multiple frames through bilateral temporal re-aggregation. We explore three schemes to build the aggregation: 1) a two-stage re-aggregation mechanism is applied to provide target prior to the current frame, which obtains more valid feature matching and information aggregation; 2) a query-memory bilateral aggregation module is proposed to aggregate features from an unlimited amount of past frames and enable the mutual perception between different frames to validate the gathered information; 3) we guide the learning of aggregation modules through a novel cross-task representation distillation, transferring the knowledge from a semi-supervised model to our weakly-supervised model without increasing the inference latency. These schemes collaboratively build an efficient and competent aggregation process, thus we can fully exploit the video context to make the inference. Experimental results on four benchmarks show that our method achieves superior performance than previous methods and still maintains the efficiency ($e.g$., overall scores of 70.4% and 72.5% on the YouTube-VOS and DAVIS 2017 validation sets, respectively). [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
10518215
Volume :
32
Issue :
7
Database :
Academic Search Index
Journal :
IEEE Transactions on Circuits & Systems for Video Technology
Publication Type :
Academic Journal
Accession number :
157765752
Full Text :
https://doi.org/10.1109/TCSVT.2021.3127562