Weakly-supervised video object localization with attentive spatio-temporal correlation
- Source :
- Pattern Recognition Letters. 145:232-239
- Publication Year :
- 2021
- Publisher :
- Elsevier BV, 2021.
Abstract
- Weakly-supervised video object localization is a challenging yet important task. The system must spatially localize the object of interest in videos, given only descriptive sentences and their corresponding video segments at training time. Recent efforts apply image-based Multiple Instance Learning (MIL) theory to this video task and propagate the supervision from the video to its frames through different frame-weighting strategies. Despite their promising progress, the spatio-temporal correlation between different object regions in videos has been largely ignored. To fill this research gap, we introduce a simple but effective feature expression and aggregation framework, which uses the self-attention mechanism to capture the latent spatio-temporal correlation between multimodal object features, and we design a multimodal interaction module to model the similarity between the semantic query in sentences and the object regions in videos. We conduct extensive experimental evaluation on the YouCookII and ActivityNet-Entities datasets, which demonstrates significant improvements over multiple competitive baselines.
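The two components named in the abstract, self-attention over object-region features and a query-to-region similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no architectural details, so the sketch assumes plain scaled dot-product self-attention without learned projections, cosine similarity as the interaction score, and random features in place of real region and query embeddings. All function names (`self_attention`, `query_region_similarity`, `localize`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(regions):
    """Scaled dot-product self-attention over all regions of a video segment.

    regions: (T * N, d) array holding N region features for each of T frames,
    flattened so every region can attend to regions in every frame.
    Assumption: no learned query/key/value projections are used here.
    Returns the attended features and the attention weights.
    """
    d = regions.shape[-1]
    weights = softmax(regions @ regions.T / np.sqrt(d), axis=-1)
    return weights @ regions, weights


def query_region_similarity(query, regions):
    """Cosine similarity between one query embedding (d,) and regions (M, d)."""
    q = query / (np.linalg.norm(query) + 1e-8)
    r = regions / (np.linalg.norm(regions, axis=-1, keepdims=True) + 1e-8)
    return r @ q


def localize(query, regions, num_frames, num_regions):
    """Pick, for each frame, the region most similar to the query."""
    attended, _ = self_attention(regions)
    sim = query_region_similarity(query, attended)
    sim = sim.reshape(num_frames, num_regions)
    return sim.argmax(axis=1), sim


# Toy data: 4 frames, 5 candidate regions per frame, 16-dimensional features.
rng = np.random.default_rng(0)
T, N, d = 4, 5, 16
regions = rng.normal(size=(T * N, d))
query = rng.normal(size=d)

best_region_per_frame, sim = localize(query, regions, T, N)
```

Flattening frames and regions into one sequence is what lets a region in one frame attend to regions in other frames, which is one simple way to expose spatio-temporal correlation. A trained model would learn the projections and the similarity function from the sentence-level supervision rather than use fixed ones.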
- Subjects :
- Semantic query
Similarity (geometry)
business.industry
Computer science
Pattern recognition
02 engineering and technology
Object (computer science)
01 natural sciences
Expression (mathematics)
Multimodal interaction
Task (project management)
Image (mathematics)
Artificial Intelligence
Feature (computer vision)
0103 physical sciences
Signal Processing
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Computer Vision and Pattern Recognition
Artificial intelligence
010306 general physics
business
Software
Details
- ISSN :
- 0167-8655
- Volume :
- 145
- Database :
- OpenAIRE
- Journal :
- Pattern Recognition Letters
- Accession number :
- edsair.doi...........e932f3afa0c8b49d7cf8d57518f81091