
Multi-scale spatio-temporal feature adaptive aggregation for video-based Person Re-identification.

Authors :
Zhao, Wei
Huang, Yan
Wang, Guoyou
Zhang, Bo
Gao, Yuhang
Liu, Yuze
Source :
Knowledge-Based Systems. Sep2024, Vol. 299, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Video-based person re-identification (Re-ID) aims to retrieve a target person from video sequences captured by a distributed camera system. It remains a challenging task because of factors such as occlusion and misalignment in the video. To address these problems, many methods have been proposed to exploit multi-scale spatio-temporal features in videos. However, established methods typically assign equal weights to temporal or spatial features at different scales, which significantly diminishes the distinct role of each feature. In this paper, we propose a novel Multi-scale Feature Aggregation Network (MFANet) for video-based person Re-ID. Specifically, we propose two flexible modules, Multi-scale Temporal Feature Aggregation (MTFA) and Multi-scale Spatial Feature Aggregation (MSFA). These two modules first extract temporal (dynamic and static) and spatial (coarse and fine) features at different scales, and then adaptively assign weights to each feature according to the video sequence. Both of these lightweight modules can be incorporated into a 3D convolutional neural network to build our MFANet. Extensive experiments on four public benchmarks demonstrate that MTFA and MSFA improve the performance of baseline architectures, and that our MFANet outperforms other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
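
The contrast the abstract draws between equal weighting and adaptive weighting of multi-scale features can be illustrated with a toy sketch. This is not the MTFA or MSFA design, which the abstract does not specify. The softmax gating, the `gate` vector, and the function names `softmax`, `sequence_scores`, and `aggregate` are all assumptions made for illustration, and the two-dimensional vectors stand in for real feature maps.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sequence_scores(features, gate):
    """Hypothetical scoring step: dot product of each feature with a gate vector.

    In a trained network the scores would depend on the input video sequence;
    here `gate` is a fixed stand-in for that learned dependence.
    """
    return [sum(g * x for g, x in zip(gate, f)) for f in features]

def aggregate(features, scores=None):
    """Weighted sum of equal-length feature vectors.

    scores=None reproduces the equal-weight baseline; otherwise the
    weights are the softmax of the per-feature scores.
    """
    n = len(features)
    weights = [1.0 / n] * n if scores is None else softmax(scores)
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights

# Toy "dynamic" and "static" temporal features (illustrative values only).
dynamic, static = [1.0, 0.0], [0.0, 1.0]
features = [dynamic, static]

equal_fused, equal_w = aggregate(features)
gate = [2.0, 0.0]  # assumed gate that favours the dynamic feature
adaptive_fused, adaptive_w = aggregate(features, sequence_scores(features, gate))
```

With equal weights both features contribute the same amount. With the assumed gate the dynamic feature receives the larger weight, so the fused vector leans toward it.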

Details

Language :
English
ISSN :
09507051
Volume :
299
Database :
Academic Search Index
Journal :
Knowledge-Based Systems
Publication Type :
Academic Journal
Accession number :
178884575
Full Text :
https://doi.org/10.1016/j.knosys.2024.111980