
Attention U-Net based on multi-scale feature extraction and WSDAN data augmentation for video anomaly detection.

Authors :
Lei, Shanzhong
Song, Junfang
Wang, Tengjiao
Wang, Fangxin
Yan, Zhuyang
Source :
Multimedia Systems. Jun 2024, Vol. 30 Issue 3, p1-20. 20p.
Publication Year :
2024

Abstract

The widespread adoption of video surveillance systems in public security and network security domains has underscored the importance of video anomaly detection as a pivotal research area. To enhance the precision and robustness of anomaly detection, this manuscript introduces an innovative method for video anomaly detection. The approach begins with the application of multi-scale feature extraction technology to capture visual information across varying scales in video data. Leveraging the Spatial Pyramid Convolution (SPC) module as the cornerstone for multi-scale feature learning, the study addresses the impact of scale variations, thereby augmenting the model’s detection capabilities across different scales. Furthermore, a Weakly Supervised Data Augmentation Network (WSDAN) module is incorporated to perform attention-guided data augmentation, enhancing the richness of the input images. These augmented images are then used to train the U-Net network, improving detection accuracy. Additionally, the integration of an improved Convolutional Block Attention Module (CBAM) into the base U-Net architecture enables end-to-end training. CBAM dynamically adjusts feature map weights, allowing the model to concentrate on anomaly-relevant regions in the video while suppressing interference from non-anomalous areas. To assess anomalies, the paper employs the Peak Signal-to-Noise Ratio (PSNR) between predicted and original frames, normalizing PSNR values for anomaly identification. The proposed method is evaluated on the publicly available CUHK Avenue and UCSD Ped2 datasets, with results presented visually. Experimental findings show Area Under the Receiver Operating Characteristic Curve (AUC) values of 86.2% and 97.9% on these datasets, surpassing comparative methods and confirming the effectiveness and superiority of the proposed approach. [ABSTRACT FROM AUTHOR]
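The abstract's scoring step (PSNR between predicted and original frames, normalized for anomaly identification) can be sketched as follows. This is a minimal illustration of the common convention in prediction-based anomaly detection, not the authors' released code; the min-max normalization over a video clip, the `max_val` pixel range, and the use of `1 - normalized PSNR` as the anomaly score are assumptions based on standard practice.

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    """Peak Signal-to-Noise Ratio between a predicted frame and its ground-truth frame."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_val ** 2) / mse)

def anomaly_scores(pred_frames, gt_frames, max_val=1.0):
    """Per-frame anomaly scores from min-max-normalized PSNR over one video clip.

    Normal frames are predicted well (high PSNR), so the score 1 - normalized PSNR
    is near 0 for normal frames and near 1 for poorly predicted (anomalous) frames.
    """
    psnrs = np.array([psnr(p, g, max_val) for p, g in zip(pred_frames, gt_frames)])
    norm = (psnrs - psnrs.min()) / (psnrs.max() - psnrs.min() + 1e-8)
    return 1.0 - norm
```

In this setup, frame-level AUC (as reported for CUHK Avenue and UCSD Ped2) would be computed by comparing these per-frame scores against the dataset's ground-truth anomaly labels.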

Details

Language :
English
ISSN :
0942-4962
Volume :
30
Issue :
3
Database :
Academic Search Index
Journal :
Multimedia Systems
Publication Type :
Academic Journal
Accession number :
176525207
Full Text :
https://doi.org/10.1007/s00530-024-01320-0