
AVMSN: An Audio-Visual Two Stream Crowd Counting Framework Under Low-Quality Conditions

Authors :
Yongqian Xu
Jiaqi Chen
Ruihan Hu
Hongjian Zhou
Qinglong Mo
Yuanfei Xie
Yalun Yang
Edmond Q. Wu
Zhiri Tang
Source :
IEEE Access, Vol. 9, pp. 80500-80510 (2021)
Publication Year :
2021
Publisher :
IEEE, 2021.

Abstract

Crowd counting is an essential computer vision application that typically uses convolutional neural networks to model crowd density as a regression task. However, vision-only models struggle to extract reliable features under low-quality conditions. Vision and audio are the primary media through which humans perceive changes in the physical world, and this cross-modal information offers an alternative route to the crowd counting task. To address this problem, this paper establishes a model named the Audio-Visual Multi-Scale Network (AVMSN), which models unconstrained visual and audio sources to complete the crowd counting task. The AVMSN is built on a Feature extraction module and a Multi-modal fusion module. To handle objects of various sizes in the crowd scene, the Feature extraction module adopts Sample Convolutional Blocks as its multi-scale Vision-end branch to compute a weighted visual feature. In the audio branch, the temporal-domain audio signal is transformed into a spectrogram, from which the audio feature is learned by an audio-VGG network. Finally, the weighted visual and audio features are fused by the Multi-modal fusion module, which adopts a cascade fusion architecture to estimate the density map. Experimental results show that the proposed AVMSN achieves a lower mean absolute error than other state-of-the-art crowd counting models under low-quality conditions.
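As a rough illustration of the two-stream design described in the abstract, the sketch below (in PyTorch) wires a multi-scale visual branch and a small VGG-style audio branch operating on spectrograms into a shared density-map head. All module names, layer sizes, and the simple concatenation-based fusion (which stands in for the paper's cascade fusion and Sample Convolutional Blocks) are illustrative assumptions, not the authors' implementation.

# Minimal sketch of an AVMSN-style two-stream crowd counter.
# Hypothetical layer choices; the real model's blocks and fusion differ in detail.
import torch
import torch.nn as nn


class MultiScaleVisionBranch(nn.Module):
    """Multi-scale visual feature extractor (stand-in for the Sample Convolutional Blocks)."""
    def __init__(self, channels=64):
        super().__init__()
        # Parallel convolutions with different receptive fields handle heads of various sizes.
        self.branch3 = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(3, channels, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(3, channels, kernel_size=7, padding=3)
        # Learned per-scale weights produce the "weighted" visual feature.
        self.scale_weights = nn.Parameter(torch.ones(3))

    def forward(self, image):
        feats = [self.branch3(image), self.branch5(image), self.branch7(image)]
        w = torch.softmax(self.scale_weights, dim=0)
        return sum(wi * fi for wi, fi in zip(w, feats))


class AudioVGGBranch(nn.Module):
    """Small VGG-like encoder for log-mel spectrograms of shape (B, 1, F, T)."""
    def __init__(self, out_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, out_dim)

    def forward(self, spectrogram):
        x = self.pool(self.features(spectrogram)).flatten(1)
        return self.fc(x)  # (B, out_dim) global audio embedding


class AVMSNSketch(nn.Module):
    """Fuses the visual feature map with the broadcast audio embedding into a density map."""
    def __init__(self, channels=64):
        super().__init__()
        self.vision = MultiScaleVisionBranch(channels)
        self.audio = AudioVGGBranch(channels)
        self.fusion = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 1),  # single-channel density map
        )

    def forward(self, image, spectrogram):
        v = self.vision(image)                               # (B, C, H, W)
        a = self.audio(spectrogram)                          # (B, C)
        a_map = a[:, :, None, None].expand_as(v)             # broadcast audio over space
        return self.fusion(torch.cat([v, a_map], dim=1))     # estimated density map


if __name__ == "__main__":
    model = AVMSNSketch()
    img = torch.randn(2, 3, 128, 128)        # low-quality RGB frames
    spec = torch.randn(2, 1, 64, 100)        # ambient-audio spectrograms
    density = model(img, spec)
    print(density.shape, density.sum(dim=(1, 2, 3)))  # maps and estimated counts

In this sketch the predicted crowd count is the spatial sum of the density map, and the mean absolute error between predicted and ground-truth counts would serve as the evaluation metric, matching the abstract's reported measure.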

Details

Language :
English
ISSN :
2169-3536
Volume :
9
Database :
OpenAIRE
Journal :
IEEE Access
Accession number :
edsair.doi.dedup.....9ede3c2252cbc8de5aeb6e00b22cc3b1