Acoustic event diarization in TV/movie audios using deep embedding and integer linear programming

Authors :
Mingle Liu
Jichen Yang
Yanxiong Li
Yuhan Zhang
Xianku Li
Wang Wucheng
Source :
Multimedia Tools and Applications. 78:33999-34025
Publication Year :
2019
Publisher :
Springer Science and Business Media LLC, 2019.

Abstract

In this study, we propose a method for acoustic event diarization that combines a deep embedding feature with a clustering algorithm based on integer linear programming. The deep embedding, learned by a deep auto-encoder network, represents the properties of different classes of acoustic events; integer linear programming is then used to merge audio segments that belong to the same class of acoustic events. Four kinds of TV/movie audio (21.5 h in total) are used as experimental data: Sport, Situation comedy, Award ceremony, and Action movie. We compare the deep embedding with state-of-the-art features, and the integer linear programming clustering algorithm with clustering algorithms adopted in previous works. Finally, the proposed method is compared with both supervised and unsupervised methods on the four kinds of TV/movie audio. The results show that, in terms of acoustic event error, the proposed method outperforms other unsupervised methods based on agglomerative information bottleneck, Bayesian information criterion, and spectral clustering, and is slightly inferior to the supervised method based on a deep neural network.
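As a rough illustration of the clustering step described in the abstract, the sketch below groups segment embeddings by minimizing an ILP-style objective: the number of cluster centers plus the normalized distance of each segment to its center, with every segment required to lie within a threshold of its center. This formulation is borrowed from ILP clustering as commonly used in speaker diarization; whether the paper uses exactly this objective is an assumption. The function name `ilp_cluster_bruteforce`, the Euclidean distance, the threshold `delta`, and the toy 2-D embeddings are all hypothetical, and the tiny instance is solved by exhaustive enumeration rather than by an ILP solver.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ilp_cluster_bruteforce(embeddings, delta):
    """Exactly minimize an ILP-style clustering objective by enumeration.

    Objective (assumed, not taken from the paper):
        (#centers) + (1 / delta) * sum_j dist(center(j), j)
    Constraints: every segment is assigned to exactly one center,
    and dist(center(j), j) <= delta.
    Only practical for a handful of segments (2^n candidate center sets).
    """
    n = len(embeddings)
    dist = [[euclidean(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]
    best_cost, best_labels = math.inf, None
    for size in range(1, n + 1):
        for centers in combinations(range(n), size):
            cost = float(size)
            labels = []
            feasible = True
            for j in range(n):
                # For a fixed center set, the nearest center is optimal.
                k = min(centers, key=lambda c: dist[c][j])
                if dist[k][j] > delta:
                    feasible = False
                    break
                cost += dist[k][j] / delta
                labels.append(k)
            if feasible and cost < best_cost:
                best_cost, best_labels = cost, labels
    return best_labels, best_cost


# Toy data: four hypothetical segment embeddings forming two tight groups.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels, cost = ilp_cluster_bruteforce(toy_embeddings, delta=1.0)
print(labels, cost)
```

Each entry of `labels` is the index of the segment chosen as the center for that segment, so segments sharing a label would be merged into one acoustic event class. A practical system would replace the enumeration with an ILP solver and use embeddings produced by the trained auto-encoder.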

Details

ISSN :
1573-7721 and 1380-7501
Volume :
78
Database :
OpenAIRE
Journal :
Multimedia Tools and Applications
Accession number :
edsair.doi...........c0532c774686267bc857338645a7ac6f