Acoustic event diarization in TV/movie audios using deep embedding and integer linear programming
- Source :
- Multimedia Tools and Applications. 78:33999-34025
- Publication Year :
- 2019
- Publisher :
- Springer Science and Business Media LLC, 2019.
Abstract
- In this study, we propose a method for acoustic event diarization that combines a deep embedding feature with a clustering algorithm based on integer linear programming. The deep embedding, learned by a deep auto-encoder network, represents the properties of different classes of acoustic events; integer linear programming is then used to merge audio segments that belong to the same class of acoustic event. Four kinds of TV/movie audio (21.5 h in total) are used as experimental data: Sport, Situation comedy, Award ceremony, and Action movie. We compare the deep embedding with state-of-the-art features, and the integer linear programming clustering algorithm with clustering algorithms adopted in previous works. Finally, we compare the proposed method with both supervised and unsupervised methods on the four kinds of TV/movie audio. The results show that, in terms of acoustic event error, the proposed method outperforms other unsupervised methods based on the agglomerative information bottleneck, the Bayesian information criterion, and spectral clustering, and is slightly inferior to the supervised method based on a deep neural network.
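The clustering step described in the abstract can be illustrated with a small sketch. The record does not give the paper's actual formulation, so everything below is an assumption: it uses a center-based integer program of the kind common in speaker diarization (minimize the number of cluster centers plus the normalized distance of each segment to its center, with each segment assigned to exactly one center no farther than a threshold `delta`). For readability it solves the program by exhaustive enumeration rather than with an ILP solver, and toy 2-D points stand in for deep embeddings. The function name `ilp_cluster_bruteforce` and the parameter `delta` are hypothetical.

```python
import math
from itertools import combinations


def ilp_cluster_bruteforce(points, delta):
    """Exact solution of an ASSUMED center-based clustering integer program.

    Objective: (#centers) + (1/delta) * sum of segment-to-center distances.
    Constraints: every segment has exactly one center, a center is assigned
    to itself, and no segment is farther than delta from its center.
    Exhaustive search; only practical for a handful of segments.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    best_cost, best_assign = math.inf, None

    # Enumerate every non-empty set of candidate centers.
    for size in range(1, n + 1):
        for centers in combinations(range(n), size):
            cost = float(size)
            assign = []
            feasible = True
            for j in range(n):
                if j in centers:
                    k = j  # a center represents itself at zero cost
                else:
                    # For fixed centers, the nearest one is the optimal choice.
                    k = min(centers, key=lambda c: dist[c][j])
                    if dist[k][j] > delta:
                        feasible = False
                        break
                    cost += dist[k][j] / delta
                assign.append(k)
            if feasible and cost < best_cost:
                best_cost, best_assign = cost, assign
    return best_assign, best_cost


# Toy "embeddings": two well-separated groups of segments.
segments = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
assignment, cost = ilp_cluster_bruteforce(segments, delta=1.0)
print(assignment, round(cost, 3))
```

Each entry of `assignment` is the index of the segment chosen as that segment's cluster center, so segments sharing a value belong to the same cluster. For realistic segment counts the same objective and constraints would be handed to a proper integer programming solver instead of being enumerated.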
- Subjects :
- Artificial neural network
Computer Networks and Communications
Computer science
Software engineering
Pattern recognition
Information bottleneck method
Engineering and technology
Spectral clustering
Hierarchical clustering
Hardware and Architecture
Bayesian information criterion
Electrical engineering, electronic engineering, information engineering
Media Technology
Embedding
Artificial intelligence
Cluster analysis
Integer programming
Software
Details
- ISSN :
- 1573-7721 and 1380-7501
- Volume :
- 78
- Database :
- OpenAIRE
- Journal :
- Multimedia Tools and Applications
- Accession number :
- edsair.doi...........c0532c774686267bc857338645a7ac6f