
Music detection from broadcast contents using convolutional neural networks with a Mel-scale kernel

Authors :
Byeong-Yong Jang
Woon-Haeng Heo
Jung-Hyun Kim
Oh-Wook Kwon
Source :
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2019, Iss 1, Pp 1-12 (2019)
Publication Year :
2019
Publisher :
SpringerOpen, 2019.

Abstract

We propose a new method for detecting music in broadcast content using a convolutional neural network with a Mel-scale kernel. In this detection task, music segments must be annotated in broadcast data in which music, speech, and noise are mixed. The convolutional neural network is composed of a convolutional layer with a kernel that is trained to extract robust features. The Mel scale changes the kernel size, and the backpropagation algorithm trains the kernel shape. We used 52 h of mixed broadcast data (25 h of music) to train the convolutional network and 24 h of collected broadcast data (music ratio of 50–76%) for testing. The test data covered various genres (drama, documentary, news, kids, reality, and so on) broadcast in British English, Spanish, and Korean. The proposed method consistently outperformed the baseline system in all three languages, with F-scores ranging from 86.5% for British data to 95.9% for Korean drama data. Our music detection system takes about 28 s to process a 1-min signal using a single CPU with 4 cores.
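To illustrate why a kernel tied to the Mel scale would change size across frequency, the sketch below splits a frequency range into bands of equal width on the Mel axis and prints each band's width in Hz. It is not the authors' implementation: the abstract does not give the kernel construction, so the Hz-to-Mel formula (the common 2595·log10(1 + f/700) variant), the 0–8000 Hz range, the choice of 8 bands, and all function names are assumptions made for illustration only.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to Mel using a commonly used formula (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min_hz, f_max_hz, n_bands):
    """Return n_bands + 1 band edges in Hz, equally spaced on the Mel axis."""
    mel_lo = hz_to_mel(f_min_hz)
    mel_hi = hz_to_mel(f_max_hz)
    step = (mel_hi - mel_lo) / n_bands
    return [mel_to_hz(mel_lo + i * step) for i in range(n_bands + 1)]


# Hypothetical settings: 0-8000 Hz split into 8 bands.
edges = mel_band_edges(0.0, 8000.0, 8)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

for i, (lo, w) in enumerate(zip(edges, widths)):
    print(f"band {i}: starts at {lo:7.1f} Hz, width {w:7.1f} Hz")
```

Bands that are equally wide in Mel become progressively wider in Hz, so a kernel sized by the Mel scale would span few frequency bins at low frequencies and many at high frequencies. How the paper maps this to actual kernel dimensions is described in the full text, not in this record.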

Details

Language :
English
ISSN :
1687-4722
Volume :
2019
Issue :
1
Database :
Directory of Open Access Journals
Journal :
EURASIP Journal on Audio, Speech, and Music Processing
Publication Type :
Academic Journal
Accession number :
edsdoj.2c0e3ab9d1f482c911485be849b7387
Document Type :
article
Full Text :
https://doi.org/10.1186/s13636-019-0155-y