Music detection from broadcast contents using convolutional neural networks with a Mel-scale kernel
- Source :
- EURASIP Journal on Audio, Speech, and Music Processing, Vol 2019, Iss 1, Pp 1-12 (2019)
- Publication Year :
- 2019
- Publisher :
- SpringerOpen, 2019.
Abstract
- We propose a new method for music detection in broadcast content using a convolutional neural network with a Mel-scale kernel. In this detection task, music segments must be annotated in broadcast data in which music, speech, and noise are mixed. The convolutional neural network contains a convolutional layer whose kernel is trained to extract robust features. The Mel scale determines the kernel size, and the backpropagation algorithm trains the kernel shape. We used 52 h of mixed broadcast data (25 h of music) to train the convolutional network and 24 h of collected broadcast data (music ratio of 50–76%) for testing. The test data consisted of various genres (drama, documentary, news, kids, reality, and so on) broadcast in British English, Spanish, and Korean. The proposed method consistently outperformed the baseline system in all three languages, and the F-score ranged from 86.5% for British data to 95.9% for Korean drama data. Our music detection system takes about 28 s to process a 1-min signal using only one CPU with 4 cores.
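The abstract states that the Mel scale sets the kernel size but does not give the exact rule. As a rough illustration only, the sketch below shows one way kernel widths could follow the Mel scale: it splits the spectrum into bands equally spaced in Mel and reports each band's width in FFT bins. The Hz-to-Mel formula is the common 2595 * log10(1 + f / 700) variant; the function names, the number of kernels, the sample rate, and the FFT size are all assumptions for illustration, not values taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to Mel (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_kernel_widths(n_kernels, sample_rate, n_fft):
    """Hypothetical helper: widths (in FFT bins) of bands equally spaced in Mel.

    This is an assumed sizing rule for illustration, not the paper's method.
    """
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_kernels + 1 band edges, equally spaced on the Mel axis, mapped back to Hz.
    edges_hz = [mel_to_hz(max_mel * i / n_kernels) for i in range(n_kernels + 1)]
    hz_per_bin = sample_rate / n_fft
    # Each band spans at least one FFT bin.
    return [
        max(1, round((hi - lo) / hz_per_bin))
        for lo, hi in zip(edges_hz, edges_hz[1:])
    ]


# Assumed example settings: 8 kernels, 16 kHz audio, 512-point FFT.
widths = mel_kernel_widths(n_kernels=8, sample_rate=16000, n_fft=512)
print(widths)
```

Under this assumed rule the widths grow with frequency, so low-frequency kernels are narrow and high-frequency kernels are wide, mirroring the resolution of the Mel scale. In the method described above, the kernel shape itself would then be learned by backpropagation rather than fixed.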
Details
- Language :
- English
- ISSN :
- 16874722
- Volume :
- 2019
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- EURASIP Journal on Audio, Speech, and Music Processing
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.2c0e3ab9d1f482c911485be849b7387
- Document Type :
- article
- Full Text :
- https://doi.org/10.1186/s13636-019-0155-y