Recognition of phonemes and words in singing
- Source :
- ICASSP
- Publication Year :
- 2010
- Publisher :
- IEEE, 2010.
Abstract
- This paper studies the influence of n-gram language models on the recognition of sung phonemes and words. We train uni-, bi-, and trigram language models for phonemes and bi- and trigram models for words. The word-level language model is estimated from a textual lyrics database. For recognition, we use a hidden Markov model based phonetic recognizer adapted to the singing voice. The models were tested on monophonic singing and on vocal lines separated from polyphonic music. On clean singing, phoneme recognition accuracy varied from 20% (no language model) to 39% (bigram), and on polyphonic music from 6% (no language model) to 20% (bigram). In word recognition, one fifth of the words were recognized in clean singing, with lower performance on polyphonic music. We also study the use of the recognition results in a query-by-singing application: using the recognized words, we retrieve songs by searching for the text in a lyrics database. Even with a word recognition system that has only a 24% correct recognition rate, the first retrieved song is correct in 57% of the test cases.
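The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how recognized words are matched against the lyrics database, so the word-overlap scoring below is an assumption chosen for simplicity, and the function names (`tokenize`, `rank_songs`) and the toy lyrics are hypothetical. It shows why a noisy transcript can still point to the right song: a few correctly recognized words may be enough to separate it from the others.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase a lyrics string and split it into words, dropping punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def rank_songs(recognized_words, lyrics_db):
    """Rank song titles by word overlap with the recognizer output.

    The score is the number of recognized words that also occur in the
    song's lyrics, counting repeats at most as often as they occur in the
    lyrics. This scoring is an illustrative assumption, not the paper's method.
    """
    query = Counter(w.lower() for w in recognized_words)
    scores = {}
    for title, lyrics in lyrics_db.items():
        song_counts = Counter(tokenize(lyrics))
        scores[title] = sum(min(n, song_counts[w]) for w, n in query.items())
    # Highest score first; ties are broken alphabetically by title.
    return sorted(scores, key=lambda title: (-scores[title], title))


# Hypothetical toy database; the titles and lyrics are made up for illustration.
toy_lyrics_db = {
    "song_a": "Walking home under the silver moon tonight",
    "song_b": "The river runs down to the sea",
    "song_c": "Dancing all night under city lights",
}

# Hypothetical recognizer output with errors ("sliver", "to night").
noisy_query = ["under", "the", "sliver", "moon", "to", "night"]
ranking = rank_songs(noisy_query, toy_lyrics_db)
```

In this toy case only three of the six recognized words match the intended song, yet it still ranks first because the other songs share fewer words with the query.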
- Subjects :
- Computer science
Speech recognition
Bigram
Lyrics
Musical acoustics
Rule-based machine translation
Word recognition
Music information retrieval
Trigram
Polyphony
Artificial intelligence
Language model
Singing
Hidden Markov model
Natural language processing
Natural language
Details
- Database :
- OpenAIRE
- Journal :
- 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- Accession number :
- edsair.doi...........dcf8ff2a5482f2026afe655ee89cc92a
- Full Text :
- https://doi.org/10.1109/icassp.2010.5495585