
Acoustic data-driven pronunciation lexicon generation for logographic languages

Authors :
Guoguo Chen
Sanjeev Khudanpur
Daniel Povey
Source :
ICASSP
Publication Year :
2016
Publisher :
IEEE, 2016.

Abstract

Handcrafted pronunciation lexicons are widely used in modern speech recognition systems. Designing a pronunciation lexicon, however, requires a tremendous amount of expert knowledge and effort, which is not practical when applying speech recognition techniques to low-resource languages. In this paper, we are interested in developing speech recognition systems for logographic languages with only a small expert pronunciation lexicon. An iterative framework is proposed to generate and refine the phonetic transcripts of the training data, which are then aligned with their word-level transcripts for grapheme-to-phoneme (G2P) model training. A G2P model trained this way covers the graphemes that appear in the training transcripts (most of which are usually unseen in a small expert lexicon for logographic languages), and is therefore able to generate pronunciations for all the words in the transcripts. The proposed lexicon generation procedure is evaluated on Cantonese speech recognition and keyword search tasks. Experiments show that, starting from an expert lexicon of only 1K words, we are able to generate a lexicon that works reasonably well when compared with an expert-crafted lexicon of 5K words.
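
The coverage problem the abstract describes can be illustrated with a minimal sketch. It is not the authors' procedure: it only measures which graphemes (here, individual characters) in the training transcripts never occur in a seed lexicon, which is the gap the transcript-trained G2P model is meant to close. The function name `unseen_graphemes`, the toy lexicon, and the toy transcripts are all hypothetical.

```python
def unseen_graphemes(seed_lexicon, transcripts):
    """Return the characters used in the transcripts that appear in no seed-lexicon word."""
    # Every character that occurs in at least one word of the seed lexicon.
    seen = {ch for word in seed_lexicon for ch in word}
    # Every character that occurs in the whitespace-separated transcript words.
    used = {ch for line in transcripts for word in line.split() for ch in word}
    return used - seen


# Hypothetical toy data: a one-word seed lexicon (word -> illustrative pronunciation)
# and two word-level transcripts.
seed_lexicon = {"你好": "nei5 hou2"}
transcripts = ["你好 世界", "世界 你"]

missing = unseen_graphemes(seed_lexicon, transcripts)
print(sorted(missing))
```

In a logographic writing system the set returned here tends to be large relative to a small seed lexicon, which is why a G2P model trained on the seed lexicon alone cannot produce pronunciations for many transcript words.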

Details

Database :
OpenAIRE
Journal :
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Accession number :
edsair.doi...........d0d6c299e5adc9d99a8c6d89242d8e7d
Full Text :
https://doi.org/10.1109/icassp.2016.7472699