Thousands of voices for HMM-based speech synthesis
- Source :
- Yamagishi, J, Usabaev, B, King, S, Watts, O, Dines, J, Tian, J, Hu, R, Guan, Y, Oura, K, Tokuda, K, Karhila, R & Kurimo, M 2009, Thousands of voices for HMM-based speech synthesis. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009; Brighton, United Kingdom, pp. 420-423. Scopus-Elsevier
- Publication Year :
- 2009
- Publisher :
- ISCA, 2009.
Abstract
- Our recent experiments with HMM-based speech synthesis systems have demonstrated that speaker-adaptive HMM-based speech synthesis (which uses an 'average voice model' plus model adaptation) is robust to non-ideal speech data that are recorded under various conditions and with varying microphones, that are not perfectly clean, and/or that lack phonetic balance. This enables us to consider building high-quality voices on 'non-TTS' corpora such as ASR corpora. Since ASR corpora generally include a large number of speakers, this leads to the possibility of producing an enormous number of voices automatically. In this paper we show thousands of voices for HMM-based speech synthesis that we have made from several popular ASR corpora such as the Wall Street Journal databases (WSJ0/WSJ1/WSJCAM0), Resource Management, Globalphone and Speecon. We report some perceptual evaluation results and outline the outstanding issues.
- Subjects :
- ComputingMethodologies_PATTERNRECOGNITION
05 social sciences
0202 electrical engineering, electronic engineering, information engineering
020207 software engineering
0501 psychology and cognitive sciences
02 engineering and technology
ComputingMethodologies_ARTIFICIALINTELLIGENCE
050107 human factors
Details
- Database :
- OpenAIRE
- Journal :
- Interspeech 2009
- Accession number :
- edsair.doi.dedup.....5e6b2f5b8c9337711949ff09249f73e0