Investigations on speaker adaptation using a continuous vocoder within recurrent neural network based text-to-speech synthesis.

Authors :
Mandeel, Ali Raheem
Al-Radhi, Mohammed Salah
Csapó, Tamás Gábor
Source :
Multimedia Tools & Applications; Apr2023, Vol. 82 Issue 10, p15635-15649, 15p
Publication Year :
2023

Abstract

This paper presents an investigation of speaker adaptation using a continuous vocoder for parametric text-to-speech (TTS) synthesis. In applications that demand low computational complexity, conventional vocoder-based statistical parametric speech synthesis can be preferable. Recent neural vocoders, while capable of remarkable naturalness, still fall short of the requirements for real-time synthesis. We investigate our earlier continuous vocoder, in which the excitation is characterized by two one-dimensional parameters: Maximum Voiced Frequency and continuous fundamental frequency (F0). We show that an average voice can be trained for deep neural network-based TTS using data from nine English speakers. We conducted speaker adaptation experiments for each target speaker with 400 utterances (approximately 14 minutes). Using recurrent neural network topologies yielded a clear improvement in the quality and naturalness of synthesized speech compared to our previous work. According to the objective measures (Mel-Cepstral Distortion and F0 correlation), the quality of speaker adaptation using Continuous Vocoder-based DNN-TTS is slightly better than that of the WORLD Vocoder-based baseline. The subjective MUSHRA-like test results also showed that our speaker adaptation technique is almost as natural as the WORLD vocoder when using Gated Recurrent Unit and Long Short-Term Memory networks. The proposed vocoder, being capable of real-time synthesis, can be used for applications that need fast synthesis. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
13807501
Volume :
82
Issue :
10
Database :
Complementary Index
Journal :
Multimedia Tools & Applications
Publication Type :
Academic Journal
Accession number :
162683393
Full Text :
https://doi.org/10.1007/s11042-022-14005-5