Enhancement of esophageal speech obtained by a voice conversion technique using time dilated Fourier cepstra.

Authors :
Ben Othmane, Imen
Di Martino, Joseph
Ouni, Kaïs
Source :
International Journal of Speech Technology; Mar 2019, Vol. 22 Issue 1, p99-110, 12p
Publication Year :
2019

Abstract

This paper presents a novel speaking-aid system for enhancing esophageal speech (ES). The method aims to improve the quality of esophageal speech by combining a voice conversion technique with a time dilation algorithm. In the proposed system, a Deep Neural Network (DNN) is used as a nonlinear mapping function for vocal tract vector transformation. The converted frames are then used to determine realistic excitation and phase vectors from the target training space using a frame selection algorithm. Next, to preserve the speaker identity of the esophageal speakers, the source vocal tract features are retained and a time dilation algorithm is applied to them to reduce unpleasant esophageal noises. Finally, the converted speech is reconstructed from the dilated source vocal tract frames and the predicted excitation and phase. DNN and Gaussian mixture model (GMM) based voice conversion systems have been evaluated using objective and subjective measures. The experimental study was also carried out to evaluate changes in the speech quality and intelligibility of the transformed signals. Experimental results demonstrate that the proposed methods provide considerable improvement in the intelligibility and naturalness of the converted esophageal speech. [ABSTRACT FROM AUTHOR]
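
The frame selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance measure or selection criterion the authors use, so the version below assumes a plain nearest-neighbour lookup with Euclidean distance: for each converted vocal tract frame, it finds the closest vocal tract frame in the target training set and returns the excitation vector stored alongside it. The function name `select_frames` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def select_frames(converted, target_vocal_tract, target_excitation):
    """Pick one target excitation vector per converted frame.

    Assumption (not from the abstract): selection is a nearest-neighbour
    search using Euclidean distance between vocal tract vectors.
    target_vocal_tract[i] and target_excitation[i] come from the same
    target training frame.
    """
    selected = []
    for frame in converted:
        # Index of the target training frame closest to this converted frame.
        best = min(
            range(len(target_vocal_tract)),
            key=lambda i: math.dist(frame, target_vocal_tract[i]),
        )
        selected.append(target_excitation[best])
    return selected


# Toy data: two converted frames, two target training frames.
converted = [[0.1, 0.2], [0.9, 1.1]]
target_vt = [[0.0, 0.0], [1.0, 1.0]]
target_exc = [[10.0], [20.0]]
print(select_frames(converted, target_vt, target_exc))
```

A real system would work on cepstral vectors with many more dimensions, and might add a continuity cost between consecutive selected frames; neither detail is specified in the abstract.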

Details

Language :
English
ISSN :
1381-2416
Volume :
22
Issue :
1
Database :
Complementary Index
Journal :
International Journal of Speech Technology
Publication Type :
Academic Journal
Accession number :
134970177
Full Text :
https://doi.org/10.1007/s10772-018-09579-1