Enhancing the Intelligibility of Statistically Generated Synthetic Speech by Means of Noise-Independent Modifications

Authors :: Yannis Stylianou
Daniel Erro
Tudor-Catalin Zorila
Source :: IEEE/ACM Transactions on Audio, Speech, and Language Processing. 22:2101-2111
Publication Year :: 2014
Publisher :: Institute of Electrical and Electronics Engineers (IEEE), 2014.
Abstract: When speaking devices such as smartphones, tablet-PCs, or GPS systems are used in noisy outdoor environments, the intelligibility of speech significantly drops. This is even more pronounced when synthetic speech is used. This article describes how a statistical parametric speech synthesis system trained on an ordinary synthesis database can be designed to generate highly intelligible speech, even at very low signal-to-noise ratios. By using a simple and flexible vocoder based on a full-band harmonic model, the proposed system applies deterministic noise-independent modifications at several levels: speaking rate, average fundamental frequency level and range, energy contour over time, formant sharpness, and intensity of specific spectral bands. The degree of intelligibility achieved by the system has been evaluated by means of a large-scale subjective test, the results of which show that the suggested approach clearly outperforms a reference state-of-the-art TTS system and also unmodified natural speech in some conditions. In comparison with alternative systems evaluated in the same framework, the proposed one exhibits the best performance in the scenarios with lowest signal-to-noise ratio. Finally, the impact of the suggested modifications on naturalness, quality and similarity to the original natural voice is quantified by means of a subjective test.

Subjects :: Voice activity detection
Acoustics and Ultrasonics
Computer science
Speech recognition
Speech coding
Speech synthesis
PSQM
Intelligibility (communication)
Linear predictive coding
Speech processing
Speech enhancement
Computational Mathematics
Computer Science (miscellaneous)
Electrical and Electronic Engineering

Details

ISSN :: 2329-9304 and 2329-9290
Volume :: 22
Database :: OpenAIRE
Journal :: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Accession number :: edsair.doi...........d24dae59535bd999fae4b0265fc254c2
Full Text :: https://doi.org/10.1109/taslp.2014.2361022
