Investigating automatic & human filled pause insertion for speech synthesis
- Source :
- INTERSPEECH, Dall, R, Tomalin, M, Wester, M, Byrne, W & King, S 2014, Investigating Automatic & Human Filled Pause Insertion for Speech Synthesis. In Proc. Interspeech.
- Publication Year :
- 2014
- Publisher :
- ISCA, 2014.
Abstract
- Filled pauses are pervasive in conversational speech and have been shown to serve several psychological and structural purposes. Despite this, they are seldom modelled overtly by state-of-the-art speech synthesis systems. This paper seeks to motivate the incorporation of filled pauses into speech synthesis systems by exploring their use in conversational speech, and by comparing the performance of several automatic systems that insert filled pauses into fluent text. Two initial experiments are described which seek to determine whether people’s predicted insertion points are consistent with actual practice and/or with each other. The experiments also investigate whether there are ‘right’ and ‘wrong’ places to insert filled pauses. The results show good consistency between people’s predictions of usage and their actual practice, as well as a perceptual preference for the ‘right’ placement. The third experiment contrasts the performance of several automatic systems that insert filled pauses into fluent sentences. The best performance (determined by F-score) was achieved through the by-word interpolation of probabilities predicted by Recurrent Neural Network and 4-gram Language Models. The results offer insights into the use and perception of filled pauses by humans, and how automatic systems can be used to predict their locations. Index Terms: filled pause, HMM TTS, SVM, RNN
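The by-word interpolation and F-score evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple linear interpolation with a single weight `lam` and a fixed decision threshold; the abstract does not state the actual scheme, weights, or threshold, and the function names and probability values below are hypothetical.

```python
from typing import List, Set


def interpolate(p_rnn: List[float], p_ngram: List[float], lam: float = 0.5) -> List[float]:
    """Linearly interpolate two per-word probability sequences.

    Each entry is an (assumed) probability that a filled pause follows
    the word at that index. `lam` weights the RNN model.
    """
    if len(p_rnn) != len(p_ngram):
        raise ValueError("both models must score the same number of words")
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_rnn, p_ngram)]


def predict_slots(probs: List[float], threshold: float = 0.5) -> Set[int]:
    """Return the word indices after which a filled pause is inserted."""
    return {i for i, p in enumerate(probs) if p >= threshold}


def f_score(predicted: Set[int], reference: Set[int]) -> float:
    """Balanced F-score (F1) of predicted insertion points against a reference."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores for the sentence "so I think we should go".
p_rnn = [0.8, 0.1, 0.6, 0.2, 0.1, 0.0]
p_ngram = [0.6, 0.3, 0.2, 0.1, 0.1, 0.0]

combined = interpolate(p_rnn, p_ngram, lam=0.5)
predicted = predict_slots(combined, threshold=0.5)
reference = {0, 2}  # hypothetical human insertion points: after "so" and "think"

print(predicted, round(f_score(predicted, reference), 3))
```

In this toy case only the slot after "so" passes the threshold, so precision is 1.0 and recall is 0.5. Lowering the threshold or shifting `lam` towards the RNN would also recover the slot after "think", which is the kind of trade-off an F-score comparison between systems captures.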
- Subjects :
- Conversational speech
Computer science
Speech recognition
Speech synthesis
Support vector machine
Consistency (database systems)
Recurrent neural network
Perception
Language model
Artificial intelligence
Hidden Markov model
Natural language processing
- Database :
- OpenAIRE
- Journal :
- Interspeech 2014
- Accession number :
- edsair.doi.dedup.....2b83da553a5dfce6b44d7b5895a5abb2
- Full Text :
- https://doi.org/10.21437/interspeech.2014-11