Perception of Synthetic and Natural Speech by Adults with Visual Impairments
- Source :
- Scopus-Elsevier
- Publication Year :
- 2009
- Publisher :
- SAGE Publications, 2009.
Abstract
- This study investigated the intelligibility and comprehensibility of natural speech in comparison to synthetic speech. The results demonstrate the type of errors; the relationship between intelligibility and comprehensibility; and the correlation of intelligibility and comprehensibility with key factors, such as the frequency of use of text-to-speech systems.
Recent technological progress in electronic augmentative communication devices has expanded the possibilities for communication for people who previously faced severe communication difficulties, such as individuals with visual impairments. As Mirenda and Beukelman (1990) noted, the most common method used to generate synthetic speech in modern communication devices is the use of text-to-speech (TTS) systems. TTS systems use a flexible mathematical algorithm that represents rules for combining acoustic properties and rules for pronunciation (Mirenda & Beukelman, 1990). The perception of synthetic speech is usually discussed with regard to intelligibility and comprehension (Koul, 2003). Intelligibility is the listener's ability to recognize phonemes and words when they are presented in isolation (Ralston, Pisoni, & Mullennix, 1989), whereas comprehension involves the extraction of the underlying meaning from the acoustic signals of speech (Duffy & Pisoni, 1992). The comprehension of synthetic speech involves recognizing the stimuli presented and then performing higher-level processing to obtain meaning. The term discourse comprehension refers to an automatic process that is used to encode and integrate passages, explanations, and conversations (Higginbotham, Drazek, Kowarsky, Scally, & Segal, 1994).
Despite the substantial data on the intelligibility and comprehension of synthetic speech systems by people with no disabilities (see, for example, Koul, 2003; Koul & Hanners, 1997; Mirenda & Beukelman, 1987, 1990), there has been limited research on the intelligibility and comprehension of synthetic speech systems by people with visual impairments (see, for example, Hensil & Whittaker, 2000). Numerous studies have found that natural speech is significantly more intelligible than that produced by TTS synthesis systems (Clark, 1983; Greene, Logan, & Pisoni, 1986; Hoover, Reichle, VanTasell, & Cole, 1987; Kangas & Allen, 1990; Koul & Allen, 1993; Logan, Greene, & Pisoni, 1989; Mirenda & Beukelman, 1987, 1990; Mitchell & Atkins, 1989; Ralston, Pisoni, Lively, Greene, & Mullennix, 1991). For example, the percentage of intelligibility for high-quality synthesizers, such as DECtalk (a high-quality form of synthetic speech manufactured by Digital Equipment Corporation) in a single-word intelligibility task ranged from 81.7% correct with an open-response format (Mirenda & Beukelman, 1987) to 96.7% correct with a closed-response format (Greene, Manous, & Pisoni, 1984, cited in Koul & Hanners, 1997). In contrast, word-intelligibility scores for natural speech ranged from 97.2% correct with an open-response format to 99% correct with a closed-response format (Logan et al., 1989). Moreover, a review of relevant research on the perception of sentences produced by synthetic speech revealed a differentiated pattern of results, depending on the type of sentence spoken. According to Mirenda and Beukelman (1987), accuracy scores ranged from 96.7% for sentences presented via the DECtalk synthesizer to 99.3% for meaningful sentences presented via natural speech. However, for anomalous sentences, accuracy scores ranged from 78.7% for synthetic speech to 97.7% for natural speech (Pisoni & Hunnicutt, 1980, cited in Koul, 2003). 
Thus, similar to the trend for words, there is remarkably greater intelligibility for natural speech than for TTS speech for sentences. Furthermore, comprehension of sentences and narratives has been found to be slower and less accurate when materials are presented in synthetic rather than in natural speech (Higginbotham et al. …
- Subjects :
- 030506 rehabilitation
media_common.quotation_subject
Audio equipment
Speech recognition
05 social sciences
Rehabilitation
Pronunciation
Intelligibility (communication)
Linguistics
Comprehension
03 medical and health sciences
Ophthalmology
Augmentative and alternative communication
Perception
0501 psychology and cognitive sciences
0305 other medical science
Psychology
050107 human factors
Augmentative
Sentence
media_common
Details
- ISSN :
- 1559-1476 and 0145-482X
- Volume :
- 103
- Database :
- OpenAIRE
- Journal :
- Journal of Visual Impairment & Blindness
- Accession number :
- edsair.doi.dedup.....0ec3444aa43e3178ff577e1759aed6fa