Foreign Accent Conversion Through Concatenative Synthesis in the Articulatory Domain
- Source :
- IEEE Transactions on Audio, Speech, and Language Processing. 20:2301-2312
- Publication Year :
- 2012
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2012.
- Abstract :
- We propose a concatenative synthesis approach to the problem of foreign accent conversion. The approach consists of replacing the most accented portions of nonnative speech with alternative segments from a corpus of the speaker's own speech based on their similarity to those from a reference native speaker. We propose and compare two approaches for selecting units, one based on acoustic similarity [e.g., mel frequency cepstral coefficients (MFCCs)] and a second one based on articulatory similarity, as measured through electromagnetic articulography (EMA). Our hypothesis is that articulatory features provide a better metric for linguistic similarity across speakers than acoustic features. To test this hypothesis, we recorded an articulatory-acoustic corpus from a native and a nonnative speaker, and evaluated the two speech representations (acoustic versus articulatory) through a series of perceptual experiments. Formal listening tests indicate that the approach can achieve a 20% reduction in perceived accent, but also reveal a strong coupling between accent and speaker identity. To address this issue, we disguised original and resynthesized utterances by altering their average pitch and normalizing vocal tract length. An additional listening experiment supports the hypothesis that articulatory features are less speaker dependent than acoustic features.
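The unit-selection idea in the abstract can be illustrated with a minimal sketch. It assumes each speech segment is already summarized as a fixed-length feature vector (MFCCs in the acoustic case, EMA sensor coordinates in the articulatory case) and that similarity is plain Euclidean distance. The abstract does not state the paper's actual cost function, cross-speaker normalization, or concatenation costs, so the function names and toy data below are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_units(native_reference, learner_corpus):
    """For each native reference vector, return the index of the closest
    unit in the nonnative speaker's own corpus.

    Assumption: nearest-neighbor selection stands in for whatever
    selection criterion the paper actually uses.
    """
    selected = []
    for ref in native_reference:
        best = min(
            range(len(learner_corpus)),
            key=lambda i: euclidean(ref, learner_corpus[i]),
        )
        selected.append(best)
    return selected


# Toy 2-D features (hypothetical values, not data from the paper).
native_reference = [[0.0, 1.0], [2.0, 2.0]]
learner_corpus = [[0.1, 0.9], [5.0, 5.0], [1.8, 2.1]]

print(select_units(native_reference, learner_corpus))  # prints [0, 2]
```

The same function applies to either representation; only the feature vectors change, which mirrors the paper's comparison of acoustic versus articulatory similarity.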
- Subjects :
- Acoustics and Ultrasonics
Computer science
Speech recognition
Speaker recognition
Speaker diarisation
Similarity (network science)
Stress (linguistics)
Active listening
Artificial intelligence
Mel-frequency cepstrum
Electrical and Electronic Engineering
Concatenative synthesis
Natural language processing
Vocal tract
Details
- ISSN :
- 1558-7924 and 1558-7916
- Volume :
- 20
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Audio, Speech, and Language Processing
- Accession number :
- edsair.doi...........2223887e827943f46539cb4ebdfed7a2