1. Restricted Boltzmann Machine-Based Voice Conversion for Nonparallel Corpus
- Author
- Ki-Seung Lee
- Subjects
Restricted Boltzmann machine, Training set, Computer science, Applied Mathematics, Speech recognition, Feature extraction, Networking & telecommunications, Pattern recognition, Speech corpus, Probability density function, Engineering and technology, Conditional probability distribution, Speech-language pathology & audiology, Medical and health sciences, Distribution (mathematics), Signal Processing, Electrical engineering, electronic engineering, information engineering, Artificial intelligence, Electrical and Electronic Engineering, Other medical science
- Abstract
A large parallel training corpus is necessary for robust, high-quality voice conversion, but such parallel data may not always be available. This letter presents a new voice conversion method that needs no parallel speech corpus and adopts a restricted Boltzmann machine (RBM) to represent the distribution of the spectral features derived from a target speaker. A linear transformation is employed to convert the spectral and delta features. The conversion function is obtained by maximizing the conditional probability density function with respect to the target RBM. A feasibility test was carried out on the OGI VOICES corpus. Both the subjective listening tests and the objective evaluation showed that the proposed method outperforms the conventional GMM-based method.
- Published
- 2017