1. Sequential voice conversion using grid-based approximation
- Author
- Hadas Benisty, David Malah, and Koby Crammer
- Subjects
- Sequential estimation, Computer science, Gaussian, Bayesian probability, Process (computing), Pattern recognition, Scale (descriptive set theory), Grid, Mel-frequency cepstrum, Artificial intelligence, Block (data storage)
- Abstract
- Common voice conversion methods are based on Gaussian Mixture Modeling (GMM), which requires exhaustive training (typically lasting hours), often leading to ill-conditioning if the dataset used is too small. We propose a new conversion method that is trained in seconds, using either small or large scale datasets. The proposed Grid-Based (GB) method is based on sequential Bayesian tracking, by which the conversion process is expressed as a sequential estimation problem of tracking the target spectrum based on the observed source spectrum. The converted MFCC vectors are sequentially evaluated using a weighted sum of the target training set used as grid-points. To improve the perceived quality of the synthesized signals, we use a post-processing block for enhancing the global variance. Objective and subjective evaluations show that the enhanced-GB method is comparable to classic GMM-based methods, in terms of quality, and comparable to their enhanced versions, in terms of individuality.
- Published
- 2014