1. Comparison of Cepstral Normalization Techniques in Whispered Speech Recognition
- Author
GROZDIC, D., JOVICIC, S., SUMARAC PAVLOVIC, D., GALIC, J., and MARKOVIC, B.
- Subjects
automatic speech recognition, cepstral analysis, hidden Markov models, speech analysis, whisper, Electrical engineering. Electronics. Nuclear engineering, TK1-9971, Computer engineering. Computer hardware, TK7885-7895
- Abstract
This article presents an analysis of different cepstral normalization techniques in automatic recognition of whispered and bimodal speech (speech+whisper). In these experiments, a conventional GMM-HMM speech recognizer was used as a speaker-dependent automatic speech recognition system with the special Whi-Spe corpus, which contains utterance recordings in normally phonated speech and whisper. The following normalization techniques were tested and compared: CMN (Cepstral Mean Normalization), CVN (Cepstral Variance Normalization), MVN (Cepstral Mean and Variance Normalization), CGN (Cepstral Gain Normalization), and quantile-based dynamic normalization techniques such as QCN and QCN-RASTA. The experimental results show to what extent each of these cepstral normalization techniques can improve whisper recognition accuracy in a mismatched train/test scenario. The best result is obtained using CMN in combination with inverse filtering, which provides an average 39.9 percent improvement in whisper recognition accuracy for all tested speakers.
- Published
- 2017
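For readers unfamiliar with the three simplest techniques named in the abstract, the following is a minimal sketch of CMN, CVN, and MVN applied per utterance. It is not taken from the article: the function names, the `eps` guard, and the assumed feature layout (a NumPy array of shape frames x coefficients) are illustrative assumptions, and the article's own formulations, including CGN, QCN, QCN-RASTA, and inverse filtering, may differ.

```python
import numpy as np


def cmn(feats):
    """Cepstral mean normalization: subtract each coefficient's mean over frames.

    `feats` is assumed to have shape (num_frames, num_coefficients).
    """
    return feats - feats.mean(axis=0, keepdims=True)


def cvn(feats, eps=1e-8):
    """Cepstral variance normalization: scale each coefficient to unit variance.

    One common formulation; the mean is left untouched here.
    `eps` is an illustrative guard against division by zero.
    """
    return feats / (feats.std(axis=0, keepdims=True) + eps)


def mvn(feats, eps=1e-8):
    """Mean and variance normalization: zero mean and unit variance per coefficient."""
    return cmn(feats) / (feats.std(axis=0, keepdims=True) + eps)


if __name__ == "__main__":
    # Synthetic stand-in for one utterance: 200 frames of 13 cepstral coefficients.
    rng = np.random.default_rng(0)
    utterance = rng.normal(loc=3.0, scale=2.0, size=(200, 13))
    normalized = mvn(utterance)
    print(normalized.mean(axis=0).round(6))
    print(normalized.std(axis=0).round(6))
```

The statistics are computed over a whole utterance, which is the usual offline setting; a streaming recognizer would instead need running estimates of the mean and variance.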