1. Optimizing Multilingual Knowledge Transfer for Time-Delay Neural Networks with Low-Rank Factorization
- Author
Man-Hung Siu, William Hartmann, Francis Keith, Jeff Z. Ma, and Owen Kimball
- Subjects
Artificial neural network, Time delay neural network, Computer science, Acoustic model, Initialization, Mutual information, Machine learning, Data modeling, Reduction (complexity), Artificial intelligence, Knowledge transfer
- Abstract
When building speech-to-text (STT) systems for a low-resource language, it is often beneficial to use knowledge obtained from a significantly larger multilingual dataset. We have seen benefits from using a multilingual time-delay neural network (TDNN) as the initialization for training an acoustic model on a target low-resource language. In this work, we expand upon recent research that found benefits from applying sequential low-rank factorization (LRF) by extending it to a TDNN acoustic model trained on a large multilingual corpus. We also examine and optimize the knowledge transfer methodology, with the goal of avoiding the loss of useful information from the multilingual initialization during the knowledge transfer process. Our approach limits the updates to the multilingual network parameters during lattice-free maximum mutual information (LF-MMI) training on the target low-resource language by fixing the multilingual network parameters and optimizing only the target output layer. The multilingual parameters and the new output layer are then jointly optimized using the state-level minimum Bayes risk (sMBR) objective function. By combining sequential LRF with this optimization method, we show an average absolute word error rate (WER) reduction of 1.2% across low-resource target languages, a better result than our previous best approach.
- Published
- 2018