Generalizing I-Vector Estimation for Rapid Speaker Recognition
- Source :
- IEEE-ACM Transactions on Audio, Speech, and Language Processing; 2018, Vol. 26 Issue: 4 p749-759, 11p
- Publication Year :
- 2018
Abstract
- An i-vector is a compact representation that captures both the speaker and session variabilities rendered in a spoken utterance. Over the past years, it has prevailed over other techniques and is now the de facto representation for text-independent speaker recognition. Standard i-vector extraction requires intense computation at run-time, and reducing that computation would allow effective use of i-vectors in more applications. The cost arises from evaluating the posterior covariance matrix when estimating the i-vector. There have been studies on how to simplify the computation of the posterior covariance matrix, with modest success. In this paper, we propose a novel approach to i-vector extraction that does not need to evaluate the full posterior covariance, thereby speeding up the run-time extraction process. This is achieved by generalizing the i-vector estimation in two ways. First, we introduce the use of occupancy reweighting in conjunction with whitening over the Baum-Welch statistics as part of the preprocessing step. Second, we introduce the so-called subspace-orthogonalizing prior (SOP) to replace the standard Gaussian prior in the i-vector formulation. Experiments conducted on the extended-core task of NIST SRE'10 show that the proposed rapid SOP approach achieves considerable speed-up over the standard i-vector with comparable equal error rates.
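To make the run-time bottleneck named in the abstract concrete, the following is a minimal sketch of the standard i-vector posterior mean as it is commonly written in the general literature, w = (I + Σ_c N_c T_cᵀ Σ_c⁻¹ T_c)⁻¹ Σ_c T_cᵀ Σ_c⁻¹ f_c. It does not implement the SOP method of this paper, whose details are not given in this record. The function names `solve` and `extract_ivector`, the argument layout, and the assumption of diagonal covariances are illustrative choices, not taken from the source.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the largest remaining pivot in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def extract_ivector(
    T: List[Matrix], sigma: List[Vector], N: Vector, F: List[Vector]
) -> Vector:
    """Standard i-vector posterior mean (illustrative, diagonal covariances).

    T[c]     : feature_dim x R block of the total variability matrix
    sigma[c] : diagonal covariance of component c (length feature_dim)
    N[c]     : zeroth-order (occupancy) Baum-Welch statistic
    F[c]     : centered first-order Baum-Welch statistic (length feature_dim)
    """
    R = len(T[0][0])
    # Posterior precision: I + sum_c N_c * T_c^T Sigma_c^{-1} T_c.
    # Building this R x R matrix per utterance and solving with it is the
    # costly step: O(C * feature_dim * R^2) to build, O(R^3) to solve.
    precision = [[1.0 if i == j else 0.0 for j in range(R)] for i in range(R)]
    # Right-hand side: sum_c T_c^T Sigma_c^{-1} f_c.
    rhs = [0.0] * R
    for Tc, sc, nc, fc in zip(T, sigma, N, F):
        for d, row in enumerate(Tc):
            inv_var = 1.0 / sc[d]
            for i in range(R):
                rhs[i] += row[i] * inv_var * fc[d]
                for j in range(R):
                    precision[i][j] += nc * row[i] * inv_var * row[j]
    return solve(precision, rhs)
```

As a toy check under these assumptions, a single component with an identity loading block, unit variances, occupancy 3 and first-order statistic [4, 8] gives a precision of 4I and therefore an i-vector of [1.0, 2.0]. The utterance-dependent occupancies N_c are what force the precision matrix to be rebuilt for every utterance in the standard formulation.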
Details
- Language :
- English
- ISSN :
- 23299290
- Volume :
- 26
- Issue :
- 4
- Database :
- Supplemental Index
- Journal :
- IEEE-ACM Transactions on Audio, Speech, and Language Processing
- Publication Type :
- Periodical
- Accession number :
- ejs45178836
- Full Text :
- https://doi.org/10.1109/TASLP.2018.2793670