
Generalizing I-Vector Estimation for Rapid Speaker Recognition

Authors :
Xu, Longting
Lee, Kong Aik
Li, Haizhou
Yang, Zhen
Source :
IEEE-ACM Transactions on Audio, Speech, and Language Processing; 2018, Vol. 26 Issue: 4 p749-759, 11p
Publication Year :
2018

Abstract

An i-vector is a compact representation that captures both the speaker and session variabilities rendered in a spoken utterance. Over the past years, it has prevailed over other techniques and is now the de facto representation for text-independent speaker recognition. Standard i-vector extraction requires intense computation at run-time. Reducing the computation will allow effective use of i-vectors in more applications. This intense computation arises from evaluating the posterior covariance matrix when estimating the i-vector. There have been studies on how to simplify the computation of the posterior covariance matrix, with modest success. In this paper, we propose a novel approach to i-vector extraction without the need to evaluate the full posterior covariance, thereby speeding up the run-time extraction process. This is achieved by generalizing the i-vector estimation in two ways. First, we introduce the use of occupancy reweighting in conjunction with whitening over the Baum-Welch statistics as part of the preprocessing step. Second, we introduce the so-called subspace-orthogonalizing prior (SOP) to replace the standard Gaussian prior in the i-vector formulation. Experiments conducted on the extended-core task of NIST SRE'10 show that the proposed rapid SOP approach achieves considerable speed-up over the standard i-vector with comparable equal error rates.
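
For context, the run-time cost the abstract refers to can be seen in the textbook form of standard i-vector extraction. The sketch below is not taken from this record and does not show the proposed SOP method. It assumes the usual total-variability model with C Gaussian components, i-vector dimension M, per-component loading matrices T_c, UBM covariances Sigma_c, zero-order (occupancy) statistics N_c, and centered first-order Baum-Welch statistics f~_c. All of these symbols are illustrative assumptions and may differ from the paper's own notation.

```latex
% Standard i-vector posterior (textbook form; notation assumed, not from the paper)
\begin{align}
  \mathbf{L} &= \mathbf{I} + \sum_{c=1}^{C} N_c\,
      \mathbf{T}_c^{\top} \boldsymbol{\Sigma}_c^{-1} \mathbf{T}_c
      && \text{(posterior precision, } M \times M\text{)} \\
  \boldsymbol{\phi} &= \mathbf{L}^{-1} \sum_{c=1}^{C}
      \mathbf{T}_c^{\top} \boldsymbol{\Sigma}_c^{-1} \tilde{\mathbf{f}}_c
      && \text{(i-vector: posterior mean)}
\end{align}
% The posterior covariance is L^{-1}.
```

Because the occupancies N_c change with every utterance, L has to be rebuilt and inverted per utterance. Even with the products T_c^T Sigma_c^{-1} T_c precomputed, this costs roughly O(C M^2) to accumulate plus O(M^3) to invert, which is the bottleneck that approaches avoiding the full posterior covariance aim to remove.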

Details

Language :
English
ISSN :
2329-9290
Volume :
26
Issue :
4
Database :
Supplemental Index
Journal :
IEEE-ACM Transactions on Audio, Speech, and Language Processing
Publication Type :
Periodical
Accession number :
ejs45178836
Full Text :
https://doi.org/10.1109/TASLP.2018.2793670