Back to Search Start Over

Large Margin GMM for discriminative speaker verifi cation

Authors :
Jourani, Reda
Daoudi, Khalid
André-Obrecht, Régine
Aboutajdine, Driss
Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA)
Institut de recherche en informatique de Toulouse (IRIT)
Université Toulouse 1 Capitole (UT1)
Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3)
Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP)
Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1)
Université Fédérale Toulouse Midi-Pyrénées
Geometry and Statistics in acquisition data (GeoStat)
Inria Bordeaux - Sud-Ouest
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)
Université Toulouse III - Paul Sabatier (UT3)
Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT)
University of Mohammed V
Université Toulouse Capitole (UT Capitole)
Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J)
Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3)
Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP)
Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI)
Université Toulouse - Jean Jaurès (UT2J)
Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3)
Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole)
Université de Toulouse (UT)
Université Mohammed V de Rabat [Agdal] (UM5)
Source :
Multimedia Tools and Applications, Multimedia Tools and Applications, Springer Verlag, 2012, Multimedia Tools and Applications, 2012
Publication Year :
2012
Publisher :
HAL CCSD, 2012.

Abstract

International audience; Gaussian mixture models (GMM), trained using the generative cri- terion of maximum likelihood estimation, have been the most popular ap- proach in speaker recognition during the last decades. This approach is also widely used in many other classi cation tasks and applications. Generative learning in not however the optimal way to address classi cation problems. In this paper we rst present a new algorithm for discriminative learning of diagonal GMM under a large margin criterion. This algorithm has the ma- jor advantage of being highly e cient, which allow fast discriminative GMM training using large scale databases. We then evaluate its performances on a full NIST speaker veri cation task using NIST-SRE'2006 data. In particular, we use the popular Symmetrical Factor Analysis (SFA) for session variability compensation. The results show that our system outperforms the state-of-the- art approaches of GMM-SFA and the SVM-based one, GSL-NAP. Relative reductions of the Equal Error Rate of about 9.33% and 14.88% are respec- tively achieved over these systems.

Details

Language :
English
ISSN :
13807501 and 15737721
Database :
OpenAIRE
Journal :
Multimedia Tools and Applications, Multimedia Tools and Applications, Springer Verlag, 2012, Multimedia Tools and Applications, 2012
Accession number :
edsair.dedup.wf.001..bfe6fae5a4ae960a4ef46031a153e394