17 results on '"Jourani, Reda"'
Search Results
2. Discriminative speaker recognition using large margin GMM
- Author
-
Jourani, Reda, Daoudi, Khalid, André-Obrecht, Régine, and Aboutajdine, Driss
- Published
- 2013
3. Speaker Identification Using Discriminative Learning of Large Margin GMM
- Author
-
Daoudi, Khalid, primary, Jourani, Reda, additional, André-Obrecht, Régine, additional, and Aboutajdine, Driss, additional
- Published
- 2011
4. Building An Automatic Speech Recognition System for Home Automation
- Author
-
Aboulkhir, Mohamed, primary, Khoulji, Samira, additional, Jourani, Reda, additional, and Kerkeb, M.L, additional
- Published
- 2017
5. Building a Smart Interactive Kiosk for Tourist Assistance
- Author
-
Amessafi, Hanane, primary, Jourani, Reda, additional, Echchelh, Adil, additional, and Yakhlef, Houssain Oulad, additional
- Published
- 2017
6. Integration of the ASR Toolkit Kaldi Into a Domoticz Home Automation System
- Author
-
Aboulhir, Mohamed, primary, Khoulji, Samira, additional, Jourani, Reda, additional, and Kerkeb, ML, additional
- Published
- 2017
7. Combination of SVM and Large Margin GMM modeling for speaker identification
- Author
-
Jourani, Reda, Daoudi, Khalid, André-Obrecht, Régine, Aboutajdine, Driss, Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT), University of Mohammed V, Geometry and Statistics in acquisition data (GeoStat), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Université Toulouse III - Paul Sabatier (UT3), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI), Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT), and Université Mohammed V de Rabat [Agdal] (UM5)
- Subjects
030507 speech-language pathology & audiology ,03 medical and health sciences ,ComputingMethodologies_PATTERNRECOGNITION ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,0202 electrical engineering, electronic engineering, information engineering ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,02 engineering and technology ,0305 other medical science ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
- Abstract
International audience; Most state-of-the-art speaker recognition systems are partially or completely based on Gaussian mixture models (GMM). GMM have been widely and successfully used in speaker recognition during the last decades. They are traditionally estimated from a world model using the generative criterion of Maximum A Posteriori. In an earlier work, we proposed an efficient algorithm for discriminative learning of GMM with diagonal covariances under a large margin criterion. In this paper, we evaluate the combination of the large margin GMM modeling approach with SVM in the setting of speaker identification. We carry out a full NIST speaker identification task using NIST-SRE'2006 data, in a Symmetrical Factor Analysis compensation scheme. The results show that the two modeling approaches are complementary and that their combination outperforms their single use.
- Published
- 2013
8. Speaker recognition using discriminative learning of Large Margin GMM
- Author
-
Jourani, Reda, Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Université Paul Sabatier - Toulouse III, Régine André-Obrecht, Cotutelle internationale avec l'Université Mohammed V Agdal - Faculté des Sciences de Rabat, and Maroc.
- Subjects
Discriminative learning ,session variability modeling ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,speaker recognition ,reconnaissance du locuteur ,compensation de la variabilité inter-sessions ,maximisation de la marge ,Gaussian mixture models ,Modèles de Mélange de lois Gaussiennes ,Apprentissage discriminant ,large margin training ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
- Abstract
Most state-of-the-art speaker recognition systems are based on Gaussian Mixture Models (GMM), trained using maximum likelihood estimation and maximum a posteriori (MAP) estimation. The generative training of the GMM does not, however, directly optimize the classification performance. For this reason, discriminative models, e.g., Support Vector Machines (SVM), have been an interesting alternative, since they address the classification problem directly and lead to good performance. Recently, a new discriminative approach for multiway classification has been proposed: the Large Margin Gaussian mixture models (LM-GMM). As in SVM, the parameters of LM-GMM are trained by solving a convex optimization problem. However, they differ from SVM by using ellipsoids to model the classes directly in the input space, instead of half-spaces in an extended high-dimensional space. While LM-GMM have been used in speech recognition, they have not been used in speaker recognition (to the best of our knowledge). In this thesis, we propose simplified, fast and more efficient versions of LM-GMM which exploit the properties and characteristics of speaker recognition applications and systems: the LM-dGMM models. In our LM-dGMM modeling, each class is initially modeled by a GMM trained by MAP adaptation of a Universal Background Model (UBM) or directly initialized by the UBM. The models' mean vectors are then re-estimated under some Large Margin constraints. We carried out experiments on full speaker recognition tasks under the NIST-SRE 2006 core condition. The experimental results are very satisfactory and show that our Large Margin modeling approach is very promising.
For several decades, automatic speaker recognition has been the subject of research by many teams around the world. Most current systems are based on Gaussian Mixture Models (GMM) and/or discriminative SVM models, i.e., support vector machines. The general aim of our work is to propose new large margin GMM models for speaker recognition as an alternative to classical generative GMM models and to the state-of-the-art discriminative GMM-SVM approach. We call these models LM-dGMM, for Large Margin diagonal GMM. Our models build on a recent discriminative technique for multi-class separation, which has been applied in speech recognition. Exploiting the properties of the GMM systems used in speaker recognition, we present in this thesis variants of discriminative GMM training algorithms that minimize a large margin loss function. Tests carried out on the speaker recognition tasks of the NIST-SRE 2006 evaluation campaign demonstrate the value of these models for recognition.
- Published
- 2012
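The thesis abstract above says each speaker GMM is first obtained by MAP adaptation of a UBM before its means are re-estimated under large margin constraints. As a rough illustration of that initialization step only, the following sketch applies the commonly used mean-only relevance MAP update to a diagonal-covariance GMM. The function names, the relevance factor of 16, and the toy data are assumptions made here for illustration; they are not taken from the thesis, and the large margin re-estimation itself is not shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal GMM (hypothetical helper).

    frames: list of feature vectors from one speaker.
    weights, means, variances: UBM parameters (one entry per component).
    relevance: assumed relevance factor controlling how far means move.
    Returns the adapted mean vectors; weights and variances stay untouched.
    """
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp                       # soft counts n_k
    sums = [[0.0] * dim for _ in range(n_comp)]   # first-order statistics

    for x in frames:
        logp = [
            math.log(weights[k]) + log_gauss_diag(x, means[k], variances[k])
            for k in range(n_comp)
        ]
        top = max(logp)                           # subtract max for stability
        post = [math.exp(lp - top) for lp in logp]
        total = sum(post)
        for k in range(n_comp):
            gamma = post[k] / total               # component posterior
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]

    adapted = []
    for k in range(n_comp):
        if counts[k] == 0.0:
            adapted.append(list(means[k]))        # no data: keep the UBM mean
            continue
        alpha = counts[k] / (counts[k] + relevance)
        adapted.append([
            alpha * sums[k][d] / counts[k] + (1.0 - alpha) * means[k][d]
            for d in range(dim)
        ])
    return adapted


# Toy usage: a two-component, one-dimensional UBM and 16 identical frames.
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0], [10.0]]
ubm_vars = [[1.0], [1.0]]
speaker_means = map_adapt_means([[1.0]] * 16, ubm_weights, ubm_means, ubm_vars)
```

With a relevance factor equal to the soft count, the first mean moves halfway from the UBM mean toward the data, while the component that sees almost no data stays essentially at its UBM value.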
9. Large Margin GMM for discriminative speaker verification
- Author
-
Jourani, Reda, Daoudi, Khalid, André-Obrecht, Régine, Aboutajdine, Driss, Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Geometry and Statistics in acquisition data (GeoStat), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Université Toulouse III - Paul Sabatier (UT3), Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT), University of Mohammed V, Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI), Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT), and Université Mohammed V de Rabat [Agdal] (UM5)
- Subjects
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
- Abstract
International audience; Gaussian mixture models (GMM), trained using the generative criterion of maximum likelihood estimation, have been the most popular approach in speaker recognition during the last decades. This approach is also widely used in many other classification tasks and applications. Generative learning is not, however, the optimal way to address classification problems. In this paper we first present a new algorithm for discriminative learning of diagonal GMM under a large margin criterion. This algorithm has the major advantage of being highly efficient, which allows fast discriminative GMM training using large scale databases. We then evaluate its performance on a full NIST speaker verification task using NIST-SRE'2006 data. In particular, we use the popular Symmetrical Factor Analysis (SFA) for session variability compensation. The results show that our system outperforms the state-of-the-art approaches of GMM-SFA and the SVM-based one, GSL-NAP. Relative reductions of the Equal Error Rate of about 9.33% and 14.88% are respectively achieved over these systems.
- Published
- 2012
10. Apprentissage discriminant des GMM à grande marge pour la vérification automatique du locuteur
- Author
-
Jourani, Reda, Daoudi, Khalid, André-Obrecht, Régine, Aboutajdine, Driss, Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI), Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT), Geometry and Statistics in acquisition data (GeoStat), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Université Toulouse III - Paul Sabatier (UT3), Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT), Université Mohammed V de Rabat [Agdal] (UM5), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, and University of Mohammed V
- Subjects
ComputingMethodologies_PATTERNRECOGNITION ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
- Abstract
National audience; Gaussian mixture models (GMM) have been widely and successfully used in speaker recognition during the last decades. They are generally trained using the generative criterion of maximum likelihood estimation. In an earlier work, we proposed an algorithm for discriminative training of GMM with diagonal covariances under a large margin criterion. In this paper, we present a new version of this algorithm which has the major advantage of being computationally highly efficient. The resulting algorithm is thus well suited to handle large scale databases. To show the effectiveness of the new algorithm, we carry out a full NIST speaker verification task using NIST-SRE'2006 data. The results show that our system outperforms the baseline GMM, and with high computational efficiency.
- Published
- 2011
11. Large Margin Gaussian mixture models for speaker identification
- Author
-
Jourani, Reda, Daoudi, Khalid, André-Obrecht, Régine, Aboutajdine, Driss, Geometry and Statistics in acquisition data (GeoStat), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI), Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT), Université Toulouse III - Paul Sabatier (UT3), Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT), Université Mohammed V de Rabat [Agdal] (UM5), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, and University of Mohammed V
- Subjects
speaker recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,discriminative learning ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,ComputingMethodologies_PATTERNRECOGNITION ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,large margin learning ,GMM-UBM ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
- Abstract
International audience; Gaussian mixture models (GMM) have been widely and successfully used in speaker recognition during the last decade. However, they are generally trained using the generative criterion of maximum likelihood estimation. In this paper, we propose a simple and efficient discriminative approach to learn GMM with a large margin criterion to solve the classification problem. Our approach is based on a recent work about the Large Margin GMM (LM-GMM) where each class is modeled by a mixture of ellipsoids and which has shown good results in speech recognition. We propose a simplification of the original algorithm and carry out preliminary experiments on a speaker identification task using NIST-SRE'2006 data. We compare the traditional generative GMM approach, the original LM-GMM one and our own version. The results suggest that our algorithm outperforms the two others.
- Published
- 2010
12. Cleaning statistical language models
- Author
-
Jourani, Reda, Langlois, David, Smaïli, Kamel, Daoudi, Khalid, Aboutajdine, Driss, Geometry and Statistics in acquisition data (GeoStat), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Analysis, perception and recognition of speech (PAROLE), INRIA Lorraine, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT), Université Mohammed V de Rabat [Agdal] (UM5), and University of Mohammed V
- Subjects
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
- Abstract
International audience; In this paper, we describe how to decide that an n-gram is actually impossible in a language. We use decision rules on a corpus tagged with POS. These rules are based on statistics and phonological criteria. In terms of statistical language modeling, deciding that an n-gram is impossible leads to assigning it a null probability. We redistribute the released probability mass over the possible n-grams. To do this, we define a new formulation of P(w|h). We apply the principle of impossible events to bigrams. Then we use the list of impossible bigrams to build a list of impossible trigrams. The new trigram model outperforms the baseline model by 5.53% in terms of perplexity.
- Published
- 2010
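The abstract above describes assigning zero probability to impossible n-grams and moving the released mass onto the possible ones, but it does not state the new formulation of P(w|h). The sketch below therefore assumes the simplest option, proportional renormalization within each history, purely to illustrate the idea on bigrams. The function name, data layout, and toy probabilities are hypothetical and not taken from the paper.

```python
def clean_bigram_model(probs, impossible):
    """Zero out impossible bigrams and rescale the remaining ones.

    probs: dict mapping a history h to a dict {word w: P(w|h)}.
    impossible: set of (h, w) pairs judged impossible.
    Returns a new model in which each history keeps its original total mass.
    Proportional rescaling is an assumption, not the paper's formulation.
    """
    cleaned = {}
    for h, dist in probs.items():
        total = sum(dist.values())
        kept_mass = sum(p for w, p in dist.items() if (h, w) not in impossible)
        if kept_mass == 0.0:
            # Every continuation was flagged: leave this history unchanged
            # rather than produce an all-zero distribution.
            cleaned[h] = dict(dist)
            continue
        scale = total / kept_mass
        cleaned[h] = {
            w: 0.0 if (h, w) in impossible else p * scale
            for w, p in dist.items()
        }
    return cleaned


# Toy usage: the bigram ("the", "the") is declared impossible.
model = {"the": {"cat": 0.5, "dog": 0.3, "the": 0.2}}
cleaned = clean_bigram_model(model, {("the", "the")})
```

In this toy case the 0.2 of mass freed from the impossible bigram is shared between the two remaining words in proportion to their original probabilities.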
13. Reconnaissance automatique du locuteur par des GMM à grande marge
- Author
-
Jourani, Reda
- Abstract
For several decades, automatic speaker recognition has been the subject of research by many teams around the world. Most current systems are based on Gaussian Mixture Models (GMM) and/or discriminative SVM models, i.e., support vector machines. The general aim of our work is to propose new large margin GMM models for speaker recognition as an alternative to classical generative GMM models and to the state-of-the-art discriminative GMM-SVM approach. We call these models LM-dGMM, for Large Margin diagonal GMM. Our models build on a recent discriminative technique for multi-class separation, which has been applied in speech recognition. Exploiting the properties of the GMM systems used in speaker recognition, we present in this thesis variants of discriminative GMM training algorithms that minimize a large margin loss function. Tests carried out on the speaker recognition tasks of the NIST-SRE 2006 evaluation campaign demonstrate the value of these models for recognition.
Most state-of-the-art speaker recognition systems are based on Gaussian Mixture Models (GMM), trained using maximum likelihood estimation and maximum a posteriori (MAP) estimation. The generative training of the GMM does not, however, directly optimize the classification performance. For this reason, discriminative models, e.g., Support Vector Machines (SVM), have been an interesting alternative, since they address the classification problem directly and lead to good performance. Recently, a new discriminative approach for multiway classification has been proposed: the Large Margin Gaussian mixture models (LM-GMM). As in SVM, the parameters of LM-GMM are trained by solving a convex optimization problem. However, they differ from SVM by using ellipsoids to model the classes directly in the input space, instead of half-spaces in an extended high-dimensional space. While LM-GMM have been used in speech recognition, they have not been used in speaker recognition (to the best of our knowledge). In this thesis, we propose simplified, fast and more efficient versions of LM-GMM which exploit the properties and characteristics of speaker recognition applications and systems: the LM-dGMM models. In our LM-dGMM modeling, each class is initially modeled by a GMM trained by MAP adaptation of a Universal Background Model (UBM) or directly initialized by the UBM. The models' mean vectors are then re-estimated under some Large Margin constraints. We carried out experiments on full speaker recognition tasks under the NIST-SRE 2006 core condition. The experimental results are very satisfactory and show that our Large Margin modeling approach is very promising.
- Published
- 2012
14. Discriminative speaker recognition using large margin GMM
- Author
-
Jourani, Reda, primary, Daoudi, Khalid, additional, André-Obrecht, Régine, additional, and Aboutajdine, Driss, additional
- Published
- 2012
15. Fast training of Large Margin diagonal Gaussian mixture models for speaker identification
- Author
-
Jourani, Reda, primary, Daoudi, Khalid, additional, André-Obrecht, Régine, additional, and Aboutajdine, Driss, additional
- Published
- 2011
16. Speaker verification using large margin GMM discriminative training
- Author
-
Jourani, Reda, primary, Daoudi, Khalid, additional, André-Obrecht, Régine, additional, and Aboutajdine, Driss, additional
- Published
- 2011
17. Large margin Gaussian mixture models for speaker identification
- Author
-
Jourani, Reda, primary, Daoudi, Khalid, additional, André-Obrecht, Régine, additional, and Aboutajdine, Driss, additional
- Published
- 2010