Start Over

Sketching for Large-Scale Learning of Mixture Models

Authors :: Nicolas Keriven
Anthony Bourrier
Rémi Gribonval
Patrick Pérez
Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio (PANAMA)
Inria Rennes – Bretagne Atlantique
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE (IRISA-D5)
Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA)
CentraleSupélec-Télécom Bretagne-Université de Rennes 1 (UR1)
Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Télécom Bretagne-Université de Rennes 1 (UR1)
Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA)
Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)
Technicolor [Cesson Sévigné]
Technicolor
European Project: 277906,EC:FP7:ERC,ERC-2011-StG_20101014,PLEASE(2012)
Université de Rennes 1 (UR1)
Université de Rennes (UNIV-RENNES)
Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1)
Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA)
Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)
GIPSA - Vision and Brain Signal Processing (GIPSA-VIBS)
Département Images et Signal (GIPSA-DIS)
Grenoble Images Parole Signal Automatique (GIPSA-lab )
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Grenoble Images Parole Signal Automatique (GIPSA-lab )
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)
Université de Rennes (UR)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE (IRISA-D5)
Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes 1 (UR1)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Inria Rennes – Bretagne Atlantique
Institut National de Recherche en Informatique et en Automatique (Inria)
Keriven, Nicolas
PLEASE: Projections, Learning, and Sparsity for Efficient data-processing - PLEASE - - EC:FP7:ERC2012-01-01 - 2016-12-31 - 277906 - VALID
Source :: Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Mar 2016, Shanghai, China, Information and Inference, Information and Inference, Oxford University Press (OUP), 2018, 7 (3), pp.447-508. ⟨10.1093/imaiai/iax015⟩, Information and Inference, 2018, 7 (3), pp.447-508. ⟨10.1093/imaiai/iax015⟩, ICASSP, HAL
Publication Year :: 2016
Publisher :: HAL CCSD, 2016.
Abstract: to appear in Information and Inference, a journal of the IMA (available online since December 2017); International audience; Learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements. We propose a " compressive learning " framework where we estimate model parameters from a sketch of the training data. This sketch is a collection of generalized moments of the underlying probability distribution of the data. It can be computed in a single pass on the training set, and is easily computable on streams or distributed datasets. The proposed framework shares similarities with compressive sensing, which aims at drastically reducing the dimension of high-dimensional signals while preserving the ability to reconstruct them. To perform the estimation task, we derive an iterative algorithm analogous to sparse reconstruction algorithms in the context of linear inverse problems. We exemplify our framework with the compressive estimation of a Gaussian Mixture Model (GMM), providing heuristics on the choice of the sketching procedure and theoretical guarantees of reconstruction. We experimentally show on synthetic data that the proposed algorithm yields results comparable to the classical Expectation-Maximization (EM) technique while requiring significantly less memory and fewer computations when the number of database elements is large. We further demonstrate the potential of the approach on real large-scale data (over 10 8 training samples) for the task of model-based speaker verification. Finally, we draw some connections between the proposed framework and approximate Hilbert space embedding of probability distributions using random features. We show that the proposed sketching operator can be seen as an innovative method to design translation-invariant kernels adapted to the analysis of GMMs. We also use this theoretical framework to derive information preservation guarantees, in the spirit of infinite-dimensional compressive sensing.

Subjects :: Statistics and Probability
FOS: Computer and information sciences
[MATH.MATH-PR] Mathematics [math]/Probability [math.PR]
Scale (ratio)
[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing
Iterative method
Computer science
compressive sensing
Context (language use)
Machine Learning (stat.ML)
02 engineering and technology
010501 environmental sciences
01 natural sciences
Synthetic data
Machine Learning (cs.LG)
010104 statistics & probability
Dimension (vector space)
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
Compressed Sensing
[STAT.ML]Statistics [stat]/Machine Learning [stat.ML]
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
Statistics - Machine Learning
0202 electrical engineering, electronic engineering, information engineering
0101 mathematics
0105 earth and related environmental sciences
Numerical Analysis
compressed learning
database sketch
business.industry
Applied Mathematics
020206 networking & telecommunications
Pattern recognition
gaussian mixture models
Mixture model
[STAT.ML] Statistics [stat]/Machine Learning [stat.ML]
Sketch
[MATH.MATH-PR]Mathematics [math]/Probability [math.PR]
Computer Science - Learning
Compressed sensing
Kernel method
Computational Theory and Mathematics
compressive learning
Probability distribution
Artificial intelligence
Heuristics
business
Gaussian mixture
Algorithm
Analysis

Details

Language :: English
ISSN :: 20498764 and 20498772
Database :: OpenAIRE
Journal :: Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Mar 2016, Shanghai, China, Information and Inference, Information and Inference, Oxford University Press (OUP), 2018, 7 (3), pp.447-508. ⟨10.1093/imaiai/iax015⟩, Information and Inference, 2018, 7 (3), pp.447-508. ⟨10.1093/imaiai/iax015⟩, ICASSP, HAL
Accession number :: edsair.doi.dedup.....2ce77f6a1b96b9706b4a95e4660f0d78
Full Text :: https://doi.org/10.1093/imaiai/iax015⟩