Author: "Genevay, Aude" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Genevay, Aude"' showing total 16 results


1. Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark

Author: Korotin, Alexander, Li, Lingxiao, Genevay, Aude, Solomon, Justin, Filippov, Alexander, and Burnaev, Evgeny
Subjects: Computer Science - Machine Learning
Abstract: Despite the recent popularity of neural network-based solvers for optimal transport (OT), there is no standard quantitative way to evaluate their performance. In this paper, we address this issue for quadratic-cost transport -- specifically, computation of the Wasserstein-2 distance, a commonly-used formulation of optimal transport in machine learning. To overcome the challenge of computing ground truth transport maps between continuous measures needed to assess these solvers, we use input-convex neural networks (ICNN) to construct pairs of measures whose ground truth OT maps can be obtained analytically. This strategy yields pairs of continuous benchmark measures in high-dimensional spaces such as spaces of images. We thoroughly evaluate existing optimal transport solvers using these benchmark measures. Even though these solvers perform well in downstream tasks, many do not faithfully recover optimal transport maps. To investigate the cause of this discrepancy, we further test the solvers in a setting of image generation. Our study reveals crucial limitations of existing solvers and shows that increased OT accuracy does not necessarily correlate to better results downstream.
Published: 2021

2. Large-Scale Wasserstein Gradient Flows

Author: Mokrov, Petr, Korotin, Alexander, Li, Lingxiao, Genevay, Aude, Solomon, Justin, and Burnaev, Evgeny
Subjects: Computer Science - Machine Learning
Abstract: Wasserstein gradient flows provide a powerful means of understanding and solving many diffusion equations. Specifically, Fokker-Planck equations, which model the diffusion of probability measures, can be understood as gradient descent over entropy functionals in Wasserstein space. This equivalence, introduced by Jordan, Kinderlehrer and Otto, inspired the so-called JKO scheme to approximate these diffusion processes via an implicit discretization of the gradient flow in Wasserstein space. Solving the optimization problem associated to each JKO step, however, presents serious computational challenges. We introduce a scalable method to approximate Wasserstein gradient flows, targeted to machine learning applications. Our approach relies on input-convex neural networks (ICNNs) to discretize the JKO steps, which can be optimized by stochastic gradient descent. Unlike previous work, our method does not require domain discretization or particle simulation. As a result, we can sample from the measure at each time step of the diffusion and compute its probability density. We demonstrate our algorithm's performance by computing diffusions following the Fokker-Planck equation and apply it to unnormalized density sampling as well as nonlinear filtering.
Published: 2021

3. Improving Approximate Optimal Transport Distances using Quantization

Author: Beugnot, Gaspard, Genevay, Aude, Greenewald, Kristjan, and Solomon, Justin
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Optimal transport (OT) is a popular tool in machine learning to compare probability measures geometrically, but it comes with substantial computational burden. Linear programming algorithms for computing OT distances scale cubically in the size of the input, making OT impractical in the large-sample regime. We introduce a practical algorithm, which relies on a quantization step, to estimate OT distances between measures given cheap sample access. We also provide a variant of our algorithm to improve the performance of approximate solvers, focusing on those for entropy-regularized transport. We give theoretical guarantees on the benefits of this quantization step and display experiments showing that it behaves well in practice, providing a practical approximation algorithm that can be used as a drop-in replacement for existing OT estimators.
Comment: Published in the proceedings of the Conference on Uncertainty in Artificial Intelligence 2021 (UAI)
Published: 2021

4. Continuous Regularized Wasserstein Barycenters

Author: Li, Lingxiao, Genevay, Aude, Yurochkin, Mikhail, and Solomon, Justin
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Wasserstein barycenters provide a geometrically meaningful way to aggregate probability distributions, built on the theory of optimal transport. They are difficult to compute in practice, however, leading previous work to restrict their supports to finite sets of points. Leveraging a new dual formulation for the regularized Wasserstein barycenter problem, we introduce a stochastic algorithm that constructs a continuous approximation of the barycenter. We establish strong duality and use the corresponding primal-dual relationship to parametrize the barycenter implicitly using the dual potentials of regularized transport problems. The resulting problem can be solved with stochastic gradient descent, which yields an efficient online algorithm to approximate the barycenter of continuous distributions given sample access. We demonstrate the effectiveness of our approach and compare against previous work on synthetic examples and real-world applications.
Published: 2020

5. Differentiable Deep Clustering with Cluster Size Constraints

Author: Genevay, Aude, Dulac-Arnold, Gabriel, and Vert, Jean-Philippe
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Clustering is a fundamental unsupervised learning approach. Many clustering algorithms -- such as $k$-means -- rely on the Euclidean distance as a similarity measure, which is often not the most relevant metric for high dimensional data such as images. Learning a lower-dimensional embedding that can better reflect the geometry of the dataset is therefore instrumental for performance. We propose a new approach for this task where the embedding is performed by a differentiable model such as a deep neural network. By rewriting the $k$-means clustering algorithm as an optimal transport task, and adding an entropic regularization, we derive a fully differentiable loss function that can be minimized with respect to both the embedding parameters and the cluster parameters via stochastic gradient descent. We show that this new formulation generalizes a recently proposed state-of-the-art method based on soft-$k$-means by adding constraints on the cluster sizes. Empirical evaluations on image classification benchmarks suggest that compared to state-of-the-art methods, our optimal transport-based approach provides better unsupervised accuracy and does not require a pre-training phase.
Published: 2019

6. Sample Complexity of Sinkhorn divergences

Author: Genevay, Aude, Chizat, Lénaic, Bach, Francis, Cuturi, Marco, and Peyré, Gabriel
Subjects: Mathematics - Statistics Theory
Abstract: Optimal transport (OT) and maximum mean discrepancies (MMD) are now routinely used in machine learning to compare probability measures. We focus in this paper on \emph{Sinkhorn divergences} (SDs), a regularized variant of OT distances which can interpolate, depending on the regularization strength $\varepsilon$, between OT ($\varepsilon=0$) and MMD ($\varepsilon=\infty$). Although the tradeoff induced by that regularization is now well understood computationally (OT, SDs and MMD require respectively $O(n^3\log n)$, $O(n^2)$ and $n^2$ operations given a sample size $n$), much less is known in terms of their \emph{sample complexity}, namely the gap between these quantities, when evaluated using finite samples \emph{vs.} their respective densities. Indeed, while the sample complexity of OT and MMD stand at two extremes, $1/n^{1/d}$ for OT in dimension $d$ and $1/\sqrt{n}$ for MMD, that for SDs has only been studied empirically. In this paper, we \emph{(i)} derive a bound on the approximation error made with SDs when approximating OT as a function of the regularizer $\varepsilon$, \emph{(ii)} prove that the optimizers of regularized OT are bounded in a Sobolev (RKHS) ball independent of the two measures and \emph{(iii)} provide the first sample complexity bound for SDs, obtained by reformulating SDs as a maximization problem in a RKHS. We thus obtain a scaling in $1/\sqrt{n}$ (as in MMD), with a constant that depends however on $\varepsilon$, making the bridge between OT and MMD complete.
Published: 2018

7. Wasserstein Measure Coresets

Author: Claici, Sebastian, Genevay, Aude, and Solomon, Justin
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: The proliferation of large data sets and Bayesian inference techniques motivates demand for better data sparsification. Coresets provide a principled way of summarizing a large dataset via a smaller one that is guaranteed to match the performance of the full data set on specific problems. Classical coresets, however, neglect the underlying data distribution, which is often continuous. We address this oversight by introducing Wasserstein measure coresets, an extension of coresets which by definition takes into account generalization. Our formulation of the problem, which essentially consists in minimizing the Wasserstein distance, is solvable via stochastic gradient descent. This yields an algorithm which simply requires sample access to the data distribution and is able to handle large data streams in an online manner. We validate our construction for inference and clustering.
Published: 2018

8. GAN and VAE from an Optimal Transport Point of View

Author: Genevay, Aude, Peyré, Gabriel, and Cuturi, Marco
Subjects: Statistics - Machine Learning
Abstract: This short article revisits some of the ideas introduced in arXiv:1701.07875 and arXiv:1705.07642 in a simple setup. This sheds some light on the connections between Variational Autoencoders (VAE), Generative Adversarial Networks (GAN) and Minimum Kantorovitch Estimators (MKE).
Published: 2017

9. Learning Generative Models with Sinkhorn Divergences

Author: Genevay, Aude, Peyré, Gabriel, and Cuturi, Marco
Subjects: Statistics - Machine Learning
Abstract: The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, (iii) the difficulty to estimate robustly these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing to find a sweet spot leveraging the geometry of OT and the favorable high-dimensional sample complexity of MMD which comes with unbiased gradient estimates. The resulting computational architecture complements nicely standard deep network generative models by a stack of extra layers implementing the loss function.
Published: 2017

10. The Negotiation Dialogue Game

Author: Laroche, Romain, Genevay, Aude, Jokinen, Kristiina, editor, and Wilcock, Graham, editor
Published: 2017
Full Text: View/download PDF

11. The Negotiation Dialogue Game

Author: Laroche, Romain, primary and Genevay, Aude, additional
Published: 2016
Full Text: View/download PDF

12. Sample Complexity of Sinkhorn divergences

Author: Genevay, Aude, Chizat, Lénaic, Bach, Francis, Cuturi, Marco, Peyré, Gabriel, CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Méthodes numériques pour le problème de Monge-Kantorovich et Applications en sciences sociales (MOKAPLAN), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Statistical Machine Learning and Parsimony (SIERRA), Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria), Graduate School of Informatics [Kyoto], Kyoto University, Département de Mathématiques et Applications - ENS Paris (DMA), Kamalika Chaudhuri, Masashi Sugiyama, Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Département d'informatique de l'École normale supérieure (DI-ENS), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), Kyoto University [Kyoto], Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Paris (ENS Paris), and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Paris (ENS Paris)
Subjects: [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, FOS: Mathematics, [MATH.MATH-IT]Mathematics [math]/Information Theory [math.IT], Mathematics - Statistics Theory, Statistics Theory (math.ST), [MATH.MATH-OC]Mathematics [math]/Optimization and Control [math.OC], [MATH.MATH-NA]Mathematics [math]/Numerical Analysis [math.NA]
Abstract: Optimal transport (OT) and maximum mean discrepancies (MMD) are now routinely used in machine learning to compare probability measures. We focus in this paper on \emph{Sinkhorn divergences} (SDs), a regularized variant of OT distances which can interpolate, depending on the regularization strength $\varepsilon$, between OT ($\varepsilon=0$) and MMD ($\varepsilon=\infty$). Although the tradeoff induced by that regularization is now well understood computationally (OT, SDs and MMD require respectively $O(n^3\log n)$, $O(n^2)$ and $n^2$ operations given a sample size $n$), much less is known in terms of their \emph{sample complexity}, namely the gap between these quantities, when evaluated using finite samples \emph{vs.} their respective densities. Indeed, while the sample complexity of OT and MMD stand at two extremes, $1/n^{1/d}$ for OT in dimension $d$ and $1/\sqrt{n}$ for MMD, that for SDs has only been studied empirically. In this paper, we \emph{(i)} derive a bound on the approximation error made with SDs when approximating OT as a function of the regularizer $\varepsilon$, \emph{(ii)} prove that the optimizers of regularized OT are bounded in a Sobolev (RKHS) ball independent of the two measures and \emph{(iii)} provide the first sample complexity bound for SDs, obtained by reformulating SDs as a maximization problem in a RKHS. We thus obtain a scaling in $1/\sqrt{n}$ (as in MMD), with a constant that depends however on $\varepsilon$, making the bridge between OT and MMD complete.
Published: 2019

13. Entropy-regularized Optimal Transport for Machine Learning

Author: Genevay, Aude, CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Université Paris sciences et lettres, and Gabriel Peyré
Subjects: Machine Learning, [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], Optimal Transport, Transport Optimal, Apprentissage Statistique
Abstract: This thesis proposes theoretical and numerical contributions to use Entropy-regularized Optimal Transport (EOT) for machine learning. We introduce Sinkhorn Divergences (SD), a class of discrepancies between probability measures based on EOT which interpolates between two other well-known discrepancies: Optimal Transport (OT) and Maximum Mean Discrepancies (MMD). We develop an efficient numerical method to use SD for density fitting tasks, showing that a suitable choice of regularization can improve performance over existing methods. We derive a sample complexity theorem for SD which proves that choosing a large enough regularization parameter allows to break the curse of dimensionality from OT, and recover asymptotic rates similar to MMD. We propose and analyze stochastic optimization solvers for EOT, which yield online methods that can cope with arbitrary measures and are well suited to large scale problems, contrarily to existing discrete batch solvers.
Abstract (translated from the French): Entropy-regularized Optimal Transport (EOT) makes it possible to define Sinkhorn Divergences (SD), a new class of distances between probability measures based on EOT. These interpolate between two other known distances: Optimal Transport (OT) and Maximum Mean Discrepancy (MMD). SDs can be used to learn probabilistic models with better performance than existing algorithms, given an adequate regularization. This is justified by a theorem on the approximation of SDs from samples, proving that a sufficient regularization removes the curse of dimensionality of OT and asymptotically recovers the convergence rate of MMD. Finally, we present new solvers for EOT based on online stochastic optimization which, unlike the state of the art, are not restricted to discrete measures and adapt well to high-dimensional problems.
Published: 2019

14. Régularisation Entropique du Transport Optimal pour le Machine Learning

Author: Genevay, Aude, Méthodes numériques pour le problème de Monge-Kantorovich et Applications en sciences sociales (MOKAPLAN), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Département de Mathématiques et Applications - ENS Paris (DMA), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Université Paris sciences et lettres (PSL), PSL University, Gabriel Peyré (gabriel.peyre@ceremade.dauphine.fr), Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Inria de Paris, and Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris)
Subjects: Transport optimal, Apprentissage statistique, [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST], Machine learning, Optimal transport, [MATH]Mathematics [math], [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Abstract: This thesis proposes theoretical and numerical contributions to use Entropy-regularized Optimal Transport (EOT) for machine learning. We introduce Sinkhorn Divergences (SD), a class of discrepancies between probability measures based on EOT which interpolates between two other well-known discrepancies: Optimal Transport (OT) and Maximum Mean Discrepancies (MMD). We develop an efficient numerical method to use SD for density fitting tasks, showing that a suitable choice of regularization can improve performance over existing methods. We derive a sample complexity theorem for SD which proves that choosing a large enough regularization parameter allows to break the curse of dimensionality from OT, and recover asymptotic rates similar to MMD. We propose and analyze stochastic optimization solvers for EOT, which yield online methods that can cope with arbitrary measures and are well suited to large scale problems, contrarily to existing discrete batch solvers.
Abstract (translated from the French): Entropy-regularized Optimal Transport (EOT) makes it possible to define Sinkhorn Divergences (SD), a new class of distances between probability measures based on EOT. These interpolate between two other known distances: Optimal Transport (OT) and Maximum Mean Discrepancy (MMD). SDs can be used to learn probabilistic models with better performance than existing algorithms, given an adequate regularization. This is justified by a theorem on the approximation of SDs from samples, proving that a sufficient regularization removes the curse of dimensionality of OT and asymptotically recovers the convergence rate of MMD. Finally, we present new solvers for EOT based on online stochastic optimization which, unlike the state of the art, are not restricted to discrete measures and adapt well to high-dimensional problems.
Published: 2019

15. Transport Optimal pour l'Apprentissage Automatique

Author: Genevay, Aude, CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Université Paris sciences et lettres, and Gabriel Peyré
Subjects: Machine Learning, [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], Optimal Transport, Transport Optimal, Apprentissage Statistique
Abstract: This thesis proposes theoretical and numerical contributions to use Entropy-regularized Optimal Transport (EOT) for machine learning. We introduce Sinkhorn Divergences (SD), a class of discrepancies between probability measures based on EOT which interpolates between two other well-known discrepancies: Optimal Transport (OT) and Maximum Mean Discrepancies (MMD). We develop an efficient numerical method to use SD for density fitting tasks, showing that a suitable choice of regularization can improve performance over existing methods. We derive a sample complexity theorem for SD which proves that choosing a large enough regularization parameter allows to break the curse of dimensionality from OT, and recover asymptotic rates similar to MMD. We propose and analyze stochastic optimization solvers for EOT, which yield online methods that can cope with arbitrary measures and are well suited to large scale problems, contrarily to existing discrete batch solvers.
Abstract (translated from the French): Entropy-regularized Optimal Transport (EOT) makes it possible to define Sinkhorn Divergences (SD), a new class of distances between probability measures based on EOT. These interpolate between two other known distances: Optimal Transport (OT) and Maximum Mean Discrepancy (MMD). SDs can be used to learn probabilistic models with better performance than existing algorithms, given an adequate regularization. This is justified by a theorem on the approximation of SDs from samples, proving that a sufficient regularization removes the curse of dimensionality of OT and asymptotically recovers the convergence rate of MMD. Finally, we present new solvers for EOT based on online stochastic optimization which, unlike the state of the art, are not restricted to discrete measures and adapt well to high-dimensional problems.
Published: 2019

16. Learning Generative Models with Sinkhorn Divergences

Author: Genevay, Aude, Peyré, Gabriel, Cuturi, Marco, CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Méthodes numériques pour le problème de Monge-Kantorovich et Applications en sciences sociales (MOKAPLAN), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Inria de Paris, and Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-CEntre de REcherches en MAthématiques de la DEcision (CEREMADE)
Subjects: FOS: Computer and information sciences, Statistics - Machine Learning, Machine Learning (stat.ML), [INFO]Computer Science [cs], [MATH]Mathematics [math]
Abstract: The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, (iii) the difficulty to estimate robustly these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing to find a sweet spot leveraging the geometry of OT and the favorable high-dimensional sample complexity of MMD which comes with unbiased gradient estimates. The resulting computational architecture complements nicely standard deep network generative models by a stack of extra layers implementing the loss function.
Published: 2018
