Back to Search
Start Over
Generalization Error Bounds for Noisy, Iterative Algorithms
- Source :
- ISIT
- Publication Year :
- 2018
- Publisher :
- arXiv, 2018.
-
Abstract
- In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the generalization error of empirical risk minimization based on the mutual information $I(S;W)$ between the algorithm input $S$ and the algorithm output $W$, when the loss function is sub-Gaussian. We leverage these results to derive generalization error bounds for a broad class of iterative algorithms that are characterized by bounded, noisy updates with Markovian structure. Our bounds are very general and are applicable to numerous settings of interest, including stochastic gradient Langevin dynamics (SGLD) and variants of the stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm. Furthermore, our error bounds hold for any output function computed over the path of iterates, including the last iterate of the algorithm or the average of subsets of iterates, and also allow for non-uniform sampling of data in successive updates of the algorithm.<br />Comment: A shorter version of this paper was submitted to ISIT 2018. 14 pages, 1 figure
- Subjects :
- FOS: Computer and information sciences
Computer science
Computer Science - Information Theory
Markov process
Machine Learning (stat.ML)
02 engineering and technology
010501 environmental sciences
Overfitting
01 natural sciences
Machine Learning (cs.LG)
Hybrid Monte Carlo
symbols.namesake
Statistics - Machine Learning
0202 electrical engineering, electronic engineering, information engineering
Empirical risk minimization
Langevin dynamics
0105 earth and related environmental sciences
Training set
Information Theory (cs.IT)
Sampling (statistics)
020206 networking & telecommunications
Mutual information
Generalization error
Computer Science - Learning
Stochastic gradient descent
Iterated function
Statistical learning theory
Bounded function
symbols
Algorithm
Subjects
Details
- Database :
- OpenAIRE
- Journal :
- ISIT
- Accession number :
- edsair.doi.dedup.....a6eccb2f12588072344c6e4300c080cf
- Full Text :
- https://doi.org/10.48550/arxiv.1801.04295