A Recipe for Global Convergence Guarantee in Deep Neural Networks
- Source :
- Proceedings of the AAAI Conference on Artificial Intelligence, Feb 2021, Virtual, United States
- Publication Year :
- 2021
- Publisher :
- arXiv, 2021.
Abstract
- Existing global convergence guarantees for (stochastic) gradient descent do not apply to practical deep networks in the practical regime of deep learning, beyond the neural tangent kernel (NTK) regime. This paper proposes an algorithm that is guaranteed to converge globally in this practical regime beyond the NTK regime, under a verifiable condition called the expressivity condition. The expressivity condition is defined to be both data-dependent and architecture-dependent, which is the key property that makes our results applicable to practical settings beyond the NTK regime. On the one hand, the expressivity condition is theoretically proven to hold data-independently for fully-connected deep neural networks with narrow hidden layers and a single wide layer. On the other hand, it is numerically shown to hold data-dependently for deep (convolutional) ResNets with batch normalization on various standard image datasets. We also show that the proposed algorithm achieves generalization performance comparable to that of the heuristic algorithm, with the same hyper-parameters and total number of iterations. The proposed algorithm can therefore be viewed as a step towards providing theoretical guarantees for deep learning in the practical regime.
- Comment: Published in AAAI 2021
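For context, the baseline the abstract contrasts against can be sketched as plain gradient descent, for which global convergence is classical only in the convex case. This is an illustrative sketch only (a toy convex quadratic), not the paper's proposed algorithm or its expressivity condition:

```python
# Illustrative only: vanilla gradient descent on a convex quadratic,
# where global convergence is classical. The paper's contribution is a
# modified algorithm with a global guarantee for nonconvex deep networks
# under its "expressivity condition" (not reproduced here).

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Iterate w <- w - lr * grad(w) for a fixed number of steps."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3);
# the iterates contract toward the global minimizer w = 3.
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

With step size 0.1 the update is a contraction with factor 0.8 per step, so `w_star` approaches the global minimizer 3; the point of the paper is that no such simple argument exists for practical deep networks.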
- Subjects :
- [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]
FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Science - Artificial Intelligence
Computer Vision and Pattern Recognition (cs.CV)
[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]
Computer Science - Computer Vision and Pattern Recognition
[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]
[MATH.MATH-OC] Mathematics [math]/Optimization and Control [math.OC]
Machine Learning (stat.ML)
General Medicine
[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]
[STAT.ML] Statistics [stat]/Machine Learning [stat.ML]
Machine Learning (cs.LG)
Artificial Intelligence (cs.AI)
Statistics - Machine Learning
Optimization and Control (math.OC)
FOS: Mathematics
Mathematics - Optimization and Control
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the AAAI Conference on Artificial Intelligence, Feb 2021, Virtual, United States
- Accession number :
- edsair.doi.dedup.....2059efaba1e2e5c990d23d16c91bec78
- Full Text :
- https://doi.org/10.48550/arxiv.2104.05785