
Momentum-based variance-reduced proximal stochastic gradient method for composite nonconvex stochastic optimization

Authors:
Xu, Yangyang
Xu, Yibo
Publication Year:
2020

Abstract

Stochastic gradient methods (SGMs) have been extensively used for solving stochastic optimization problems and large-scale machine learning problems. Recent works employ various techniques to improve the convergence rate of SGMs for both convex and nonconvex cases, but most of them require a large number of samples in some or all iterations. In this paper, we propose a new SGM, named PStorm, for solving nonconvex nonsmooth stochastic problems. With a momentum-based variance reduction technique, PStorm achieves the optimal complexity result $O(\varepsilon^{-3})$ for producing a stochastic $\varepsilon$-stationary solution, provided a mean-squared smoothness condition holds. Unlike existing optimal methods, PStorm attains the $O(\varepsilon^{-3})$ result using only one or $O(1)$ samples in every update. With this property, PStorm can be applied to online learning problems that favor real-time decisions based on one or $O(1)$ new observations. In addition, for large-scale machine learning problems, PStorm trained with small batches can generalize better than other optimal methods that require large-batch training, and better than the vanilla SGM, as we demonstrate on training a sparse fully-connected neural network and a sparse convolutional neural network.

Comment: Yibo Xu joined this work by adding a new proof for the case with constant stepsize and removing the requirement of a large initial minibatch
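The abstract describes the key mechanism only at a high level: a momentum-based variance-reduced gradient estimator combined with a proximal step, using one or $O(1)$ samples per update. As a non-authoritative illustration of that class of updates, the sketch below implements a generic STORM-style variance-reduced proximal step with an $\ell_1$ regularizer; the function and parameter names (grad_oracle, prox_l1, eta, beta, lam) are assumptions made for illustration and are not the paper's notation or exact algorithm.

```python
import numpy as np

def prox_l1(v, lam):
    # Proximal operator of lam * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def momentum_vr_prox_sgm(grad_oracle, x0, steps, eta=0.01, beta=0.1, lam=1e-3, rng=None):
    # Generic momentum-based variance-reduced proximal SGM (STORM-style sketch),
    # drawing a single fresh sample per iteration.
    # grad_oracle(x, seed) should return a stochastic gradient of the smooth part at x.
    rng = np.random.default_rng() if rng is None else rng
    x_prev = np.asarray(x0, dtype=float).copy()
    seed = int(rng.integers(0, 2**31 - 1))        # placeholder for "one new sample"
    d = grad_oracle(x_prev, seed)                 # initial gradient estimate
    x = prox_l1(x_prev - eta * d, eta * lam)
    for _ in range(steps):
        seed = int(rng.integers(0, 2**31 - 1))    # one new sample per update
        g_new = grad_oracle(x, seed)
        g_old = grad_oracle(x_prev, seed)         # same sample, previous iterate
        d = g_new + (1.0 - beta) * (d - g_old)    # momentum-based variance reduction
        x_prev, x = x, prox_l1(x - eta * d, eta * lam)
    return x
```

In this sketch the estimator d mixes the new stochastic gradient with a correction term evaluated at the previous iterate on the same sample, which keeps the per-iteration sample cost at one while reducing variance; PStorm's precise step sizes, momentum schedule, and analysis are given in the paper.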

Details

Database:
arXiv
Publication Type:
Report
Accession number:
edsarx.2006.00425
Document Type:
Working Paper