Accelerating Minibatch Stochastic Gradient Descent Using Typicality Sampling.

Authors :
Peng, Xinyu
Li, Li
Wang, Fei-Yue
Source :
IEEE Transactions on Neural Networks & Learning Systems. Nov 2020, Vol. 31, Issue 11, p4649-4659. 11p.
Publication Year :
2020

Abstract

Machine learning, especially deep neural networks, has developed rapidly in fields including computer vision, speech recognition, and reinforcement learning. Although minibatch stochastic gradient descent (SGD) is one of the most popular stochastic optimization methods for training deep networks, it shows a slow convergence rate due to the large noise in the gradient approximation. In this article, we attempt to remedy this problem by building a more efficient batch selection method based on typicality sampling, which reduces the error of gradient estimation in conventional minibatch SGD. We analyze the convergence rate of the resulting typical batch SGD algorithm and compare its convergence properties with those of conventional minibatch SGD. Experimental results demonstrate that our batch selection scheme works well and that more complex minibatch SGD variants can also benefit from the proposed batch selection strategy. [ABSTRACT FROM AUTHOR]
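To make the batch-selection idea concrete, the sketch below shows one possible typicality-sampling step in plain NumPy. The abstract does not specify how typical samples are identified, so the selection rule used here (favoring samples whose per-sample loss lies close to the current mean loss), the typical_fraction parameter, and the grad_fn stand-in are illustrative assumptions rather than the authors' actual construction.

import numpy as np

def select_typical_batch(losses, batch_size, typical_fraction=0.8, rng=None):
    # losses           : per-sample losses from the most recent evaluation
    # batch_size       : size of the minibatch to build
    # typical_fraction : share of the batch drawn from low-deviation samples
    #                    (hypothetical parameter, not taken from the paper)
    rng = np.random.default_rng() if rng is None else rng
    deviation = np.abs(losses - losses.mean())   # distance from the mean loss
    typical_count = int(typical_fraction * batch_size)

    # Pool of samples whose loss deviates least from the mean ("typical" samples),
    # then draw the typical part of the batch from that pool.
    pool = np.argsort(deviation)[: max(typical_count * 4, batch_size)]
    typical_idx = rng.choice(pool, size=typical_count, replace=False)

    # Fill the remainder uniformly at random to retain some exploration.
    remaining = np.setdiff1d(np.arange(losses.size), typical_idx)
    random_idx = rng.choice(remaining, size=batch_size - typical_count, replace=False)
    return np.concatenate([typical_idx, random_idx])

def typical_batch_sgd_step(params, losses, grad_fn, lr=0.01, batch_size=64):
    # One plain SGD update using the selected batch; grad_fn(params, batch_indices)
    # is a stand-in for the model's minibatch gradient computation.
    batch = select_typical_batch(losses, batch_size)
    return params - lr * grad_fn(params, batch)

In this sketch, mixing a majority of typical samples with a few uniformly drawn ones is one simple way to reduce the variance of the gradient estimate while keeping the batch from collapsing onto a fixed subset of the data; the paper itself should be consulted for the exact sampling scheme and its convergence analysis.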

Details

Language :
English
ISSN :
2162-237X
Volume :
31
Issue :
11
Database :
Academic Search Index
Journal :
IEEE Transactions on Neural Networks & Learning Systems
Publication Type :
Periodical
Accession number :
146914662
Full Text :
https://doi.org/10.1109/TNNLS.2019.2957003