1. Stochastic Generalized Gradient Methods for Training Nonconvex Nonsmooth Neural Networks.
- Author
- Norkin, V. I.
- Subjects
- STOCHASTIC control theory; NONSMOOTH optimization; DISCRETE systems; ALGORITHMS; DEEP learning
- Abstract
The paper observes a similarity between the stochastic optimal control of discrete dynamical systems and the training of multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. The machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependencies, the so-called generalized-differentiable functions are used. The backpropagation method for calculating stochastic generalized gradients of the learning quality functional for such systems is substantiated on the basis of the Hamilton–Pontryagin formalism. Stochastic generalized gradient learning algorithms are extended to the training of nonconvex nonsmooth neural networks. The performance of a stochastic generalized gradient algorithm is illustrated on a linear multiclass classification problem. [ABSTRACT FROM AUTHOR]
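- Illustration

The abstract's closing example, a stochastic generalized gradient method applied to linear multiclass classification, can be sketched in miniature. The code below is not the paper's algorithm; it is a generic stochastic subgradient iteration with diminishing step sizes, using the (nonsmooth but convex) multiclass hinge loss as a stand-in for a nonsmooth training objective. The function names, step-size rule, and toy data are all assumptions for illustration only.

```python
import numpy as np

def multiclass_hinge_subgrad(W, x, y):
    """One subgradient of the nonsmooth multiclass hinge loss at (x, y).

    W: (num_classes, dim) weight matrix; x: (dim,) feature vector;
    y: index of the true class.
    """
    scores = W @ x
    margins = scores - scores[y] + 1.0   # margin violations vs. true class
    margins[y] = 0.0
    j = int(np.argmax(margins))          # most-violating competing class
    G = np.zeros_like(W)
    if margins[j] > 0:                   # loss is active: pick one subgradient
        G[j] += x
        G[y] -= x
    return G

def sgg_train(X, y, num_classes, steps=2000, seed=0):
    """Stochastic subgradient descent with diminishing steps a_k = 1/sqrt(k)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((num_classes, d))
    for k in range(1, steps + 1):
        i = rng.integers(n)              # sample one training example
        W -= (1.0 / np.sqrt(k)) * multiclass_hinge_subgrad(W, X[i], y[i])
    return W

# Toy separable data: the class label is the index of the largest coordinate.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = X.argmax(axis=1)
W = sgg_train(X, y, num_classes=3)
acc = float(((X @ W.T).argmax(axis=1) == y).mean())
```

The diminishing step sizes (square-summable but not summable) are the standard condition under which stochastic (generalized) gradient iterations converge for nonsmooth objectives; on this toy problem the learned `W` recovers the argmax structure with high training accuracy.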
- Published
- 2021