
A3C-GS: Adaptive Moment Gradient Sharing With Locks for Asynchronous Actor–Critic Agents.

Authors :: Labao, Alfonso B.
Martija, Mygel Andrei M.
Naval, Prospero C.
Source :: IEEE Transactions on Neural Networks & Learning Systems. Mar2021, Vol. 32 Issue 3, p1162-1176. 15p.
Publication Year :: 2021
Abstract: We propose an asynchronous gradient sharing mechanism for the parallel actor–critic algorithms with improved exploration characteristics. The proposed algorithm (A3C-GS) has the property of automatically diversifying worker policies in the short term for exploration, thereby reducing the need for entropy loss terms. Despite policy diversification, the algorithm converges to the optimal policy in the long term. We show in our analysis that the gradient sharing operation is a composition of two contractions. The first contraction performs gradient computation, while the second contraction is a gradient sharing operation coordinated by locks. From these two contractions, certain short- and long-term properties result. For the short term, gradient sharing induces temporary heterogeneity in policies for performing needed exploration. In the long term, under a suitably small learning rate and gradient clipping, convergence to the optimal policy is theoretically guaranteed. We verify our results with several high-dimensional experiments and compare A3C-GS against other on-policy policy-gradient algorithms. Our proposed algorithm achieved the highest weighted score. Despite lower entropy weights, it performed well in high-dimensional environments that require exploration due to sparse rewards and those that need navigation in 3-D environments for long survival tasks. It consistently performed better than the base asynchronous advantage actor–critic (A3C) algorithm. [ABSTRACT FROM AUTHOR]
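Illustration: the abstract describes gradient updates that are clipped and then shared among asynchronous workers under locks. The following is a minimal sketch of that general pattern only, not the authors' A3C-GS algorithm. It assumes a toy quadratic objective in place of an actor-critic loss, a standard Adam update for the adaptive moments, and a single lock guarding the shared state; the names `SharedAdam`, `worker`, `clip_by_norm`, and `run` are hypothetical.

```python
import threading
import numpy as np


def clip_by_norm(grad, max_norm):
    """Rescale grad so its L2 norm does not exceed max_norm."""
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)


class SharedAdam:
    """Shared parameters and Adam moment estimates guarded by one lock (assumed design)."""

    def __init__(self, dim, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
        self.theta = np.zeros(dim)   # shared parameters
        self.m = np.zeros(dim)       # first-moment estimate
        self.v = np.zeros(dim)       # second-moment estimate
        self.t = 0                   # number of updates applied so far
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.lock = threading.Lock()

    def snapshot(self):
        """Return a copy of the shared parameters, taken under the lock."""
        with self.lock:
            return self.theta.copy()

    def apply(self, grad):
        """Apply one Adam step with a worker's gradient while holding the lock."""
        with self.lock:
            self.t += 1
            self.m = self.beta1 * self.m + (1 - self.beta1) * grad
            self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
            m_hat = self.m / (1 - self.beta1 ** self.t)
            v_hat = self.v / (1 - self.beta2 ** self.t)
            self.theta -= self.lr * m_hat / (np.sqrt(v_hat) + self.eps)


def worker(shared, target, steps, seed, max_norm=1.0):
    """Each worker reads the shared parameters, computes a noisy gradient, clips it, and shares it."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        local = shared.snapshot()
        # Gradient of the toy loss ||theta - target||^2, plus small per-worker noise.
        grad = 2.0 * (local - target) + 0.01 * rng.standard_normal(local.shape)
        shared.apply(clip_by_norm(grad, max_norm))


def run(num_workers=4, steps=500):
    target = np.array([1.0, -2.0, 0.5])
    shared = SharedAdam(dim=target.size)
    threads = [
        threading.Thread(target=worker, args=(shared, target, steps, i))
        for i in range(num_workers)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return shared.theta, target


if __name__ == "__main__":
    theta, target = run()
    print("shared parameters:", theta)
    print("target:", target)
```

In this sketch the lock serializes updates to the shared parameters and moment estimates, while each worker may compute its gradient from a slightly stale snapshot. How the paper itself structures its locks and sharing step is not stated in the abstract.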

Subjects :: *PARALLEL algorithms
*DEEP learning
*SHARING
*REINFORCEMENT learning
*HEURISTIC algorithms

Details

Language :: English
ISSN :: 2162-237X
Volume :: 32
Issue :: 3
Database :: Academic Search Index
Journal :: IEEE Transactions on Neural Networks & Learning Systems
Publication Type :: Periodical
Accession number :: 149122060
Full Text :: https://doi.org/10.1109/TNNLS.2020.2980743
