
Function approximation method based on weights gradient descent in reinforcement learning

Authors :
Xiaoyan QIN, Yuhan LIU, Yunlong XU, Bin LI
Source :
网络与信息安全学报 (Chinese Journal of Network and Information Security), Vol 9, Iss 4, Pp 16-28 (2023)
Publication Year :
2023
Publisher :
Posts & Telecom Press Co., Ltd., 2023.

Abstract

Function approximation has gained significant attention in reinforcement learning research because it effectively addresses problems with large-scale, continuous state and action spaces. Although function approximation based on gradient descent is one of the most widely used approaches in reinforcement learning, it requires careful tuning of the step-size parameter: an inappropriate value can lead to slow convergence, unstable convergence, or even divergence. To address these issues, an improvement was made to the temporal-difference (TD) algorithm with function approximation. The weight update rule was enhanced by combining the least squares method with gradient descent, resulting in the proposed weights gradient descent (WGD) method. Least squares is used to compute a set of weights; combining the ideas of TD and gradient descent, the error between these weights and the current weights is obtained, and this error is used to update the weights directly. Updating the weights in this new manner effectively reduces the computing resources consumed by the algorithm and enhances other gradient-descent-based function approximation algorithms. The WGD method is widely applicable to various gradient-descent-based reinforcement learning algorithms. The results show that the WGD method can adjust parameters within a wider space, effectively reducing the possibility of divergence, while achieving better performance and improving the convergence speed of the algorithm.
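The idea described in the abstract can be illustrated with a short sketch. The code below is not the authors' WGD algorithm; it is a minimal interpretation of the described update, assuming an LSTD-style least-squares solve, a small random-walk chain with one-hot features, and an arbitrary step size (all of these are assumptions, not taken from the paper): least-squares weights are computed from sampled transitions, the difference between those weights and the current weights is treated as the error, and a gradient-descent-style step moves the current weights along that error.

"""
Illustrative sketch (not the paper's exact algorithm) of a weight-space update:
compute least-squares weights, take the difference to the current weights as
the "error", and apply a gradient-descent-style step on that error.
Assumptions: LSTD(0) solve, 5-state random walk, one-hot features, ALPHA below.
"""
import numpy as np

GAMMA = 0.9        # discount factor (assumed)
ALPHA = 0.5        # step size for the weight-space update (assumed)
N_STATES = 5
rng = np.random.default_rng(0)

def features(s):
    """One-hot features for a tabular-style linear approximator V(s) = w^T phi(s)."""
    phi = np.zeros(N_STATES)
    phi[s] = 1.0
    return phi

def collect_transitions(n=2000):
    """Random walk: move left/right uniformly, reward 1 on reaching the right end."""
    transitions = []
    s = N_STATES // 2
    for _ in range(n):
        s_next = int(min(max(s + rng.choice([-1, 1]), 0), N_STATES - 1))
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        transitions.append((s, r, s_next))
        s = 0 if s_next == N_STATES - 1 else s_next   # restart after reaching the goal
    return transitions

def lstd_weights(transitions):
    """LSTD(0): solve A w = b built from the sampled transitions."""
    A = np.eye(N_STATES) * 1e-6                       # small ridge term for invertibility
    b = np.zeros(N_STATES)
    for s, r, s_next in transitions:
        phi, phi_next = features(s), features(s_next)
        A += np.outer(phi, phi - GAMMA * phi_next)
        b += phi * r
    return np.linalg.solve(A, b)

def wgd_style_update(w, transitions):
    """Move the current weights toward the least-squares weights."""
    w_ls = lstd_weights(transitions)
    weight_error = w_ls - w          # "error between the weights"
    return w + ALPHA * weight_error  # gradient-descent-style step on that error

if __name__ == "__main__":
    data = collect_transitions()
    w = np.zeros(N_STATES)
    for _ in range(10):
        w = wgd_style_update(w, data)
    print("estimated state values:", np.round(w, 3))

In this sketch the update converges to the least-squares solution regardless of the exact step size in (0, 1], which mirrors the abstract's claim that updating in weight space tolerates a wider range of parameters than stepping directly on the TD error.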

Details

Language :
English, Chinese
ISSN :
2096-109X
Volume :
9
Issue :
4
Database :
Directory of Open Access Journals
Journal :
网络与信息安全学报 (Chinese Journal of Network and Information Security)
Publication Type :
Academic Journal
Accession number :
edsdoj.fa8171f5dfc14ca28efe1f5f086ccde0
Document Type :
article
Full Text :
https://doi.org/10.11959/j.issn.2096-109x.2023050