
Efficient Reinforcement Learning With the Novel N-Step Method and V-Network

Authors :
Zhang, Miaomiao
Zhang, Shuo
Wu, Xinying
Shi, Zhiyi
Deng, Xiangyang
Wu, Edmond Q.
Xu, Xin
Source :
IEEE Transactions on Cybernetics; October 2024, Vol. 54, Issue 10, pp. 6048-6057, 10p
Publication Year :
2024

Abstract

Reinforcement learning (RL) is increasingly widely applied in artificial intelligence, but it has an apparent drawback: it requires a large number of samples, which makes improving sample efficiency a research focus. To address this issue, we propose a novel N-step method. This method extends the agent's horizon, enabling it to acquire more long-term effective information and thereby alleviating the data inefficiency of RL. The N-step method also reduces the estimation variance of the Q-function, one of the two factors contributing to errors in Q-function estimation. The other factor is estimation bias; to mitigate it, we design a regularization method based on the V-function, which has so far been underexplored. Together, these two methods address the problems of low sample efficiency and inaccurate Q-function estimation in RL. Finally, extensive experiments in discrete and continuous action spaces demonstrate that the proposed N-step method, combined with the classical deep Q-network (DQN), deep deterministic policy gradient (DDPG), and twin delayed DDPG (TD3) algorithms, is effective and consistently outperforms the classical algorithms.
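
The abstract does not spell out how the novel N-step target differs from the classical one. For orientation, the Python sketch below shows the standard n-step TD target that such methods build on; the function name, argument layout, and the use of a target-network bootstrap value are illustrative assumptions, not the paper's implementation.

# A minimal sketch of the classical n-step TD target that N-step methods
# extend (not the paper's novel variant, whose details are not given in
# this abstract). Names and buffer layout are illustrative assumptions.

def n_step_q_target(rewards, dones, bootstrap_q, gamma=0.99):
    """Compute G_t^(n) = sum_{k=0}^{n-1} gamma^k * r_{t+k}
                         + gamma^n * max_a Q(s_{t+n}, a).

    rewards     : list of the n rewards r_t .. r_{t+n-1}
    dones       : matching episode-termination flags
    bootstrap_q : max_a Q(s_{t+n}, a), e.g. from a target network
    """
    target, discount = 0.0, 1.0
    for r, done in zip(rewards, dones):
        target += discount * r
        if done:               # no bootstrapping past a terminal state
            return target
        discount *= gamma
    return target + discount * bootstrap_q

# Example: 3-step target with gamma = 0.9
print(n_step_q_target([1.0, 0.0, 2.0], [False, False, False], 5.0, gamma=0.9))
# 1.0 + 0.9*0.0 + 0.81*2.0 + 0.729*5.0 = 6.265

A larger n propagates reward information to earlier states faster, which is the sample-efficiency gain the abstract refers to, while the choice of n also shapes the variance of the target, the trade-off the proposed method reportedly improves.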

Details

Language :
English
ISSN :
2168-2267
Volume :
54
Issue :
10
Database :
Supplemental Index
Journal :
IEEE Transactions on Cybernetics
Publication Type :
Periodical
Accession number :
ejs67653271
Full Text :
https://doi.org/10.1109/TCYB.2024.3401014