
A Deep Reinforcement Learning Method Based on Improved Curiosity (基于改进好奇心的深度强化学习方法).

Authors :
乔和
李增辉
刘春
胡嗣栋
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. Sep2024, Vol. 41 Issue 9, p2635-2640. 6p.
Publication Year :
2024

Abstract

In deep reinforcement learning, the intrinsic curiosity model (ICM) gives the agent opportunities to learn unknown strategies in sparse-reward environments. However, because the curiosity reward is a state-difference value, the agent can focus too heavily on exploring new states, which gives rise to the problem of blind exploration. To address this problem, this paper proposes an intrinsic curiosity model algorithm based on knowledge distillation (KD-ICM). First, it introduces knowledge distillation so that the agent acquires richer environmental information and policy knowledge in a short time, accelerating the learning process. Second, it pre-trains a teacher neural network to guide the forward network, yielding a forward model with higher accuracy and performance and reducing the agent's blind exploration. Two different simulation experiments were designed on the Unity simulation platform for comparison. The experiments show that in the complex simulation task environment, the average reward of KD-ICM is 136% higher than that of ICM, and its optimal-action probability is 13.47% higher. Both the agent's exploration performance and exploration quality are improved, verifying the feasibility of the algorithm. [ABSTRACT FROM AUTHOR]
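The two ingredients the abstract describes can be sketched compactly: an ICM-style intrinsic reward equal to the forward model's prediction error in feature space, and a distillation step that pulls a student forward model toward a pre-trained teacher. The sketch below is a minimal illustration, not the paper's implementation: the linear "networks" (`phi`, `teacher_W`, `student_W`), the learning rate, and the dimensions are all hypothetical stand-ins for the deep networks KD-ICM actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, FEAT_DIM = 8, 4, 16

# Hypothetical linear stand-ins for the paper's networks.
phi = rng.normal(size=(STATE_DIM, FEAT_DIM)) * 0.1                     # feature encoder
teacher_W = rng.normal(size=(FEAT_DIM + ACTION_DIM, FEAT_DIM)) * 0.1   # pre-trained teacher forward model
student_W = np.zeros((FEAT_DIM + ACTION_DIM, FEAT_DIM))                # student forward model

def forward(W, feat, action_onehot):
    """Predict next-state features from current features and the action taken."""
    return np.concatenate([feat, action_onehot]) @ W

def intrinsic_reward(s, a_onehot, s_next, eta=0.5):
    """ICM-style curiosity: scaled prediction error of the forward model."""
    f, f_next = s @ phi, s_next @ phi
    pred = forward(student_W, f, a_onehot)
    return eta * float(np.sum((pred - f_next) ** 2))

def distill_step(s, a_onehot, lr=0.1):
    """One distillation update: move the student's prediction toward the teacher's."""
    global student_W
    f = s @ phi
    x = np.concatenate([f, a_onehot])
    err = forward(student_W, f, a_onehot) - forward(teacher_W, f, a_onehot)
    student_W -= lr * np.outer(x, err)   # gradient step on 0.5 * ||err||^2
    return float(np.sum(err ** 2))       # distillation loss before the update

s, s_next = rng.normal(size=STATE_DIM), rng.normal(size=STATE_DIM)
a = np.eye(ACTION_DIM)[1]
losses = [distill_step(s, a) for _ in range(200)]
r_int = intrinsic_reward(s, a, s_next)
```

With repeated distillation steps the student-teacher gap shrinks, so the student's forward predictions (and hence the curiosity reward) are shaped by the teacher rather than by raw novelty alone, which is the mechanism the abstract credits with reducing blind exploration.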

Details

Language :
Chinese
ISSN :
10013695
Volume :
41
Issue :
9
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
179582356
Full Text :
https://doi.org/10.19734/j.issn.1001-3695.2024.01.0014