Hierarchical reinforcement learning with unlimited option scheduling for sparse rewards in continuous spaces.

Authors :
Huang, Zhigang
Liu, Quan
Zhu, Fei
Zhang, Lihua
Wu, Lan
Source :
Expert Systems with Applications. Mar2024:Part B, Vol. 237, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

The fundamental concept behind option-based hierarchical reinforcement learning (O-HRL) is to obtain temporal coarse-grained actions and abstract complex situations. Although O-HRL is intended for sparse rewards, it remains difficult to extend it to sparse reward problems in continuous spaces. In this paper, we provide a fresh perspective on option technology to comprehend different options based on knowledge representation. The hierarchical reinforcement learning with the unlimited option scheduling (UOS) algorithm is proposed. Unlike conventional O-HRL algorithms that apply a limited set of options with specific meanings, UOS encourages an infinite number of options to correlate with trajectories while maintaining a correlation with each other, thus representing more abundant knowledge. These unlimited options can guide infinite and diverse trajectories to cover fine-grained state spaces. Further, a composite scheduling mode is proposed to generate arbitrary-length trajectories with intrinsic characteristics, providing both flexibility and concentration for unlimited options. It significantly improves the performance and robustness of UOS. Finally, a new comprehensive experimental system is developed, and the experimental results demonstrate the notable success of UOS on sparse reward tasks in continuous spaces. It also identifies the root cause of UOS superiority from the perspective of knowledge representation.

• Analyzing the superiority of unlimited options on knowledge representation.
• Proposing composite scheduling modes to guide trajectories with characteristics.
• Making it more intuitive to comprehend the operating mechanisms of options.
• Solving sparse reward problems in continuous spaces by our method.

[ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0957-4174
Volume :
237
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
173609287
Full Text :
https://doi.org/10.1016/j.eswa.2023.121467