Subgoal-Based Reward Shaping to Improve Efficiency in Reinforcement Learning
- Source :
- IEEE Access, Vol. 9, pp. 97557-97568 (2021)
- Publication Year :
- 2021
- Publisher :
- IEEE, 2021.
- Abstract :
- Reinforcement learning, which acquires a policy that maximizes long-term rewards, has been actively studied. Unfortunately, learning is often too slow for practical use because the state-action space becomes huge in real environments. Many studies have therefore incorporated human knowledge into reinforcement learning. Such knowledge is often provided as demonstration trajectories, but this requires a human to control the AI agent, which can be difficult. Knowledge of subgoals lessens this requirement, because a human needs to consider only a few representative states on an optimal trajectory. The essential factor for learning efficiency is the reward signal. Potential-based reward shaping is a basic method for enriching rewards, but it is often difficult to use it to incorporate subgoals for accelerating learning, because appropriate potentials are not intuitive for humans. We extend potential-based reward shaping and propose subgoal-based reward shaping, which makes it easier for human trainers to share their knowledge of subgoals. To evaluate the method, we obtained subgoal series from participants and conducted experiments in three domains: four-rooms (discrete states and actions), pinball (continuous states, discrete actions), and picking (continuous states and actions). We compared our method with a baseline reinforcement learning algorithm and other subgoal-based methods, including random subgoals and naive subgoal-based reward shaping. Our reward shaping outperformed all the other methods in learning efficiency.
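- For illustration, below is a minimal sketch of potential-based reward shaping driven by a subgoal series, the general technique the abstract builds on. The subgoal proximity test, the counter-augmented state, and the potential scale are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of potential-based reward shaping with a subgoal-derived
# potential. Illustrative only: the subgoal test, potential scale, and
# counter-augmented state are assumptions, not the paper's exact method.

import numpy as np


def make_subgoal_potential(subgoals, reached_radius=0.5, value_scale=1.0):
    """Build (achieved, potential) for a subgoal series.

    `achieved(state, k)` tests whether the k-th subgoal has just been reached
    (here, by Euclidean proximity, an illustrative choice).
    `potential(k)` grows with the number of subgoals reached so far, so the
    shaping term rewards progress along the subgoal series.
    """

    def achieved(state, k):
        if k >= len(subgoals):
            return False
        return np.linalg.norm(
            np.asarray(state) - np.asarray(subgoals[k])
        ) < reached_radius

    def potential(k):
        # Potential proportional to the count of subgoals achieved.
        return value_scale * k

    return achieved, potential


def shaped_reward(r, next_state, k, achieved, potential, gamma=0.99):
    """Return (r + F, k_next) where F = gamma * Phi(k_next) - Phi(k).

    Because F is a potential difference, the policy-invariance guarantee of
    potential-based reward shaping (Ng et al., 1999) applies in the
    subgoal-counter-augmented state space.
    """
    k_next = k + 1 if achieved(next_state, k) else k
    f = gamma * potential(k_next) - potential(k)
    return r + f, k_next
```

- In use, a trainer would supply the subgoal series (a few representative states on an optimal trajectory); the agent tracks the counter k during each episode and adds the shaping term to every environment reward.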
- Subjects :
- FOS: Computer and information sciences
Computer Science - Machine Learning (cs.LG)
Computer Science - Artificial Intelligence (cs.AI)
General Computer Science
General Engineering
General Materials Science
Reinforcement learning
reward shaping
potential-based reward shaping
subgoals as human knowledge
Artificial intelligence
Robot
Trajectory
Task analysis
Probability distribution
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
Details
- Language :
- English
- ISSN :
- 2169-3536
- Volume :
- 9
- Database :
- OpenAIRE
- Journal :
- IEEE Access
- Accession number :
- edsair.doi.dedup.....56913f6a71baca65b0bc3a30140c1314