1. Information-Directed Exploration via Distributional Deep Reinforcement Learning
- Author
-
Zijie He
- Subjects
Computer science, Reinforcement learning, Machine learning, Artificial intelligence, Sampling (statistics), Variance, Noise, Asterix, Parametric statistics - Abstract
An appropriate exploration strategy is crucial to the success of reinforcement learning tasks. One challenge for efficient exploration is dealing with the two kinds of noise in reinforcement learning (RL): parametric uncertainty and intrinsic uncertainty. Researchers have pointed out that intrinsic uncertainty can be disastrous for many common exploration strategies. This paper investigates an information-directed exploration strategy, Information Directed Sampling (IDS), which has been extended to the general RL setting because of its merit of modeling both parametric and intrinsic uncertainty. A modified version of an existing framework is proposed, and the modified and original IDS are compared on two Atari games: Asterix and Gravitar. It was observed that, under similar computational cost, the modified method outperformed the original version on Asterix and performed slightly worse on Gravitar, but with much lower variance. Justifications for the superiority of the modified method are provided in the final part.
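The core idea behind IDS-style exploration in the abstract can be illustrated with a minimal sketch: each action's estimated regret is traded off against its information gain, where parametric (ensemble) uncertainty is normalized by intrinsic (return-distribution) noise. The function below is a hedged illustration, not the paper's exact algorithm; the regret bound, the `lam` scaling, and the log-ratio information-gain term are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def ids_action(mean_q, std_parametric, var_intrinsic, lam=1.0, eps=1e-9):
    """Pick the action minimizing the information ratio regret(a)^2 / info_gain(a).

    mean_q         : per-action Q-value estimates (e.g. ensemble mean)
    std_parametric : per-action parametric (epistemic) std from the ensemble
    var_intrinsic  : per-action intrinsic (return-distribution) variance
    """
    # Conservative regret bound built from the parametric std, so the
    # greedy action does not trivially receive zero regret.
    upper = mean_q + lam * std_parametric
    lower = mean_q - lam * std_parametric
    regret = upper.max() - lower
    # Information gain: parametric uncertainty relative to intrinsic noise;
    # actions whose value is noisy for intrinsic reasons are less informative.
    info_gain = np.log1p(std_parametric**2 / (var_intrinsic + eps)) + eps
    return int(np.argmin(regret**2 / info_gain))
```

With no parametric uncertainty the rule reduces to greedy selection; when a near-optimal action carries large parametric uncertainty, the rule prefers it, which is the information-directed behavior the paper studies.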
- Published
- 2021
- Full Text
- View/download PDF