On shallow planning under partial observability

Authors :
Lefebvre, Randy
Durand, Audrey
Publication Year :
2024

Abstract

Formulating a real-world problem under the Reinforcement Learning framework involves non-trivial design choices, such as selecting a discount factor for the learning objective (discounted cumulative rewards), which articulates the planning horizon of the agent. This work investigates the impact of the discount factor on the bias-variance trade-off given structural parameters of the underlying Markov Decision Process. Our results support the idea that a shorter planning horizon might be beneficial, especially under partial observability.

Comment: Presented at deployable RL (RLC conference 2024)

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2407.15820
Document Type :
Working Paper