
Reward-free exploration beyond finite-horizon

Authors :
Tarbouriech, Jean
Pirotta, Matteo
Valko, Michal
Lazaric, Alessandro
Facebook AI Research [Paris] (FAIR)
Facebook
Scool (Scool)
Inria Lille - Nord Europe
Institut National de Recherche en Informatique et en Automatique (Inria)-Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL)
Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)
DeepMind [Paris]
Source :
ICML 2020 Workshop on Theoretical Foundations of Reinforcement Learning, 2020, Vienna, Austria
Publication Year :
2020
Publisher :
HAL CCSD, 2020.

Abstract

We consider the reward-free exploration framework introduced by Jin et al. (2020), where an RL agent interacts with an unknown environment without any explicit reward function to maximize. The objective is to collect enough information during the exploration phase so that a near-optimal policy can be immediately computed once any reward function is provided. In this paper, we move from the finite-horizon setting studied by Jin et al. (2020) to the more general setting of goal-conditioned RL, often referred to as stochastic shortest path (SSP). We first discuss the challenges specific to SSPs and then study two scenarios: 1) reward-free goal-free exploration in communicating MDPs, and 2) reward-free goal-free incremental exploration in non-communicating MDPs where the agent is provided with a reset action to an initial state. In both cases, we provide exploration algorithms and their sample-complexity bounds, which we contrast with the existing guarantees in the finite-horizon case.

Details

Language :
English
Database :
OpenAIRE
Journal :
ICML 2020 Workshop on Theoretical Foundations of Reinforcement Learning, 2020, Vienna, Austria
Accession number :
edsair.dedup.wf.001..54f45fad43f44331c91b73830abf5830