Robust topological policy iteration for infinite horizon bounded Markov Decision Processes.

Authors :: Reis, Willy Arthur Silva
de Barros, Leliane Nunes
Delgado, Karina Valdivia
Source :: International Journal of Approximate Reasoning. Feb 2019, Vol. 105, pp. 287-304. 18 pp.
Publication Year :: 2019
Abstract: Markov Decision Processes (MDPs) are commonly used to solve sequential decision problems. A less restrictive model is the Bounded-parameter MDP (BMDP), which allows (i) the transition function to be expressed in terms of probability intervals and (ii) reasoning about a robust solution, i.e., the best solution under the worst model. In this paper, we propose the Robust Topological Policy Iteration (RTPI) algorithm, a new policy iteration algorithm for infinite-horizon BMDPs based on a partition of the state space. The empirical results show that the more structured the domain, the better RTPI performs. [ABSTRACT FROM AUTHOR]
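
Illustration (not part of the published abstract): the "best solution under the worst model" criterion can be made concrete with a single robust Bellman backup over interval transition probabilities. The sketch below is not the RTPI algorithm itself. It assumes a discounted setting and uses the standard greedy construction of the worst-case distribution (start every successor at its lower bound, then give the leftover mass to the lowest-valued successors first). All names (`worst_case_distribution`, `robust_backup`, `gamma`) are hypothetical.

```python
from typing import List, Tuple

# (lower, upper) bound on one transition probability; assumed name
Interval = Tuple[float, float]


def worst_case_distribution(intervals: List[Interval],
                            values: List[float]) -> List[float]:
    """Return a distribution within the intervals that minimizes expected value.

    Assumes the intervals are consistent: sum of lower bounds <= 1 <= sum of
    upper bounds.
    """
    probs = [lo for lo, _ in intervals]      # every successor gets its lower bound
    remaining = 1.0 - sum(probs)             # mass still to be assigned
    # Hand the remaining mass to the lowest-valued successors first.
    for i in sorted(range(len(values)), key=lambda k: values[k]):
        extra = min(intervals[i][1] - probs[i], remaining)
        probs[i] += extra
        remaining -= extra
    return probs


def robust_backup(reward: float, gamma: float,
                  intervals: List[Interval], values: List[float]) -> float:
    """One robust backup for a fixed state-action pair (assumed discounted form)."""
    probs = worst_case_distribution(intervals, values)
    return reward + gamma * sum(p * v for p, v in zip(probs, values))


if __name__ == "__main__":
    intervals = [(0.2, 0.7), (0.3, 0.8)]     # two successor states
    values = [10.0, 0.0]                     # current value estimates
    print(worst_case_distribution(intervals, values))
    print(robust_backup(1.0, 0.9, intervals, values))
```

A robust policy then picks, in each state, the action whose robust backup is largest, so the agent maximizes while the model choice minimizes.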

Subjects :: *MARKOV processes
*PROBABILITY theory
*SURROGATE-based optimization
*ITERATIVE methods (Mathematics)
*ALGORITHMS
