An Evolutionary Random Policy Search Algorithm for Solving Markov Decision Processes.

Authors :
Hu, Jiaqiao
Fu, Michael C.
Ramezani, Vahid R.
Marcus, Steven I.
Source :
INFORMS Journal on Computing. Spring 2007, Vol. 19 Issue 2, p161-174. 14p. 1 Diagram, 7 Charts, 3 Graphs.
Publication Year :
2007

Abstract

This paper presents a new randomized search method called evolutionary random policy search (ERPS) for solving infinite-horizon discounted-cost Markov-decision-process (MDP) problems. The algorithm is particularly targeted at problems with large or uncountable action spaces. ERPS approaches a given MDP by iteratively dividing it into a sequence of smaller, random, sub-MDP problems based on information obtained from random sampling of the entire action space and local search. Each sub-MDP is then solved approximately by using a variant of the standard policy-improvement technique, where an elite policy is obtained. We show that the sequence of elite policies converges to an optimal policy with probability one. Some numerical studies are carried out to illustrate the algorithm and compare it with existing procedures. [ABSTRACT FROM AUTHOR]
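The loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the algorithmic details, so everything concrete here is an assumption made for illustration. That covers the toy MDP (`make_toy_mdp`, with actions in [0, 1]), the parameter names `q0` and `radius`, and the elite-selection rule, which lets each state take the action of whichever population policy has the lowest evaluated cost at that state. The population of candidate policies plays the role of the random sub-MDP, and new candidates mix local search around the elite policy with uniform sampling of the whole action space.

```python
import random

def make_toy_mdp(n_states, rng):
    """Hypothetical toy MDP: two random transition matrices and a cost target per state."""
    def random_row():
        w = [rng.random() + 1e-3 for _ in range(n_states)]
        total = sum(w)
        return [x / total for x in w]
    P0 = [random_row() for _ in range(n_states)]
    P1 = [random_row() for _ in range(n_states)]
    target = [rng.random() for _ in range(n_states)]
    return P0, P1, target

def evaluate(policy, mdp, gamma, sweeps=400):
    """Discounted cost of a stationary policy via repeated Bellman backups."""
    P0, P1, target = mdp
    n = len(policy)
    J = [0.0] * n
    for _ in range(sweeps):
        J = [
            # Assumed cost (a - target)^2; transitions interpolate P0 and P1 by the action a.
            (policy[s] - target[s]) ** 2
            + gamma * sum(((1 - policy[s]) * P0[s][t] + policy[s] * P1[s][t]) * J[t]
                          for t in range(n))
            for s in range(n)
        ]
    return J

def elite_policy(pop, values):
    """Per state, copy the action of the population policy with the lowest cost there."""
    elite = []
    for s in range(len(pop[0])):
        best = min(range(len(pop)), key=lambda i: values[i][s])
        elite.append(pop[best][s])
    return elite

def erps_sketch(n_states=5, pop_size=8, n_iters=30, q0=0.5, radius=0.1,
                gamma=0.9, seed=0):
    rng = random.Random(seed)
    mdp = make_toy_mdp(n_states, rng)
    # Initial population: policies with actions sampled uniformly from [0, 1].
    pop = [[rng.random() for _ in range(n_states)] for _ in range(pop_size)]
    history = []
    for _ in range(n_iters):
        values = [evaluate(p, mdp, gamma) for p in pop]
        elite = elite_policy(pop, values)
        history.append(evaluate(elite, mdp, gamma))
        # Next population: keep the elite, then mix local and global sampling.
        pop = [elite]
        for _ in range(pop_size - 1):
            child = []
            for s in range(n_states):
                if rng.random() < q0:
                    # Local search: perturb the elite action, clipped to [0, 1].
                    a = elite[s] + rng.uniform(-radius, radius)
                    child.append(min(1.0, max(0.0, a)))
                else:
                    # Global search: sample the entire action space.
                    child.append(rng.random())
            pop.append(child)
    return elite, history

if __name__ == "__main__":
    elite, history = erps_sketch()
    print("elite policy:", [round(a, 3) for a in elite])
    print("cost at state 0, first vs last iteration:", history[0][0], history[-1][0])
```

Because the elite policy is carried into the next population and the per-state selection never does worse than any policy it selects from, the elite cost in this sketch is non-increasing at every state from one iteration to the next. That is a property of this toy construction only; it does not reproduce the paper's convergence proof.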

Details

Language :
English
ISSN :
1091-9856
Volume :
19
Issue :
2
Database :
Academic Search Index
Journal :
INFORMS Journal on Computing
Publication Type :
Academic Journal
Accession number :
25439976
Full Text :
https://doi.org/10.1287/ijoc.1050.0155