Partially observable Markov decision processes and performance sensitivity analysis

Authors :
Li, Yanjie
Yin, Baoqun
Xi, Hongsheng
Source :
IEEE Transactions on Systems, Man, and Cybernetics--Part B: Cybernetics. Dec, 2008, Vol. 38 Issue 6, p1645, 7 p.
Publication Year :
2008

Abstract

The sensitivity-based optimization of Markov systems has become an increasingly important area. From the perspective of performance sensitivity analysis, policy-iteration algorithms and gradient estimation methods can be directly obtained for Markov decision processes (MDPs). In this correspondence, the sensitivity-based optimization is extended to average reward partially observable MDPs (POMDPs). We derive the performance-difference and performance-derivative formulas of POMDPs. On the basis of the performance-derivative formula, we present a new method to estimate the performance gradients. From the performance-difference formula, we obtain a sufficient optimality condition without the discounted reward formulation. We also propose a policy-iteration algorithm to obtain a nearly optimal finite-state-controller policy.

Index Terms: Finite-state controller (FSC), gradient estimation, partially observable Markov decision processes (POMDPs), policy iteration, sensitivity analysis.

Details

Language :
English
ISSN :
1083-4419
Volume :
38
Issue :
6
Database :
Gale General OneFile
Journal :
IEEE Transactions on Systems, Man, and Cybernetics--Part B: Cybernetics
Publication Type :
Academic Journal
Accession number :
edsgcl.190149441