Partially observable Markov decision processes and performance sensitivity analysis
- Source :
- IEEE Transactions on Systems, Man, and Cybernetics--Part B: Cybernetics. Dec, 2008, Vol. 38 Issue 6, p1645, 7 p.
- Publication Year :
- 2008
Abstract
- The sensitivity-based optimization of Markov systems has become an increasingly important area. From the perspective of performance sensitivity analysis, policy-iteration algorithms and gradient estimation methods can be directly obtained for Markov decision processes (MDPs). In this correspondence, the sensitivity-based optimization is extended to average reward partially observable MDPs (POMDPs). We derive the performance-difference and performance-derivative formulas of POMDPs. On the basis of the performance-derivative formula, we present a new method to estimate the performance gradients. From the performance-difference formula, we obtain a sufficient optimality condition without the discounted reward formulation. We also propose a policy-iteration algorithm to obtain a nearly optimal finite-state-controller policy.
- Index Terms--Finite-state controller (FSC), gradient estimation, partially observable Markov decision processes (POMDPs), policy iteration, sensitivity analysis.
- Subjects :
- Algorithm
- Markov processes -- Analysis
- Algorithms -- Analysis
Details
- Language :
- English
- ISSN :
- 10834419
- Volume :
- 38
- Issue :
- 6
- Database :
- Gale General OneFile
- Journal :
- IEEE Transactions on Systems, Man, and Cybernetics--Part B: Cybernetics
- Publication Type :
- Academic Journal
- Accession number :
- edsgcl.190149441