Back to Search Start Over

State Classification of Time-Nonhomogeneous Markov Chains and Average Reward Optimization of Multi-Chains.

Authors :
Cao, Xi-Ren
Source :
IEEE Transactions on Automatic Control. Oct2016, Vol. 61 Issue 10, p3001-3015. 15p.
Publication Year :
2016

Abstract

In a discrete time nonhomogeneous Markov chain (TNHMC), the states spaces, transition probabilities, and reward functions at different times may be different. In this paper, with the confluencity previously introduced, we show that the states of a TNHMC can be classified into the branching states and a number of classes of confluent states (versus the transient and recurrent states in the time homogeneous case). The optimization of average reward in TNHMC's consisting of a single confluent class (uni-chain) have been addressed in a previous paper by the author. In this paper, we show that with confluencity and the state classification and under some bound conditions, we can obtain the necessary and sufficient conditions for optimal policies of the average reward of TNHMCs consisting of multiple confluent classes (multi-chains). Just like in the uni-chain TNHMC case, the sufficient condition does not need to hold in any “zero frequently visited” time sequence. This “under-selectivity” makes the problem not amenable to dynamic programming. A direct comparison based approach is used to prove the results. The results enhance our understanding of state classification and performance optimization with the notion of confluencity. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
00189286
Volume :
61
Issue :
10
Database :
Academic Search Index
Journal :
IEEE Transactions on Automatic Control
Publication Type :
Periodical
Accession number :
118364213
Full Text :
https://doi.org/10.1109/TAC.2015.2508739