Thompson Sampling for Stochastic Control: The Finite Parameter Case.
- Source :
- IEEE Transactions on Automatic Control. Dec 2017, Vol. 62 Issue 12, p6415-6422. 8p.
- Publication Year :
- 2017
Abstract
- In this paper, we apply Thompson sampling to a class of average reward stochastic control problems with parameter uncertainty. Specifically, we study an average reward stochastic control problem over an infinite horizon in which both the reward and state transition distributions are parameterized by an unknown parameter taking values in a finite space. The main result of this paper is a proof showing that Thompson sampling achieves a worst case average per period regret of O(T^-1), which is asymptotically optimal. [ABSTRACT FROM PUBLISHER]
Details
- Language :
- English
- ISSN :
- 00189286
- Volume :
- 62
- Issue :
- 12
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Automatic Control
- Publication Type :
- Periodical
- Accession number :
- 126586028
- Full Text :
- https://doi.org/10.1109/TAC.2017.2653942