4 results
Search Results
2. Thompson Sampling for Stochastic Control: The Continuous Parameter Case.
- Author
- Banjevic, Dragan and Kim, Michael Jong
- Subjects
- COST control, UNCERTAIN systems, MARKOV processes, SAMPLING errors
- Abstract
Recently, Thompson sampling has been shown to achieve good theoretical performance guarantees for stochastic control problems with parameter uncertainty when the state, control, and parameter spaces are all finite. Much less is known, however, about the performance of Thompson sampling when applied to continuous or more general spaces, which constitute an important class of problems in practice. In this paper, we study Thompson sampling applied to a broad class of average-cost stochastic control problems where the state, control, and parameter spaces are all general measurable spaces. The main contributions of our paper are theoretical performance guarantees for Thompson sampling as measured by, first, expected posterior sampling error and, second, average per-period regret. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
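The core loop behind this abstract — sample a parameter from the posterior, act optimally as if the sample were true, then update the posterior on what is observed — can be illustrated in the simplest finite special case. Below is a minimal sketch on a two-armed Bernoulli bandit with Beta(1, 1) priors; this is a toy instance for intuition only, not the paper's general measurable-space, average-cost setting, and all names and numbers are illustrative:

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Thompson sampling on a Bernoulli bandit: keep a Beta posterior
    per arm, draw one sample per posterior, act as if the samples were
    the true parameters, then update on the observed reward."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    alpha = [1] * n_arms   # Beta(1, 1) priors: successes + 1
    beta = [1] * n_arms    # failures + 1
    total_reward = 0
    for _ in range(horizon):
        # 1. Posterior sampling: one parameter draw per arm.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # 2. Act optimally under the sampled parameters.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # 3. Observe a reward and update the chosen arm's posterior.
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward, alpha, beta

total, alpha, beta = thompson_sampling([0.3, 0.7], horizon=2000)
```

Because the posterior of the better arm concentrates, the sampled parameters (and hence the actions) settle on it; the abstract's two performance measures quantify how fast this happens in the general setting.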
3. Informational Cascades With Nonmyopic Agents.
- Author
- Bistritz, Ilai, Heydaribeni, Nasimeh, and Anastasopoulos, Achilleas
- Subjects
- INFORMATION storage & retrieval systems, PRODUCT quality, EQUILIBRIUM
- Abstract
We consider an environment where players need to decide whether or not to buy a certain product (or adopt a technology). The product is either good or bad, but its true value is unknown to the players. Instead, each player has her own private information on its quality. Each player can observe the previous actions of other players and estimate the quality of the product. A classic result in the literature shows that in similar settings, informational cascades occur, where learning stops for the whole network and players repeat the actions of their predecessors. In contrast to this literature, in this paper, players get more than one opportunity to act. In each turn, a player is chosen uniformly at random from all the players and can decide to buy the product and leave the market or wait. Her utility is the total expected discounted reward, and thus, myopic strategies may not constitute equilibria. We provide a characterization of perfect Bayesian equilibria (PBEs) with forward-looking strategies through a fixed-point equation whose dimensionality grows only quadratically with the number of players. Using this tractable fixed-point equation, we show the existence of a PBE and characterize PBEs with threshold strategies. Based on this characterization, we study informational cascades in two regimes. First, we show that for a discount factor $\delta$ strictly smaller than 1, informational cascades happen with high probability as the number of players $N$ increases. Furthermore, only a small portion of the total information in the system is revealed before a cascade occurs. Second, and more surprisingly, we show that for a fixed $N$, and for a sufficiently large $\delta < 1$, when the product is bad, there exists an equilibrium where an informational cascade can happen only after at least half of the players have revealed their private information, and consequently, the probability of a "bad cascade" where all the players buy the product vanishes exponentially with $N$.
Finally, when $\delta =1$ and the product is bad, there exists an equilibrium where informational cascades do not happen at all. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
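For contrast, the classic myopic benchmark this abstract departs from can be simulated directly: each player acts once, sees the full action history plus one private binary signal, and once the public evidence outweighs any single signal, everyone imitates and learning stops. A minimal sketch under a symmetric prior and the usual follow-your-own-signal tie-break (illustrative only; this is the classic herding model, not the paper's nonmyopic buy-or-wait PBE model):

```python
import random

def simulate_cascade(n_players, signal_accuracy=0.7, good=True, seed=1):
    """Classic sequential herding model: binary adopt/reject actions,
    one private signal each, full observation of past actions.  With a
    symmetric prior and follow-your-own-signal tie-breaking, the myopic
    Bayesian rule reduces to counting the net number of inferred 'good'
    signals in the public history."""
    rng = random.Random(seed)
    net = 0                     # net inferred 'good' signals so far
    actions, cascade_at = [], None
    for t in range(n_players):
        if abs(net) >= 2:       # public evidence outweighs any one signal
            if cascade_at is None:
                cascade_at = t  # learning stops here: a cascade begins
            actions.append(net > 0)   # imitate the herd; signal ignored
        else:
            p_good_signal = signal_accuracy if good else 1 - signal_accuracy
            signal = rng.random() < p_good_signal
            actions.append(signal)    # pre-cascade, the action reveals the signal
            net += 1 if signal else -1
    return actions, cascade_at

# A bad product: with signal accuracy 0.7 a reject cascade is more
# likely, but a wrong 'adopt' cascade can still occur.
actions, cascade_at = simulate_cascade(50, good=False)
```

Note how few private signals are revealed before the cascade locks in; the paper's forward-looking players, who may profitably wait, are exactly what breaks this dynamic for large $\delta$.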
4. Efficient Learning for Selecting Important Nodes in Random Network.
- Author
- Li, Haidong, Xu, Xiaoyun, Peng, Yijie, and Chen, Chun-Hung
- Subjects
- MARKOV processes, TAYLOR'S series, STOCHASTIC processes, PROBABILITY theory
- Abstract
In this article, we consider the problem of selecting important nodes in a random network, where the nodes connect to each other randomly with certain transition probabilities. The node importance is characterized by the stationary probabilities of the corresponding nodes in a Markov chain defined over the network, as in Google's PageRank. Unlike a deterministic network, the transition probabilities in a random network are unknown but can be estimated by sampling. Under a Bayesian learning framework, we apply the first-order Taylor expansion and normal approximation to provide a computationally efficient posterior approximation of the stationary probabilities. In order to maximize the probability of correct selection, we propose a dynamic sampling procedure, which uses not only posterior means and variances of certain interaction parameters between different nodes, but also the sensitivities of the stationary probabilities with respect to each interaction parameter. Numerical experiment results demonstrate the superiority of the proposed sampling procedure. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
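The quantity being learned in this abstract — the stationary distribution of a Markov chain whose transition probabilities are only estimated from samples — can be sketched directly. The toy below estimates each transition row by its Dirichlet(1, ..., 1) posterior mean and recovers the stationary probabilities by power iteration; it is illustrative only and implements neither the paper's Taylor-expansion/normal posterior approximation nor its dynamic sampling allocation, and the 3-node network is hypothetical:

```python
import random

def stationary(P, iters=500):
    """Stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def estimate_transitions(true_P, samples_per_row, seed=2):
    """Posterior-mean estimate of each transition row from sampled jumps,
    using a Dirichlet(1, ..., 1) prior (add-one smoothing)."""
    rng = random.Random(seed)
    n = len(true_P)
    counts = [[1] * n for _ in range(n)]      # prior pseudo-counts
    for i in range(n):
        for _ in range(samples_per_row):
            u, acc = rng.random(), 0.0
            for j, p in enumerate(true_P[i]): # sample next state j from row i
                acc += p
                if u < acc:
                    break
            counts[i][j] += 1
    return [[c / sum(row) for c in row] for row in counts]

# Hypothetical 3-node random network; 'importance' = stationary probability,
# as in PageRank.
true_P = [[0.1, 0.9, 0.0],
          [0.5, 0.0, 0.5],
          [0.0, 0.3, 0.7]]
P_hat = estimate_transitions(true_P, samples_per_row=500)
pi_hat = stationary(P_hat)
best = max(range(len(pi_hat)), key=lambda j: pi_hat[j])
```

The exact stationary distribution of this `true_P` is roughly (0.172, 0.310, 0.517), so node 2 is the important one. The paper's contribution is doing this selection efficiently: approximating the posterior of the stationary probabilities and steering samples toward the transitions that most affect the probability of correct selection, rather than sampling every row equally as above.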
Discovery Service for Jio Institute Digital Library