
Risk-Averse Stochastic Convex Bandit

Authors :
Cardoso, Adrian Rivera
Xu, Huan
Publication Year :
2018
Publisher :
arXiv, 2018.

Abstract

Motivated by applications in clinical trials and finance, we study the problem of online convex optimization (with bandit feedback) where the decision maker is risk-averse. We provide two algorithms to solve this problem. The first is a descent-type algorithm that is easy to implement. The second, which combines the ellipsoid method and a center point device, achieves (almost) optimal regret bounds with respect to the number of rounds. To the best of our knowledge, this is the first attempt to address risk-aversion in the online convex bandit problem.
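
As a rough illustration of the setting described in the abstract, the sketch below runs a zeroth-order descent on a risk objective using only observed loss values (bandit feedback). It is not the authors' algorithm. The abstract does not name a risk measure, so the use of Conditional Value at Risk (CVaR) here is an assumption, and the loss model `sample_loss`, the helper names, and all parameter values are hypothetical.

```python
import math
import random


def empirical_cvar(losses, alpha):
    """Average of the worst (1 - alpha) fraction of the observed losses."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses, reverse=True)
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


def sample_loss(x, rng):
    """Hypothetical environment: a noisy loss that is convex in x.

    The learner never sees this formula, only the returned value.
    """
    xi = rng.gauss(0.0, 0.2)
    return (x - 0.3 - xi) ** 2


def risk_averse_descent(rounds=100, batch=100, alpha=0.9,
                        delta=0.1, step=0.2, seed=0):
    """Projected descent on a finite-difference estimate of the CVaR gradient."""
    rng = random.Random(seed)
    # Shrink the feasible interval [-1, 1] so that x +/- delta stays inside it.
    lo, hi = -1.0 + delta, 1.0 - delta
    x = lo
    tail = []
    for t in range(1, rounds + 1):
        # Bandit feedback: query two nearby points and observe losses only.
        up = empirical_cvar(
            [sample_loss(x + delta, rng) for _ in range(batch)], alpha)
        down = empirical_cvar(
            [sample_loss(x - delta, rng) for _ in range(batch)], alpha)
        grad = (up - down) / (2.0 * delta)
        # Decaying step size, then projection back onto the shrunken interval.
        x = min(hi, max(lo, x - step / math.sqrt(t) * grad))
        if t > rounds // 2:
            tail.append(x)
    # Average the later iterates to smooth out the sampling noise.
    return sum(tail) / len(tail)


if __name__ == "__main__":
    print(empirical_cvar([1.0, 2.0, 3.0, 4.0], 0.5))  # mean of the worst two losses
    print(risk_averse_descent())
```

In this toy model the noise is symmetric around 0.3, so the CVaR minimizer coincides with the mean minimizer at x = 0.3. A skewed or decision-dependent noise model would generally pull the risk-averse solution away from the risk-neutral one.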

Details

Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....5c9cfe1e72f784247a360bdcb742ce8e
Full Text :
https://doi.org/10.48550/arxiv.1810.00737