Bandit Algorithms in Information Retrieval.
- Source :
- Foundations & Trends in Information Retrieval; 2019, Vol. 13 Issue 4, p299-424, 126p
- Publication Year :
- 2019
Abstract
- Bandit algorithms, named after casino slot machines sometimes known as “one-armed bandits”, fall into a broad category of stochastic scheduling problems. In the setting with multiple arms, each arm generates a reward with a given probability. The gambler’s aim is to find the arm producing the highest payoff and then continue playing it in order to accumulate the maximum reward possible. However, having only a limited number of plays, the gambler faces a dilemma: should he play the arm currently known to produce the highest reward, or should he keep trying other arms in the hope of finding a better-paying one? This problem formulation is easily applicable to many real-life scenarios, hence in recent years there has been increased interest in developing bandit algorithms for a range of applications. In information retrieval and recommender systems, bandit algorithms, which are simple to implement and do not require any training data, have been particularly popular in online personalization, online ranker evaluation and search engine optimization. This survey provides a brief overview of bandit algorithms designed to tackle specific issues in information retrieval and recommendation and, where applicable, it describes how they were applied in practice. [ABSTRACT FROM AUTHOR]
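- The exploration/exploitation dilemma described in the abstract can be illustrated with a minimal simulation. The sketch below uses epsilon-greedy, one of the simplest bandit strategies (the survey itself covers many algorithms; this particular example, including the arm probabilities and parameter values, is an illustrative assumption, not taken from the paper): with probability epsilon the gambler explores a random arm, otherwise he exploits the arm with the highest estimated mean reward so far.

```python
import random

def epsilon_greedy(true_probs, n_plays=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy gambler on Bernoulli-reward arms.

    true_probs: hypothetical per-arm reward probabilities (unknown to
    the gambler, who only observes sampled rewards).
    """
    rng = random.Random(seed)
    counts = [0] * len(true_probs)    # number of plays per arm
    values = [0.0] * len(true_probs)  # running mean reward per arm
    total_reward = 0
    for _ in range(n_plays):
        if rng.random() < epsilon:
            # explore: pick a uniformly random arm
            arm = rng.randrange(len(true_probs))
        else:
            # exploit: pick the arm with the best estimated mean so far
            arm = max(range(len(true_probs)), key=lambda a: values[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        # incremental update of the running mean for the chosen arm
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return counts, total_reward

counts, reward = epsilon_greedy([0.2, 0.5, 0.8])
# the best arm (index 2) typically ends up with the most plays
print(counts, reward)
```

Because the estimates concentrate on the truly best arm while exploration never fully stops, the arm with probability 0.8 attracts the bulk of the plays; more refined strategies surveyed in the paper (e.g. UCB-style or Thompson-sampling approaches) manage this trade-off with stronger guarantees.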
Details
- Language :
- English
- ISSN :
- 1554-0677
- Volume :
- 13
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Foundations & Trends in Information Retrieval
- Publication Type :
- Academic Journal
- Accession number :
- 136622695
- Full Text :
- https://doi.org/10.1561/1500000067