Back to Search
Start Over
Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?
- Publication Year :
- 2019
-
Abstract
- Based on differential privacy (DP) framework, we introduce and unify privacy definitions for the multi-armed bandit algorithms. We represent the framework with a unified graphical model and use it to connect privacy definitions. We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions. We leverage a unified proving technique to achieve all the lower bounds. We show that for all of them, the learner's regret is increased by a multiplicative factor dependent on the privacy level $\epsilon$. We observe that the dependency is weaker when we do not require local differential privacy for the rewards.<br />Comment: 27 pages, 1 figure, 2 tables, 14 theorems
- Subjects :
- Computer Science - Machine Learning
Statistics - Machine Learning
62L10, 94A15
Subjects
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.1905.12298
- Document Type :
- Working Paper