Back to Search Start Over

Local Glivenko-Cantelli

Authors :
Cohen, Doron
Kontorovich, Aryeh
Publication Year :
2022

Abstract

If $\mu$ is a distribution over the $d$-dimensional Boolean cube $\{0,1\}^d$, our goal is to estimate its mean $p\in[0,1]^d$ based on $n$ iid draws from $\mu$. Specifically, we consider the empirical mean estimator $\hat p_n$ and study the expected maximal deviation $\Delta_n=\mathbb{E}\max_{j\in[d]}|\hat p_n(j)-p(j)|$. In the classical Universal Glivenko-Cantelli setting, one seeks distribution-free (i.e., independent of $\mu$) bounds on $\Delta_n$. This regime is well-understood: for all $\mu$, we have $\Delta_n\lesssim\sqrt{\log(d)/n}$ up to universal constants, and the bound is tight. Our present work seeks to establish dimension-free (i.e., without an explicit dependence on $d$) estimates on $\Delta_n$, including those that hold for $d=\infty$. As such bounds must necessarily depend on $\mu$, we refer to this regime as {\em local} Glivenko-Cantelli (also known as $\mu$-GC), and are aware of very few previous bounds of this type -- which are either ``abstract'' or quite sub-optimal. Already the special case of product measures $\mu$ is rather non-trivial. We give necessary and sufficient conditions on $\mu$ for $\Delta_n\to0$, and calculate sharp rates for this decay. Along the way, we discover a novel sub-gamma-type maximal inequality for shifted Bernoullis, of independent interest.

Subjects

Subjects :
Mathematics - Probability

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2209.04054
Document Type :
Working Paper