LeanBandits

4 Concentration inequalities

4.1 Sub-Gaussian random variables

Definition 4.1 Sub-Gaussian

A real-valued random variable \(X\) is \(\sigma ^2\)-sub-Gaussian if for any \(\lambda \in \mathbb {R}\),

\begin{align*} \mathbb {E}\left[e^{\lambda X}\right] & \le e^{\frac{\lambda ^2 \sigma ^2}{2}} \: . \end{align*}
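As a standard sanity check (not part of the source development): a centered Gaussian attains the bound with equality. Completing the square in the exponent,

```latex
\begin{align*}
\mathbb{E}\left[e^{\lambda X}\right]
  &= \int_{-\infty}^{\infty} e^{\lambda x}
     \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{x^2}{2 \sigma^2}} \,\mathrm{d}x
   = e^{\frac{\lambda^2 \sigma^2}{2}}
     \qquad \text{for } X \sim \mathcal{N}(0, \sigma^2) \: ,
\end{align*}
```

so \(\mathcal{N}(0, \sigma ^2)\) is \(\sigma ^2\)-sub-Gaussian, and the constant in the definition cannot be improved.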
Lemma 4.2

If \(X\) is \(\sigma _1^2\)-sub-Gaussian and \(Y\) is \(\sigma _2^2\)-sub-Gaussian, and \(X\) and \(Y\) are independent, then \(X + Y\) is \((\sigma _1^2 + \sigma _2^2)\)-sub-Gaussian.

Proof
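A one-line sketch: by independence the moment generating function factorizes, so for any \(\lambda \in \mathbb {R}\),

```latex
\begin{align*}
\mathbb{E}\left[e^{\lambda (X + Y)}\right]
  &= \mathbb{E}\left[e^{\lambda X}\right] \mathbb{E}\left[e^{\lambda Y}\right]
  \le e^{\frac{\lambda^2 \sigma_1^2}{2}} e^{\frac{\lambda^2 \sigma_2^2}{2}}
  = e^{\frac{\lambda^2 (\sigma_1^2 + \sigma_2^2)}{2}} \: .
\end{align*}
```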
Lemma 4.3

For \(X\) a \(\sigma ^2\)-sub-Gaussian random variable, for any \(t \ge 0\),

\begin{align*} \mathbb {P}(X \ge t) & \le \exp \left(- \frac{t^2}{2 \sigma ^2}\right) \: . \end{align*}
Proof
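The standard Chernoff argument, sketched: for any \(\lambda \ge 0\), Markov's inequality applied to \(e^{\lambda X}\) gives

```latex
\begin{align*}
\mathbb{P}(X \ge t)
  \le e^{-\lambda t} \, \mathbb{E}\left[e^{\lambda X}\right]
  \le \exp\left(\frac{\lambda^2 \sigma^2}{2} - \lambda t\right) \: ,
\end{align*}
```

and optimizing at \(\lambda = t / \sigma ^2\) yields the claimed bound.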

Lemma 4.4

Let \(X_1, \ldots , X_n\) be independent random variables such that \(X_i\) is \(\sigma _i^2\)-sub-Gaussian for \(i \in [n]\). Then for any \(t \ge 0\),

\begin{align*} \mathbb {P}\left(\sum _{i=1}^n X_i \ge t\right) & \le \exp \left(- \frac{t^2}{2 \sum _{i=1}^n \sigma _i^2}\right) \: . \end{align*}
Proof
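Sketch: iterating Lemma 4.2, the sum is \(\left(\sum _{i=1}^n \sigma _i^2\right)\)-sub-Gaussian, since

```latex
\begin{align*}
\mathbb{E}\left[e^{\lambda \sum_{i=1}^n X_i}\right]
  = \prod_{i=1}^n \mathbb{E}\left[e^{\lambda X_i}\right]
  \le \exp\left(\frac{\lambda^2 \sum_{i=1}^n \sigma_i^2}{2}\right) \: ,
\end{align*}
```

and Lemma 4.3 then gives the tail bound.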

Lemma 4.5

Let \(X_1, \ldots , X_n\) be random variables such that \(X_i - P[X_i]\) is \(\sigma _{X,i}^2\)-sub-Gaussian for \(i \in [n]\). Let \(Y_1, \ldots , Y_m\) be random variables such that \(Y_i - P[Y_i]\) is \(\sigma _{Y,i}^2\)-sub-Gaussian for \(i \in [m]\). Suppose further that the vectors \(X\) and \(Y\) are independent and that \(\sum _{i = 1}^m P[Y_i] \le \sum _{i = 1}^n P[X_i]\). Then

\begin{align*} \mathbb {P}\left(\sum _{i=1}^m Y_i \ge \sum _{i=1}^n X_i\right) & \le \exp \left(- \frac{\left(\sum _{i = 1}^n P[X_i] - \sum _{i=1}^m P[Y_i]\right)^2}{2 \left(\sum _{i=1}^n \sigma _{X,i}^2 + \sum _{i=1}^m \sigma _{Y,i}^2\right)}\right) \: . \end{align*}
Proof
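Sketch: recentering both sums turns the event into an upper-tail event for a sum of independent centered sub-Gaussian variables,

```latex
\begin{align*}
\mathbb{P}\left(\sum_{i=1}^m Y_i \ge \sum_{i=1}^n X_i\right)
  = \mathbb{P}\left(\sum_{i=1}^m (Y_i - P[Y_i]) + \sum_{i=1}^n (P[X_i] - X_i)
      \ge \sum_{i=1}^n P[X_i] - \sum_{i=1}^m P[Y_i]\right) \: ,
\end{align*}
```

where the threshold on the right-hand side is nonnegative by assumption, so the tail bound for sums of independent sub-Gaussian variables applies.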

4.2 Concentration of the sums of rewards in bandit models

Lemma 4.6

In the array model, for \(t \in \mathbb {N}\), the random variable \((N_{t,a}, S_{t, a})_{a \in \mathcal{A}}\) has the same distribution as \((N_{t,a}, \sum _{s=0}^{N_{t,a}-1} \omega _{2, s, a})_{a \in \mathcal{A}}\).

Proof
Lemma 4.7

In the array model, for \(k \in \mathbb {N}\) and \(a \in \mathcal{A}\), the random variable \(\sum _{s=0}^{k-1} \omega _{2, s, a}\) has the same distribution as a sum of \(k\) i.i.d. random variables with law \(\nu (a)\).

Proof

By definition of \(P_{\mathcal{A}}\) (Definition 3.3).

Lemma 4.8

In the array model, for \(t \in \mathbb {N}\), \(a \in \mathcal{A}\), and a measurable set \(B \subseteq \mathbb {N} \times \mathbb {R}\),

\begin{align*} P_{\mathcal{A}}\left((N_{t,a}, S_{t, a}) \in B\right) & \le \sum _{k {\lt} t, \exists r, (k, r) \in B} \nu (a)^{\otimes \mathbb {N}} \left(\sum _{s=0}^{k-1} \omega _{s} \in \{ x \mid \exists n, (n, x) \in B\} \right) \: . \end{align*}

As a consequence, this also holds for any algorithm-environment sequence.

Proof
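Sketch (under the source's convention that \(N_{t,a} {\lt} t\)): partitioning on the value of the pull count,

```latex
\begin{align*}
P_{\mathcal{A}}\left((N_{t,a}, S_{t, a}) \in B\right)
  &= \sum_{k {\lt} t,\, \exists r,\, (k, r) \in B}
     P_{\mathcal{A}}\left(N_{t,a} = k \wedge (k, S_{t,a}) \in B\right) \: ,
\end{align*}
```

then on each event replace \(S_{t,a}\) by \(\sum _{s=0}^{k-1} \omega _{2, s, a}\) using the distributional identity for \((N_{t,a}, S_{t,a})_{a \in \mathcal{A}}\), drop the constraint \(N_{t,a} = k\), weaken \((k, S) \in B\) to \(S \in \{ x \mid \exists n, (n, x) \in B\} \), and apply Lemma 4.7.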

Lemma 4.9

In the array model, for \(t \in \mathbb {N}\), arms \(a^*, a \in \mathcal{A}\), and \(m_1, m_2 \in \mathbb {N}\),

\begin{align*} P_{\mathcal{A}}\left( N_{t, a^*} = m_1 \wedge N_{t, a} = m_2 \wedge S_{t, a^*} \le S_{t, a}\right) & \le (\otimes _a \nu (a))^{\otimes \mathbb {N}} \left( \sum _{s=0}^{m_1-1} \omega _{s, a^*} \le \sum _{s=0}^{m_2-1} \omega _{s, a} \right) \: . \end{align*}

As a consequence, this also holds for any algorithm-environment sequence.

Proof
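Sketch: by the distributional identity for \((N_{t,a}, S_{t,a})_{a \in \mathcal{A}}\), on the event \(N_{t, a^*} = m_1 \wedge N_{t, a} = m_2\) the sums \(S\) can be replaced by array sums, and dropping the count constraints only enlarges the event:

```latex
\begin{align*}
P_{\mathcal{A}}\left( N_{t, a^*} = m_1 \wedge N_{t, a} = m_2
    \wedge S_{t, a^*} \le S_{t, a}\right)
  &\le P_{\mathcal{A}}\left( \sum_{s=0}^{m_1-1} \omega_{2, s, a^*}
      \le \sum_{s=0}^{m_2-1} \omega_{2, s, a} \right) \: ,
\end{align*}
```

and the right-hand side equals the stated probability under \((\otimes _a \nu (a))^{\otimes \mathbb {N}}\).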

4.2.1 Sub-Gaussian rewards

TODO: extend this to sub-Gaussian with constant other than 1.

Lemma 4.10

Let \(\nu (a)\) be a 1-sub-Gaussian distribution on \(\mathbb {R}\) for each arm \(a \in \mathcal{A}\). Then for any \(m \in \mathbb {N}\),

\begin{align*} (\otimes _a \nu (a))^{\otimes \mathbb {N}} \left( \sum _{s=0}^{m-1} \omega _{s, a^*} \le \sum _{s=0}^{m-1} \omega _{s, a} \right) & \le \exp \left( -m \frac{\Delta _a^2}{4} \right) \end{align*}
Proof
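Sketch, writing \(\Delta _a = \mu _{a^*} - \mu _a \ge 0\) for the gap (notation assumed from the bandit model): each of the \(2m\) centered terms below is 1-sub-Gaussian, so

```latex
\begin{align*}
\mathbb{P}\left( \sum_{s=0}^{m-1} \omega_{s, a^*}
    \le \sum_{s=0}^{m-1} \omega_{s, a} \right)
  &= \mathbb{P}\left( \sum_{s=0}^{m-1} \left( (\omega_{s, a} - \mu_a)
       + (\mu_{a^*} - \omega_{s, a^*}) \right) \ge m \Delta_a \right) \\
  &\le \exp\left( - \frac{(m \Delta_a)^2}{2 \cdot 2m} \right)
   = \exp\left( - m \frac{\Delta_a^2}{4} \right) \: ,
\end{align*}
```

using the tail bound for a sum of \(2m\) independent 1-sub-Gaussian variables.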
Lemma 4.11

Let \(\nu (a)\) be a 1-sub-Gaussian distribution on \(\mathbb {R}\) for each arm \(a \in \mathcal{A}\). Let \(c \ge 0\) be a real number and let \(k\) and \(n\) be natural numbers, with \(k\) positive. Then

\begin{align*} \nu (a)^{\otimes \mathbb {N}} \left( \sum _{s=0}^{k-1} (\omega _{s} - \mu _a) \le - \sqrt{c k \log (n + 1)} \right) & \le \frac{1}{(n + 1)^{c / 2}} \: . \end{align*}

The same upper bound holds for the upper tail:

\begin{align*} \nu (a)^{\otimes \mathbb {N}} \left( \sum _{s=0}^{k-1} (\omega _{s} - \mu _a) \ge \sqrt{c k \log (n + 1)} \right) & \le \frac{1}{(n + 1)^{c / 2}} \: . \end{align*}
Proof
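Sketch: the centered sum is \(k\)-sub-Gaussian by Lemma 4.2, so the tail bound of Lemma 4.3 with \(t = \sqrt{c k \log (n + 1)}\) gives

```latex
\begin{align*}
\nu(a)^{\otimes \mathbb{N}} \left( \sum_{s=0}^{k-1} (\omega_s - \mu_a)
    \ge \sqrt{c k \log (n + 1)} \right)
  &\le \exp\left( - \frac{c k \log (n + 1)}{2 k} \right)
   = \frac{1}{(n + 1)^{c / 2}} \: ,
\end{align*}
```

and the lower tail follows by applying the same bound to \(-(\omega _s - \mu _a)\), which is also 1-sub-Gaussian.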