On the Existence of a Complexity in Fixed Budget Bandit Identification

2 Algorithmic classes

We introduce several algorithm classes for which we will ask whether a complexity exists . We denote by \(\mathcal C_\infty \) the class of all algorithm families.

Static proportions

Static proportions algorithms pull all arms according to a pre-defined allocation vector in the simplex, then return the empirical correct answer. That is, \(\hat{i}_T = i^\star (\hat{\mu }_T)\). Let \(\triangle _K^0 = \{ \omega \in \triangle _K \mid \forall k \in [K], \ \omega _k {\gt} 0\} \). A static proportions algorithm parametrized by \(\omega \in \triangle _K^0\) is any sampling rule which satisfies \(\vert N_{T,k} - T \omega _k \vert \le K\) for all \(k \in [K]\). Such a sampling rule exists: see the tracking procedure of [ GK16 ] , and the bound on the difference \(\vert N_{T,k} - T \omega _k \vert \) for that procedure derived by [ DSK20 ] .

Let \(\mathrm{Alt}(\mu ) = \{ \lambda \in \mathcal D \mid i^\star (\lambda ) \ne i^\star (\mu )\} \) be the set of alternatives to \(\mu \in \mathcal D\). For \(\lambda _k, \mu _k\) two means of distributions in an exponential family, we denote by \(\mathrm{KL}(\lambda _k, \mu _k)\) the Kullback-Leibler divergence between the two corresponding distributions. We give now a bound on the probability of error of static proportions algorithms, which is adapted from [ GJ04 ] .

Theorem 1

Let \(\mathcal A_\omega ^{sp}\) be a static proportions algorithm parametrized by \(\omega \in \triangle _K^0\). For all \(\mu \in \mathcal D\),

\begin{align*} \lim _{T \to +\infty }\rho _{\mu , T}(\mathcal A^{sp}_\omega ) = \Big(\inf _{\lambda \in \mathrm{Alt}(\mu )} \sum _{k \in [K]} \omega _k \mathrm{KL}(\lambda _k, \mu _k) \Big)^{-1} \: . \end{align*}
As a consequence, the oracle difficulty of the class \(\mathcal C^{sp}\) of static proportions algorithms is

\begin{align*} H_{\mathcal C^{sp}}(\mu ) = \inf _{\omega \in \triangle _K^0} \lim _{T \to +\infty }\rho _{\mu , T}(\mathcal A^{sp}_\omega ) = \Big( \max _{\omega \in \triangle _K^0} \inf _{\lambda \in \mathrm{Alt}(\mu )} \sum _{k \in [K]} \omega _k \mathrm{KL}(\lambda _k, \mu _k) \Big)^{-1} \: . \end{align*}

Consistent and exponentially consistent

An algorithm family is said to be consistent [ KCG16 ] if for all \(\mu \in \mathcal D\), \(\lim _{T\to + \infty } p_{\mu , T} = 0\). We denote that class by \(\mathcal C_c\). It is said to be exponentially consistent [ BGS22 ] if for all \(\mu \in \mathcal D\), \(\limsup _{T \to +\infty } \rho _{\mu , T}(\mathcal A) {\lt} +\infty \). We denote that class \(\mathcal C_{ec}\). Consistent algorithms the largest class of algorithm families which are “good everywhere”, in the sense that they eventually get the right answer with high probability, no matter which problem \(\mu \in \mathcal D\) they face. Any exponentially consistent algorithm is consistent: \(\mathcal C_{ec} \subseteq \mathcal C_c\). Static proportions algorithms are exponentially consistent: \(\mathcal C^{sp} \subseteq \mathcal C_{ec}\). Indeed for any \(\omega \in \triangle _K^0\), under Assumption 1 the formula for \(\lim _{T \to + \infty } \rho _{\mu , T}(\mathcal A_\omega ^{sp})\) of Theorem 1 gives a finite value. This proves that \(\mathcal A_\omega ^{sp} \in \mathcal C_{ec}\) for all \(\omega \in \triangle _K^0\). We restricted the static proportions to \(\triangle _K^0\) instead of \(\triangle _K\) to ensure that the algorithms are exponentially consistent.

Bounded difficulty

The approach of most fixed budget papers, which is however often not explicitly stated like this, is to suppose that some function \(H: \mathcal D \to \mathbb {R}\) represents a complexity of the fixed budget identification task and to look for algorithms that have error probability close to \(\exp (-T/H(\mu ))\). Such a function can be for example \(H_1(\mu ) = \sum _{k \ne i^\star (\mu )} (\mu _{i^\star (\mu )} - \mu _k)^{-2}\) for best arm identification. [ KTH22 ] make that approach explicit, in which a possibly arbitrary function \(H\) is considered and where we are interested in the following class.

\begin{align*} \mathcal C(H) & = \{ \mathcal A \mid \exists R \in \mathbb {R}, \: \forall \mu \in \mathcal D, \ \limsup _{T \to \infty }\rho _{\mu , T}(\mathcal A) \le R H(\mu )\} = \{ \mathcal A \mid \sup _{\mu \in \mathcal D} R_{H,\infty }(\mathcal A,\mu ) {\lt} +\infty \} \: . \end{align*}

We don’t allow \(H\) to be infinite in \(\mathcal D\), which means in particular that \(\mathcal C(H) \subseteq \mathcal C_{ec}\) for all \(H\). Of course if \(H\) is chosen badly that class will be empty. The goal of [ KTH22 ] is then to design algorithms which get the smallest maximal difficulty ratio, given an arbitrary function \(H\). They derive a theoretical algorithm for which the ratio approaches a proxy of the lower bound (but which is computationally intractable), and introduce a second heuristic based on neural networks.

Given an algorithm class \(\mathcal C'\), we will consider its oracle difficulty \(H_{\mathcal C'}\) and then the class \(\mathcal C(H_{\mathcal C'})\) of algorithms with bounded difficulty ratio with respect to \(H_{\mathcal C'}\). We denote \(\mathcal C(H_{\mathcal C'})\) by \(\overline{\mathcal C'}\). The class \(\overline{\mathcal C'}\) might not contain \(\mathcal C'\). If \(\mathcal C' \subseteq \mathcal C''\), then from the definition \(\overline{\mathcal C''} \subseteq \overline{\mathcal C'}\). The class of static proportions satisfies \(\mathcal C^{sp} \subseteq \overline{\mathcal C^{sp}}\). The proof is a simple study of the ratio between \(\lim _{T \to + \infty } \rho _{\mu , T}(\mathcal A_\omega ^{sp})\) for different values of \(\omega \). See the proof of Theorem 4 in Section 4.

Within a constant of the uniform allocation

The uniform static proportions algorithm \(\mathcal A_u := \mathcal A_{(1/K, \ldots , 1/K)}^{sp} \in \mathcal C^{sp}\), that allocates an equal number of samples to every arm, is a natural baseline to which we can compare algorithms. We can for example look for algorithms that have a difficulty ratio to the complexity of the uniform allocation which is uniformly bounded on \(\mathcal D\). This is the class \(\overline{\{ \mathcal A_u\} } = \mathcal C(H_{\{ \mathcal A_u\} })\). Since \(\mathcal C^{sp} \subseteq \overline{\mathcal C^{sp}}\) and \(\{ \mathcal A_u\} \subseteq \mathcal C^{sp}\), that class satisfies \(\mathcal C^{sp} \subseteq \overline{\mathcal C^{sp}} \subseteq \overline{\{ \mathcal A_u\} }\) .

Summary

Consistent, exponentially consistent algorithms and the class of algorithm families within a constant of the uniform allocation all contain the static proportions algorithms \(\mathcal C^{sp}\) : \(\mathcal C^{sp} \subseteq \mathcal C_{ec} \subseteq \mathcal C_c\) and \(\mathcal C^{sp} \subseteq \overline{\{ \mathcal A_u\} }\). If we get a lower bound on \(R_{H_{\mathcal C^{sp}}, T}(\mathcal A, \mu )\) for an algorithm family \(\mathcal A\), then it is also a lower bound for the ratio to the difficulty of any of the classes \(\mathcal C_{c}\), \(\mathcal C_{ec}\), \(\overline{\{ \mathcal A_u\} }\).