On the Existence of a Complexity in Fixed Budget Bandit Identification

4 The range of the difficulty ratio

In the asymptotic fixed confidence setting, the complexity of the class of \(\delta \)-correct algorithms is given by the oracle difficulty of static proportions: there is an optimal sampling allocation at each \(\mu \in \mathcal D\), and the best any adaptive algorithm can do is match the performance of that allocation. The fixed confidence analogue of the difficulty ratio would be at least \(1\) for any \(\delta \)-correct algorithm, and exactly \(1\) for TnS. We hence focus on the ratio of fixed budget algorithm families to the oracle difficulty of the class of static proportions algorithms, \(H_{\mathcal C^{sp}}(\mu ) = \left( \max _{\omega \in \triangle _K} \inf _{\lambda \in \mathrm{Alt}(\mu )} \sum _{k=1}^K \omega _k \mathrm{KL}(\lambda _k, \mu _k) \right)^{-1}\). In a general fixed budget identification task \((\mathcal D, \mathcal I, i^\star )\), two related questions remain open:

  • Do static proportions indeed always define oracle algorithms, or could there exist an adaptive algorithm with a better rate everywhere? In technical terms, can we have the inequality \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}},\infty }(\mathcal A, \lambda ) < 1\)? [ODGP21] exhibit a setting close to fixed budget identification in which an adaptive algorithm can indeed beat any static proportions algorithm. However, their objective does not fit into our fixed budget identification framework, and their example uses families of distributions in which the KL can be infinite.

  • For Bernoulli BAI, a lower bound of [CL16] and the upper bound on the Successive Rejects algorithm of [ABM10] together show that for \(H_1\), the sum of inverse squared gaps, the value \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_1,\infty }(\mathcal A, \lambda )\) is of order \(\log K\), hence strictly greater than \(1\) for \(K\) large enough. Do we have the same bound for \(H_{\mathcal C^{sp}}\), and are there problems on which the difficulty ratio can be much larger than \(\log K\)?

We study the possible values of the smallest maximal difficulty ratio over all algorithms: we prove upper and lower bounds on \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}},\infty }(\mathcal A, \lambda ) \) as the task \((\mathcal D, \mathcal I, i^\star )\) varies.
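To make \(H_{\mathcal C^{sp}}\) concrete before bounding the ratio, here is a minimal numerical sketch for Gaussian BAI with known unit variance, a setting in which the inner infimum has a standard closed form (the Gaussian KL is symmetric, so the argument order is immaterial). The function names and the crude simplex search are ours, for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def bai_rate(w, mu, sigma=1.0):
    """inf over Alt(mu) of sum_k w_k KL(lambda_k, mu_k) for Gaussian BAI:
    the closest alternative merges the best arm i* with one challenger j
    at their w-weighted mean, giving the closed form below."""
    i = int(np.argmax(mu))
    return min(
        w[i] * w[j] / (w[i] + w[j]) * (mu[i] - mu[j]) ** 2 / (2 * sigma**2)
        for j in range(len(mu)) if j != i
    )

def H_sp(mu, sigma=1.0):
    """H_{C^sp}(mu) = (max over the simplex of bai_rate)^{-1}, computed by a
    crude numerical search (softmax reparametrization of the simplex)."""
    def neg_rate(z):
        w = np.exp(z - z.max()); w /= w.sum()
        return -bai_rate(w, mu, sigma)
    res = minimize(neg_rate, np.zeros(len(mu)), method="Nelder-Mead",
                   options={"maxiter": 50_000, "fatol": 1e-14, "xatol": 1e-10})
    return 1.0 / -res.fun

mu = np.array([1.0, 0.5, 0.0])
print(H_sp(mu))  # oracle static-proportions difficulty at this instance
```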

4.1 Upper bound

We first prove that \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ) \le K\) on any task \((\mathcal D, \mathcal I, i^\star )\) by showing that uniform sampling is at most a factor \(K\) worse than the oracle static proportions. We then exhibit a task on which there is equality.

Theorem 4

For all \(\omega \in \triangle _K^0\), the static proportions algorithm \(\mathcal A_\omega ^{sp}\) belongs to \(\overline{\mathcal C^{sp}}\) and satisfies \(\sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A_\omega ^{sp}, \lambda ) \le (\min _{j \in [K]} \omega _j)^{-1}\). In particular, for \(\mathcal A_u \in \mathcal C^{sp}\) the uniform sampling algorithm (static proportions with proportion \(1/K\) for each arm), \( \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A_u, \lambda ) \le K \: . \)

Proof

Let \(\omega ^\star (\mu ) \in \triangle _K\) be the oracle static proportions at \(\mu \) and let \(\omega \in \triangle _K^0\). Then for all \(k\), \(\omega _k \ge \omega ^\star _k(\mu ) \min _j \omega _j\) (since \(\omega _k \ge \min _j \omega _j\) and \(\omega ^\star _k(\mu ) \le 1\)) and, using Theorem 1,

\begin{align*} \limsup _{T \to +\infty } \rho _{\mu , T}(\mathcal A_\omega ^{sp}) & \le \frac{1}{\min _{j \in [K]} \omega _j}\left( \inf _{\lambda \in \mathrm{Alt}(\mu )} \sum _{k=1}^K \omega _k^\star (\mu ) \mathrm{KL}(\lambda _k, \mu _k) \right)^{-1} = \frac{1}{\min _{j \in [K]} \omega _j} H_{\mathcal C^{sp}}(\mu ) \: . \end{align*}

We proved that \(\limsup _{T \to +\infty } R_{H_{\mathcal C^{sp}}, T}(\mathcal A_\omega ^{sp}, \mu ) \le (\min _{j \in [K]} \omega _j)^{-1}\) for all \(\mu \in \mathcal D\).
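As a sanity check of Theorem 4, the following sketch compares the oracle static-proportions rate to the rate of uniform sampling on random Gaussian BAI instances, reusing the closed form from the previous sketch (again, all names are ours): the resulting ratio never exceeds \(K\).

```python
import numpy as np
from scipy.optimize import minimize

def bai_rate(w, mu, sigma=1.0):
    # Gaussian BAI closed form for inf over Alt(mu) of sum_k w_k KL(lambda_k, mu_k)
    i = int(np.argmax(mu))
    return min(w[i] * w[j] / (w[i] + w[j]) * (mu[i] - mu[j]) ** 2 / (2 * sigma**2)
               for j in range(len(mu)) if j != i)

K, rng = 5, np.random.default_rng(0)
for _ in range(5):
    mu = rng.normal(size=K)
    def neg_rate(z):
        w = np.exp(z - z.max()); w /= w.sum()
        return -bai_rate(w, mu)
    best = -minimize(neg_rate, np.zeros(K), method="Nelder-Mead",
                     options={"maxiter": 50_000, "fatol": 1e-14}).fun
    # Difficulty ratio of uniform sampling relative to the oracle proportions;
    # Theorem 4 bounds it by (min_j 1/K)^{-1} = K.
    ratio = best / bai_rate(np.full(K, 1.0 / K), mu)
    print(f"ratio = {ratio:.3f}  (bound: K = {K})")
```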

Of course there are tasks for which uniform sampling is not the best algorithm: for Gaussian BAI the Successive Rejects algorithm [ABM10] has a ratio of order \(\log K\) (see also [BGS22]). However, in some identification tasks \(K\) is the best achievable ratio.

Theorem 5

On the Positivity problem, in which we check whether some arm has mean lower than a threshold \(\theta \), \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ) = K\).

This theorem shows that on the Positivity problem, if a class contains the static proportions algorithms then it does not have a complexity. Furthermore, the uniform sampling algorithm is optimal for the criterion \(\sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda )\).

Proof

Let \(\mathcal A\) be any algorithm family. We use Corollary 1 for \(\mu \) a tuple of \(K\) copies of the same distribution with mean \(m > \theta \). Either \(R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \mu ) = + \infty \), in which case the lower bound is obvious, or we can apply the corollary. For \(j \in [K]\), we define \(\lambda ^{(j)}\) identical to \(\mu \) except for \(\lambda ^{(j)}_j = \ell < \theta \). Then \( \max _{j \in [K]} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ^{(j)}) \ge \sum _{j = 1}^K (H_{\mathcal C^{sp}}(\lambda ^{(j)}) \mathrm{KL}(m, \ell ))^{-1}\). Now for all \(j\), a simple computation gives \(H_{\mathcal C^{sp}}(\lambda ^{(j)}) = (\mathrm{KL}(\theta , \ell ))^{-1}\), so that the lower bound is \(K \, \mathrm{KL}(\theta , \ell ) / \mathrm{KL}(m, \ell )\). When \(\ell \) tends to the lower bound of the means in the exponential family, the ratio \(\mathrm{KL}(\theta , \ell ) / \mathrm{KL}(m, \ell )\) tends to \(1\), hence \(\sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ) \ge K\). Theorem 4 gives the matching upper bound.
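To see the limit at the end of the proof concretely, here is a minimal numerical check for Gaussian arms with unit variance, so that \(\mathrm{KL}(x, y) = (x - y)^2/2\) and the lower bound of the means is \(-\infty \); the constants are arbitrary choices of ours.

```python
import numpy as np

kl = lambda x, y: (x - y) ** 2 / 2      # Gaussian KL divergence, unit variance

theta, m, K = 0.0, 1.0, 5               # threshold, common mean m > theta, arms
for ell in [-1.0, -10.0, -100.0, -1000.0]:
    # lambda^{(j)} puts one arm at ell < theta; its oracle samples only that
    # arm, so H(lambda^{(j)}) = KL(theta, ell)^{-1} and the proof's bound is:
    bound = K * kl(theta, ell) / kl(m, ell)
    print(f"ell = {ell:8.1f} -> lower bound {bound:.4f}  (tends to K = {K})")
```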

The proof of Theorem 5 exhibits \(K\) problems, each with a different arm with mean below the threshold, and the oracle algorithm for each of them samples only that arm. The lower bound shows that detecting which arm is below the threshold is harder than the identification task itself, and that no matter the algorithm, it is as bad as uniform sampling on at least one of the problems (but we don’t know which one in advance).

We established that over all identification tasks \((\mathcal D, \mathcal I, i^\star )\), the highest possible value of the quantity \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda )\) is \(K\), and that this value is attained for the Positivity problem.

4.2 Lower bounds

We turn our attention to lower bounds. A natural conjecture is the following: for all fixed budget tasks and all algorithm families, \(\sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ) \ge 1\). If true, then no adaptive algorithm can do better than the static proportions oracle everywhere. It could still have lower error probability on one problem \(\mu \in \mathcal D\), but it would have to be worse somewhere else. First, we prove the conjecture for Gaussian half-space identification (Lemma 4 in Appendix C). In that task, there are two answers and \(i^\star \) takes a different value on each side of a hyperplane. We then extend that result to Gaussian distributions with piecewise linear boundaries between the answer sets.

Theorem 6

Suppose that there is an \(L^2\) ball \(B(\eta , r)\) with center \(\eta \in \mathrm{cl}(\mathcal D)\) and radius \(r > 0\) such that \(i^\star \) takes only two values in \(B(\eta , r)\), say \(i\) and \(j\), and the boundary between \(B(\eta , r) \cap \{ \mu \mid i^\star (\mu ) = i\} \) and \(B(\eta , r) \cap \{ \mu \mid i^\star (\mu ) = j\} \) is the restriction of a hyperplane passing through \(\eta \). Then for Gaussian arms (each with a known but possibly different variance), the lowest maximal difficulty ratio satisfies \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ) \ge 1\).

The idea of the proof is the following: if we consider \(\lambda \in \mathcal D\) close to the center of the ball, then the oracle difficulty \(H_{\mathcal C^{sp}}(\lambda )\) of static proportions for our task is the same as for half-space identification. Then if we choose \(\mu \) even closer to the center, we can apply Theorem 3 to a set \(D(\mu )\) of points for which this equality holds. Up to border effects that disappear as \(\mu \) gets closer to the center, we get the same lower bound as for half-space identification. The full proof is in Appendix C.
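For the half-space building block itself, \(H_{\mathcal C^{sp}}\) admits a closed form: for fixed \(\omega \) the inner infimum is a weighted projection onto the boundary hyperplane, and Cauchy-Schwarz solves the outer maximum. The sketch below cross-checks this computation (our derivation and names, shown for illustration only) against a direct numerical maximization.

```python
import numpy as np
from scipy.optimize import minimize

def halfspace_H(mu, a, b, sigma):
    """Oracle static-proportions difficulty for Gaussian half-space
    identification with boundary {lambda : <a, lambda> = b}: for fixed w,
    inf = (<a, mu> - b)^2 / (2 sum_k a_k^2 sigma_k^2 / w_k); Cauchy-Schwarz
    gives the maximizer w_k proportional to |a_k| sigma_k, hence
    H(mu) = 2 (sum_k |a_k| sigma_k)^2 / (<a, mu> - b)^2."""
    return 2.0 * np.sum(np.abs(a) * sigma) ** 2 / (a @ mu - b) ** 2

rng = np.random.default_rng(1)
K = 4
a = rng.normal(size=K)                      # normal vector of the boundary
sigma = rng.uniform(0.5, 2.0, size=K)       # known per-arm standard deviations
b = 0.0
mu = rng.normal(size=K)
mu += (1.0 - (a @ mu - b)) / (a @ a) * a    # shift mu so that <a, mu> - b = 1

def neg_rate(z):                            # direct maximization over the simplex
    w = np.exp(z - z.max()); w /= w.sum()
    return -(a @ mu - b) ** 2 / (2 * np.sum(a**2 * sigma**2 / w))

best = -minimize(neg_rate, np.zeros(K), method="Nelder-Mead",
                 options={"maxiter": 50_000, "fatol": 1e-14}).fun
print(1.0 / best, halfspace_H(mu, a, b, sigma))  # the two values should agree
```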

The hypothesis of that theorem applies to all examples of fixed budget identification we introduced: BAI, Thresholding bandits and Positivity all have piecewise linear boundaries. More generally, we could extend Theorem 6 to tasks in which the boundary has bounded curvature at some point: we can zoom in on that point and find problems for which we recover the half-space bound. This remark also illustrates the limitation of Theorem 6: it is asymptotic in nature. The proof requires points that are much closer to the center of the ball than the radius: either we need a very large ball (as in BAI when the two best arms have much higher means than the other arms), or we need problems very close to the boundary. It should be possible to extend the theorem to any exponential family by using the fact that the KL is locally quadratic. Again, we would describe the asymptotic behavior of an algorithm family on problems very close to a given boundary point.

The lower bound \(\inf _{\mathcal A \in \mathcal C_\infty } \sup _{\lambda \in \mathcal D} R_{H_{\mathcal C^{sp}}, \infty }(\mathcal A, \lambda ) \ge 1\) shows that if a class \(\mathcal C\) contains \(\mathcal C^{sp}\) and admits a complexity, then that complexity has to be \(H_{\mathcal C^{sp}}\).