On the Existence of a Complexity in Fixed Budget Bandit Identification

Bibliography

ABM10

Jean-Yves Audibert, Sébastien Bubeck, and Rémi Munos. Best arm identification in multi-armed bandits. In Conference on Learning Theory, pages 41–53, 2010.

ACD21

Ayya Alieva, Ashok Cutkosky, and Abhimanyu Das. Robust pure exploration in linear bandits with limited budget. In International Conference on Machine Learning, pages 187–195. PMLR, 2021.

AKG21

Mohammad Javad Azizi, Branislav Kveton, and Mohammad Ghavamzadeh. Fixed-budget best-arm identification in structured bandits. arXiv preprint arXiv:2106.04763, 2021.

AKK\(^{+}\)21

Kaito Ariu, Masahiro Kato, Junpei Komiyama, Kenichiro McAlinn, and Chao Qin. Policy choice and best arm identification: Asymptotic analysis of exploration sampling. arXiv preprint arXiv:2109.08229, 2021.

AKSK22

Alexia Atsidakou, Sumeet Katariya, Sujay Sanghavi, and Branislav Kveton. Bayesian fixed-budget best-arm identification. arXiv preprint arXiv:2211.08572, 2022.

BCB\(^{+}\)12

Sébastien Bubeck and Nicolò Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning, 5(1):1–122, 2012.

BGS22

Antoine Barrier, Aurélien Garivier, and Gilles Stoltz. On best-arm identification with a fixed budget in non-parametric multi-armed bandits. arXiv preprint arXiv:2210.00895, 2022.

BMS09

Sébastien Bubeck, Rémi Munos, and Gilles Stoltz. Pure exploration in multi-armed bandits problems. In Algorithmic Learning Theory (ALT), pages 23–37. Springer, 2009.

CL16

Alexandra Carpentier and Andrea Locatelli. Tight (lower) bounds for the fixed budget best arm identification bandit problem. In Conference on Learning Theory, pages 590–604. PMLR, 2016.

CMC21

James Cheshire, Pierre Ménard, and Alexandra Carpentier. Problem dependent view on structured thresholding bandit problems. In International Conference on Machine Learning, pages 1846–1854. PMLR, 2021.

DK19

Rémy Degenne and Wouter M Koolen. Pure exploration with multiple correct answers. Advances in Neural Information Processing Systems, 32, 2019.

DSK20

Rémy Degenne, Han Shao, and Wouter Koolen. Structure adaptive algorithms for stochastic bandits. In International Conference on Machine Learning, pages 2443–2452. PMLR, 2020.

EDMMM06

Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7(6), 2006.

GGL12

Victor Gabillon, Mohammad Ghavamzadeh, and Alessandro Lazaric. Best arm identification: A unified approach to fixed budget and fixed confidence. Advances in Neural Information Processing Systems, 25, 2012.

GJ04

Peter Glynn and Sandeep Juneja. A large deviations perspective on ordinal optimization. In Proceedings of the 2004 Winter Simulation Conference, volume 1. IEEE, 2004.

GK16

Aurélien Garivier and Emilie Kaufmann. Optimal best arm identification with fixed confidence. In Conference on Learning Theory, pages 998–1027. PMLR, 2016.

GMS19

Aurélien Garivier, Pierre Ménard, and Gilles Stoltz. Explore first, exploit next: The true shape of regret in bandit problems. Mathematics of Operations Research, 44(2):377–399, 2019.

KCG16

Emilie Kaufmann, Olivier Cappé, and Aurélien Garivier. On the complexity of best-arm identification in multi-armed bandit models. The Journal of Machine Learning Research, 17(1):1–42, 2016.

KKG18

Emilie Kaufmann, Wouter M Koolen, and Aurélien Garivier. Sequential test for the lowest mean: From Thompson to Murphy sampling. Advances in Neural Information Processing Systems, 31, 2018.

KKS13

Zohar Karnin, Tomer Koren, and Oren Somekh. Almost optimal exploration in multi-armed bandits. In International Conference on Machine Learning, pages 1238–1246. PMLR, 2013.

KTH22

Junpei Komiyama, Taira Tsuchiya, and Junya Honda. Minimax optimal algorithms for fixed-budget best arm identification. In Advances in Neural Information Processing Systems, 2022.

LGC16

Andrea Locatelli, Maurilio Gutzeit, and Alexandra Carpentier. An optimal algorithm for the thresholding bandit problem. In International Conference on Machine Learning, pages 1690–1698. PMLR, 2016.

LS20

Tor Lattimore and Csaba Szepesvári. Bandit algorithms. Cambridge University Press, 2020.

ODGP21

Reda Ouhamma, Rémy Degenne, Pierre Gaillard, and Vianney Perchet. Online sign identification: Minimization of the number of errors in thresholding bandits. In Advances in Neural Information Processing Systems, 2021.

Qin22

Chao Qin. Open problem: Optimal best arm identification with fixed-budget. In Conference on Learning Theory, pages 5650–5654. PMLR, 2022.

YT22

Junwen Yang and Vincent Tan. Minimax optimal fixed-budget best arm identification in linear bandits. In Advances in Neural Information Processing Systems, 2022.