Bibliography
- ABM10
Jean-Yves Audibert, Sébastien Bubeck, and Rémi Munos. Best arm identification in multi-armed bandits. In COLT, pages 41–53. Citeseer, 2010.
- ACD21
Ayya Alieva, Ashok Cutkosky, and Abhimanyu Das. Robust pure exploration in linear bandits with limited budget. In International Conference on Machine Learning, pages 187–195. PMLR, 2021.
- AKG21
Mohammad Javad Azizi, Branislav Kveton, and Mohammad Ghavamzadeh. Fixed-budget best-arm identification in structured bandits. arXiv preprint arXiv:2106.04763, 2021.
- AKK\(^{+}\)21
Kaito Ariu, Masahiro Kato, Junpei Komiyama, Kenichiro McAlinn, and Chao Qin. Policy choice and best arm identification: Asymptotic analysis of exploration sampling. arXiv preprint arXiv:2109.08229, 2021.
- AKSK22
Alexia Atsidakou, Sumeet Katariya, Sujay Sanghavi, and Branislav Kveton. Bayesian fixed-budget best-arm identification. arXiv preprint arXiv:2211.08572, 2022.
- BCB\(^{+}\)12
Sébastien Bubeck, Nicolo Cesa-Bianchi, et al. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning, 5(1):1–122, 2012.
- BGS22
Antoine Barrier, Aurélien Garivier, and Gilles Stoltz. On best-arm identification with a fixed budget in non-parametric multi-armed bandits. arXiv preprint arXiv:2210.00895, 2022.
- BMS09
Sébastien Bubeck, Rémi Munos, and Gilles Stoltz. Pure exploration in multi-armed bandits problems. In Algorithmic Learning Theory: 20th International Conference, ALT 2009, Porto, Portugal, October 3-5, 2009. Proceedings 20, pages 23–37. Springer, 2009.
- CL16
Alexandra Carpentier and Andrea Locatelli. Tight (lower) bounds for the fixed budget best arm identification bandit problem. In Conference on Learning Theory, pages 590–604. PMLR, 2016.
- CMC21
James Cheshire, Pierre Ménard, and Alexandra Carpentier. Problem dependent view on structured thresholding bandit problems. In International Conference on Machine Learning, pages 1846–1854. PMLR, 2021.
- DK19
Rémy Degenne and Wouter M Koolen. Pure exploration with multiple correct answers. Advances in Neural Information Processing Systems, 32, 2019.
- DSK20
Rémy Degenne, Han Shao, and Wouter Koolen. Structure adaptive algorithms for stochastic bandits. In International Conference on Machine Learning, pages 2443–2452. PMLR, 2020.
- EDMMM06
Eyal Even-Dar, Shie Mannor, Yishay Mansour, and Sridhar Mahadevan. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of machine learning research, 7(6), 2006.
- GGL12
Victor Gabillon, Mohammad Ghavamzadeh, and Alessandro Lazaric. Best arm identification: A unified approach to fixed budget and fixed confidence. Advances in Neural Information Processing Systems, 25, 2012.
- GJ04
Peter Glynn and Sandeep Juneja. A large deviations perspective on ordinal optimization. In Proceedings of the 2004 Winter Simulation Conference, 2004., volume 1. IEEE, 2004.
- GK16
Aurélien Garivier and Emilie Kaufmann. Optimal best arm identification with fixed confidence. In Conference on Learning Theory, pages 998–1027. PMLR, 2016.
- GMS19
Aurélien Garivier, Pierre Ménard, and Gilles Stoltz. Explore first, exploit next: The true shape of regret in bandit problems. Mathematics of Operations Research, 44(2):377–399, 2019.
- KCG16
Emilie Kaufmann, Olivier Cappé, and Aurélien Garivier. On the complexity of best-arm identification in multi-armed bandit models. The Journal of Machine Learning Research, 17(1):1–42, 2016.
- KKG18
Emilie Kaufmann, Wouter M Koolen, and Aurélien Garivier. Sequential test for the lowest mean: From thompson to murphy sampling. Advances in Neural Information Processing Systems, 31, 2018.
- KKS13
Zohar Karnin, Tomer Koren, and Oren Somekh. Almost optimal exploration in multi-armed bandits. In International Conference on Machine Learning, pages 1238–1246. PMLR, 2013.
- KTH22
Junpei Komiyama, Taira Tsuchiya, and Junya Honda. Minimax optimal algorithms for fixed-budget best arm identification. In Advances in Neural Information Processing Systems, 2022.
- LGC16
Andrea Locatelli, Maurilio Gutzeit, and Alexandra Carpentier. An optimal algorithm for the thresholding bandit problem. In International Conference on Machine Learning, pages 1690–1698. PMLR, 2016.
- LS20
Tor Lattimore and Csaba Szepesvári. Bandit algorithms. Cambridge University Press, 2020.
- ODGP21
Reda Ouhamma, Rémy Degenne, Pierre Gaillard, and Vianney Perchet. Online sign identification: Minimization of the number of errors in thresholding bandits. In NeurIPS 2021-35th International Conference on Neural Information Processing Systems, pages 1–25, 2021.
- Qin22
Chao Qin. Open problem: Optimal best arm identification with fixed-budget. In Conference on Learning Theory, pages 5650–5654. PMLR, 2022.
- YT22
Junwen Yang and Vincent Tan. Minimax optimal fixed-budget best arm identification in linear bandits. In Advances in Neural Information Processing Systems, 2022.