Sequential learning
A course about reinforcement learning and bandits.
- Lecture 1 - Reinforcement learning
- Lecture 2 - Dynamic Programming
References
- Bandit Algorithms. Tor Lattimore and Csaba Szepesvari (2019).
- Reinforcement Learning. Richard Sutton and Andrew Barto (2018 edition).
- Algorithms for Reinforcement Learning. Csaba Szepesvari (2009).
- Markov Decision Processes. Martin Puterman (1994).
- Lecture notes of similar courses written by several other researchers: Emilie Kaufmann, Rémi Munos, Alessandro Lazaric and Aurélien Garivier.