Sequential learning
A course about reinforcement learning and bandits.
- Lecture 1 - Reinforcement learning
- Lecture 2 - Dynamic Programming
- Lecture 3 - Reinforcement Learning Algorithms
- Practical Session 1 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, December 8, 11:59 AM CET.
- Lecture 4 - Reinforcement Learning with Function Approximation
- Lecture 4.5 - Summary of the first 4 lectures
- Lecture 5 - Beyond Value-Based Methods
- Practical Session 2 - installation readme and notebook. (You can also use Google Colab.) Deadline: Wednesday, January 10, 15:00 CET.
- Lecture 6 - Multi-armed bandits
- Practical Session 3 - installation readme and notebook. (You can also use Google Colab.) Deadline: Tuesday, January 16, 15:00 CET.
- Lecture 7 - Best arm identification in bandits
- Practical Session 4 - installation readme (not the same as the previous ones!) and notebook. (You can also use Google Colab.) Deadline: Tuesday, January 23, 15:00 CET.
- Lecture 8 - Bandit tools for Reinforcement Learning
References
- Bandit Algorithms. Tor Lattimore and Csaba Szepesvári (2019).
- Reinforcement Learning: An Introduction. Richard Sutton and Andrew Barto (2018 edition).
- Algorithms for Reinforcement Learning. Csaba Szepesvári (2009).
- Markov Decision Processes. Martin Puterman (1994).
- Lecture notes of similar courses written by several colleagues: Emilie Kaufmann, Rémi Munos, Alessandro Lazaric and Aurélien Garivier.