Sequential learning
A course about reinforcement learning and bandits.
- Lecture 1 - Reinforcement learning
- Lecture 2 - Dynamic Programming
- Lecture 3 - Reinforcement Learning Algorithms
- Practical Session 1 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, December 6, 11:59 AM CET.
- Lecture 4 - Reinforcement Learning with Function Approximation
- Lecture 4.5 - Summary of the first four lectures
- Lecture 5 - Beyond Value-Based Methods
- Practical Session 2 - installation readme and notebook. (You can also use Google Colab.) Deadline: Wednesday, January 10, 15:00 CET.
- Lecture 6 - Multi-armed bandits
References
- Bandit Algorithms. Tor Lattimore and Csaba Szepesvári (2019).
- Reinforcement Learning: An Introduction. Richard Sutton and Andrew Barto (2018 edition).
- Algorithms for Reinforcement Learning. Csaba Szepesvári (2009).
- Markov Decision Processes. Martin Puterman (1994).
- Lecture notes of similar courses written by several other researchers: Emilie Kaufmann, Rémi Munos, Alessandro Lazaric and Aurélien Garivier.