Sequential learning
A course about reinforcement learning and bandits.
- Lecture 1 - Reinforcement learning
- Lecture 2 - Dynamic Programming
- Lecture 3 - Reinforcement Learning Algorithms
- Practical Session 1 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, December 12, 11:59 AM CET.
- Lecture 4 - Reinforcement Learning with Function Approximation
- Lecture 4.5 - Summary of the first 4 lectures
- Lecture 5 - Beyond Value-Based Methods
References
- Bandit Algorithms. Tor Lattimore and Csaba Szepesvári (2019).
- Reinforcement Learning: An Introduction. Richard Sutton and Andrew Barto (2018 edition).
- Algorithms for Reinforcement Learning. Csaba Szepesvári (2009).
- Markov Decision Processes. Martin Puterman (1994).
- Lecture notes of similar courses written by several other researchers: Emilie Kaufmann, Rémi Munos, Alessandro Lazaric and Aurélien Garivier.