Sequential Learning

A course about reinforcement learning and bandits.

Lecture 1 - Reinforcement learning
Lecture 2 - Dynamic Programming
Lecture 3 - Reinforcement Learning Algorithms
Practical Session 1 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, December 6, 11:59 AM CET.
Lecture 4 - Reinforcement Learning with Function Approximation
Lecture 4.5 - Summary of the first 4 lectures
Lecture 5 - Beyond Value-Based Methods
Practical Session 2 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, January 10, 15:00 CET.
Lecture 6 - Multi-armed bandits
Practical Session 3 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, January 31, 15:00 CET.
Lecture 7 - Best arm identification in bandits
Practical Session 4 - installation readme (not the same as the previous ones!) and notebook. (You can also use Google Colab.) Deadline: Friday, February 7, 15:00 CET.
Lecture 8 - Bandit tools for reinforcement learning

Bandit Algorithms. Tor Lattimore and Csaba Szepesvári (2019).
Reinforcement Learning. Richard Sutton and Andrew Barto (2018 edition).
Algorithms for Reinforcement Learning. Csaba Szepesvári (2009).
Markov Decision Processes. Martin Puterman (1994).
Lecture notes of similar courses written by several other researchers: Emilie Kaufmann, Rémi Munos, Alessandro Lazaric and Aurélien Garivier.