Sequential Learning

A course about reinforcement learning and bandits.

Lecture 1 - Reinforcement learning
Lecture 2 - Dynamic Programming
Lecture 3 - Reinforcement Learning Algorithms
Practical Session 1 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, December 12, 11:59 AM CET.
Lecture 4 - Reinforcement Learning with Function Approximation
Lecture 4.5 - Summary of the first four lectures
Lecture 5 - Beyond Value-Based Methods
Practical Session 2 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, January 9, 11:59 AM CET.
Lecture 6 - Multi-armed bandits
Lecture 7 - Structured bandits
Practical Session 3 - installation readme and notebook. (You can also use Google Colab.) Deadline: Friday, January 30, 11:59 AM CET.
Lecture 8 - Best arm identification
Lecture 9 - Bandit tools for reinforcement learning
Practical Session 4 - installation readme (not the same as the previous ones!) and notebook. (You can also use Google Colab.) Deadline: Friday, February 13, 11:59 AM CET.

LeanBandits: a Lean project about bandit algorithms. Rémy Degenne and Paulo Rauber.
Bandit Algorithms. Tor Lattimore and Csaba Szepesvári (2019).
Reinforcement Learning: An Introduction. Richard Sutton and Andrew Barto (2018 edition).
Algorithms for Reinforcement Learning. Csaba Szepesvári (2009).
Markov Decision Processes. Martin Puterman (1994).
Lecture notes of similar courses written by several other researchers: Emilie Kaufmann, Rémi Munos, Alessandro Lazaric and Aurélien Garivier.