Documentation

LeanBandits.AlgorithmBuilding

Tools to build bandit algorithms #

noncomputable def Bandits.pullCount' {α : Type u_1} [DecidableEq α] (n : ) (h : (Finset.Iic n)α × ) (a : α) :

Number of pulls of arm a up to (and including) time n.

Equations
Instances For
    noncomputable def Bandits.sumRewards' {α : Type u_1} [DecidableEq α] (n : ) (h : (Finset.Iic n)α × ) (a : α) :

    Sum of rewards of arm a up to (and including) time n.

    Equations
    Instances For
      noncomputable def Bandits.empMean' {α : Type u_1} [DecidableEq α] (n : ) (h : (Finset.Iic n)α × ) (a : α) :

      Empirical mean of arm a at time n.

      Equations
      Instances For
        theorem Bandits.pullCount'_eq_sum {α : Type u_1} [DecidableEq α] (n : ) (h : (Finset.Iic n)α × ) (a : α) :
        pullCount' n h a = s : (Finset.Iic n), if (h s).1 = a then 1 else 0
        theorem Bandits.measurable_pullCount' {α : Type u_1} [DecidableEq α] [MeasurableSpace α] [MeasurableSingletonClass α] (n : ) (a : α) :
        Measurable fun (h : (Finset.Iic n)α × ) => pullCount' n h a
        theorem Bandits.measurable_sumRewards' {α : Type u_1} [DecidableEq α] [MeasurableSpace α] [MeasurableSingletonClass α] (n : ) (a : α) :
        Measurable fun (h : (Finset.Iic n)α × ) => sumRewards' n h a
        theorem Bandits.measurable_empMean' {α : Type u_1} [DecidableEq α] [MeasurableSpace α] [MeasurableSingletonClass α] (n : ) (a : α) :
        Measurable fun (h : (Finset.Iic n)α × ) => empMean' n h a