Documentation

LeanBandits.AlgorithmBuilding

Tools to build bandit algorithms #

theorem measurableSet_isMax {𝓧 : Type u_1} {𝓨 : Type u_2} {α : Type u_3} {m𝓧 : MeasurableSpace 𝓧} { : MeasurableSpace α} [TopologicalSpace α] [LinearOrder α] [OpensMeasurableSpace α] [OrderClosedTopology α] [SecondCountableTopology α] [Countable 𝓨] {f : 𝓧𝓨α} (hf : ∀ (y : 𝓨), Measurable fun (x : 𝓧) => f x y) (y : 𝓨) :
MeasurableSet {x : 𝓧 | ∀ (z : 𝓨), f x z f x y}
theorem exists_isMaxOn' {𝓧 : Type u_1} {𝓨 : Type u_2} {α : Type u_4} [LinearOrder α] [Nonempty 𝓨] [Finite 𝓨] [Encodable 𝓨] (f : 𝓧𝓨α) (x : 𝓧) :
∃ (n : ) (y : 𝓨), n = Encodable.encode y ∀ (z : 𝓨), f x z f x y
noncomputable def measurableArgmax {𝓧 : Type u_1} {𝓨 : Type u_2} {α : Type u_3} {m𝓨 : MeasurableSpace 𝓨} [LinearOrder α] [Nonempty 𝓨] [Finite 𝓨] [Encodable 𝓨] [MeasurableSingletonClass 𝓨] (f : 𝓧𝓨α) [(x : 𝓧) → DecidablePred fun (n : ) => ∃ (y : 𝓨), n = Encodable.encode y ∀ (z : 𝓨), f x z f x y] (x : 𝓧) :
𝓨

A measurable argmax function.

Equations
Instances For
    theorem measurable_measurableArgmax {𝓧 : Type u_1} {𝓨 : Type u_2} {α : Type u_3} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : MeasurableSpace α} [TopologicalSpace α] [LinearOrder α] [OpensMeasurableSpace α] [OrderClosedTopology α] [SecondCountableTopology α] [Nonempty 𝓨] [Finite 𝓨] [Encodable 𝓨] [MeasurableSingletonClass 𝓨] {f : 𝓧𝓨α} [(x : 𝓧) → DecidablePred fun (n : ) => ∃ (y : 𝓨), n = Encodable.encode y ∀ (z : 𝓨), f x z f x y] (hf : ∀ (y : 𝓨), Measurable fun (x : 𝓧) => f x y) :
    theorem isMaxOn_measurableArgmax {𝓧 : Type u_1} {𝓨 : Type u_2} {m𝓨 : MeasurableSpace 𝓨} {α : Type u_4} [LinearOrder α] [Nonempty 𝓨] [Finite 𝓨] [Encodable 𝓨] [MeasurableSingletonClass 𝓨] (f : 𝓧𝓨α) [(x : 𝓧) → DecidablePred fun (n : ) => ∃ (y : 𝓨), n = Encodable.encode y ∀ (z : 𝓨), f x z f x y] (x : 𝓧) (z : 𝓨) :
    f x z f x (measurableArgmax f x)
    noncomputable def Bandits.pullCount' {α : Type u_1} [DecidableEq α] (n : ) (h : { x : // x Finset.Iic n }α × ) (a : α) :

    Number of pulls of arm a up to (and including) time n.

    Equations
    Instances For
      noncomputable def Bandits.sumRewards' {α : Type u_1} [DecidableEq α] (n : ) (h : { x : // x Finset.Iic n }α × ) (a : α) :

      Sum of rewards of arm a up to (and including) time n.

      Equations
      Instances For
        noncomputable def Bandits.empMean' {α : Type u_1} [DecidableEq α] (n : ) (h : { x : // x Finset.Iic n }α × ) (a : α) :

        Empirical mean of arm a at time n.

        Equations
        Instances For
          theorem Bandits.pullCount'_eq_sum {α : Type u_1} [DecidableEq α] (n : ) (h : { x : // x Finset.Iic n }α × ) (a : α) :
          pullCount' n h a = s : { x : // x Finset.Iic n }, if (h s).1 = a then 1 else 0
          theorem Bandits.measurable_pullCount' {α : Type u_1} [DecidableEq α] [MeasurableSpace α] [MeasurableSingletonClass α] (n : ) (a : α) :
          Measurable fun (h : { x : // x Finset.Iic n }α × ) => pullCount' n h a
          theorem Bandits.measurable_sumRewards' {α : Type u_1} [DecidableEq α] [MeasurableSpace α] [MeasurableSingletonClass α] (n : ) (a : α) :
          Measurable fun (h : { x : // x Finset.Iic n }α × ) => sumRewards' n h a
          theorem Bandits.measurable_empMean' {α : Type u_1} [DecidableEq α] [MeasurableSpace α] [MeasurableSingletonClass α] (n : ) (a : α) :
          Measurable fun (h : { x : // x Finset.Iic n }α × ) => empMean' n h a