Round-Robin algorithm

This algorithm chooses each of finitely many actions in round-robin fashion. That is, if there are K actions numbered from 0 to K - 1, then at time n it chooses the action n % K.
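As a rough illustration of the selection rule, the following sketch uses only core Lean 4. The name `rrAction` is hypothetical and is not part of the library; it simply packages `n % K` as an element of `Fin K`.

```lean
-- Hypothetical helper (not a library definition): the action chosen at time `n`.
def rrAction (K : Nat) (hK : 0 < K) (n : Nat) : Fin K :=
  ⟨n % K, Nat.mod_lt n hK⟩

-- Actions chosen at times 0, ..., 6 when K = 3.
#eval (List.range 7).map fun n => (rrAction 3 (by decide) n).val
-- [0, 1, 2, 0, 1, 2, 0]
```

The proof `Nat.mod_lt n hK` plays the role of the elided `⋯` in the statements on this page, under the assumption that the library discharges the bound in a similar way.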

Main definitions

roundRobinAlgorithm: the Round-Robin algorithm.


theorem sum_mod_range {K : ℕ} (hK : 0 < K) (a : Fin K) :

(∑ s ∈ Finset.range K, if ⟨s % K, ⋯⟩ = a then 1 else 0) = 1


theorem sum_mod_range_mul {K : ℕ} (hK : 0 < K) (m : ℕ) (a : Fin K) :

(∑ s ∈ Finset.range (K * m), if ⟨s % K, ⋯⟩ = a then 1 else 0) = m
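The two counting lemmas above can be sanity-checked on small numbers. This is a minimal sketch in core Lean 4 that uses lists in place of `Finset.range`. The helper `countMod` is hypothetical and only mirrors the sums in the statements.

```lean
-- Hypothetical helper: how many `s < n` satisfy `s % K = a`.
def countMod (K a n : Nat) : Nat :=
  ((List.range n).filter fun s => s % K == a).length

#eval countMod 3 1 3        -- 1  (one full cycle, as in `sum_mod_range`)
#eval countMod 3 1 (3 * 4)  -- 4  (four full cycles, as in `sum_mod_range_mul`)
```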


noncomputable def Learning.RoundRobin.nextAction {K : ℕ} (hK : 0 < K) (n : ℕ) :

Fin K

Action chosen by the Round-Robin algorithm at time n + 1. This is action (n + 1) % K.

Equations

Learning.RoundRobin.nextAction hK n = ⟨(n + 1) % K, ⋯⟩


noncomputable def Learning.roundRobinAlgorithm {K : ℕ} (hK : 0 < K) :

Algorithm (Fin K) ℝ

The Round-Robin algorithm: a deterministic algorithm that chooses action n % K at time n.

Equations

Learning.roundRobinAlgorithm hK = Learning.detAlgorithm (fun (n : ℕ) (x : ↥(Finset.Iic n) → Fin K × ℝ) => Learning.RoundRobin.nextAction hK n) ⋯ ⟨0, hK⟩


theorem Learning.RoundRobin.action_zero {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P 0) :

A 0 =ᵐ[P] fun (x : Ω) => ⟨0, hK⟩


theorem Learning.RoundRobin.action_ae_eq_roundRobinNextAction {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (n : ℕ) (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P (n + 1)) :

A (n + 1) =ᵐ[P] fun (x : Ω) => nextAction hK n


theorem Learning.RoundRobin.action_ae_eq {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (n : ℕ) (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P n) :

A n =ᵐ[P] fun (x : Ω) => ⟨n % K, ⋯⟩

The action chosen at time n is the action n % K.


theorem Learning.RoundRobin.pullCount_mul {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (m : ℕ) (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P (K * m - 1)) (a : Fin K) :

pullCount A a (K * m) =ᵐ[P] fun (x : Ω) => m

At time K * m, each action has been chosen exactly m times.
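To see the pull counts concretely, here is a small deterministic simulation in core Lean 4. It assumes that `pullCount A a n` counts the times strictly before `n` at which action `a` was chosen, which matches the ranges in `sum_mod_range_mul` but is not confirmed by this page. The name `pulls` is hypothetical.

```lean
-- Hypothetical stand-in for `pullCount` under round-robin play:
-- number of times `s < n` at which action `a` (i.e. `s % K = a`) is chosen.
def pulls (K a n : Nat) : Nat :=
  ((List.range n).filter fun s => s % K == a).length

-- K = 3, m = 4: at time K * m every action has been chosen m times.
#eval (List.range 3).map fun a => pulls 3 a (3 * 4)  -- [4, 4, 4]

-- At a time that is not a multiple of K the counts differ by at most one.
#eval (List.range 3).map fun a => pulls 3 a 7        -- [3, 2, 2]
```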


theorem Learning.RoundRobin.pullCount_eq_one {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P (K - 1)) (a : Fin K) :

pullCount A a K =ᵐ[P] fun (x : Ω) => 1


theorem Learning.RoundRobin.time_gt_of_pullCount_gt_one {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P (K - 1)) (a : Fin K) :

∀ᵐ (ω : Ω) ∂P, ∀ (n : ℕ), 1 < pullCount A a n ω → K < n


theorem Learning.RoundRobin.pullCount_pos_of_time_ge {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P (K - 1)) :

∀ᵐ (ω : Ω) ∂P, ∀ (n : ℕ), K ≤ n → ∀ (b : Fin K), 0 < pullCount A b n ω


theorem Learning.RoundRobin.pullCount_pos_of_pullCount_gt_one {K : ℕ} {hK : 0 < K} {ν : ProbabilityTheory.Kernel (Fin K) ℝ} [ProbabilityTheory.IsMarkovKernel ν] {Ω : Type u_1} {mΩ : MeasurableSpace Ω} {P : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure P] {A : ℕ → Ω → Fin K} {R : ℕ → Ω → ℝ} [Nonempty (Fin K)] (h : IsAlgEnvSeqUntil A R (roundRobinAlgorithm hK) (stationaryEnv ν) P (K - 1)) (a : Fin K) :

∀ᵐ (ω : Ω) ∂P, ∀ (n : ℕ), 1 < pullCount A a n ω → ∀ (b : Fin K), 0 < pullCount A b n ω
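The last three statements can be checked by brute force on a small instance. This is a sketch in core Lean 4, with K = 3 and times up to 19. It assumes, as a reading of the statements rather than a confirmed fact, that `pullCount A a n` counts pulls strictly before time `n`. The helper `pulls` is hypothetical.

```lean
-- Hypothetical stand-in for `pullCount` under round-robin play.
def pulls (K a n : Nat) : Nat :=
  ((List.range n).filter fun s => s % K == a).length

-- If n ≥ K, every action has a positive count (cf. `pullCount_pos_of_time_ge`).
#eval (List.range 20).all fun n =>
  decide (n < 3) || (List.range 3).all fun a => decide (0 < pulls 3 a n)
-- true

-- If some count exceeds 1, then K < n (cf. `time_gt_of_pullCount_gt_one`).
#eval (List.range 20).all fun n =>
  (List.range 3).all fun a => decide (pulls 3 a n ≤ 1) || decide (3 < n)
-- true
```

Combining the two checks gives the third statement: a count above 1 forces K < n, hence K ≤ n, hence every count is positive.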

Documentation

LeanMachineLearning.SequentialLearning.Algorithms.RoundRobin
