Algorithms and environments #
We define structures for stochastic, sequential algorithms and environments, and the notion of an algorithm-environment sequence, which is a sequence of actions and rewards generated by an algorithm interacting with an environment.
Main definitions #
- Algorithm α R: a stochastic, sequential algorithm.
- Environment α R: a stochastic environment.
- IsAlgEnvSeq A R' alg env P: an algorithm-environment sequence. That is, a sequence of actions A and feedback R' that have the correct conditional distributions to be generated by an algorithm alg interacting with an environment env, defined on a probability space (Ω, P).
- IsAlgEnvSeqUntil A R' alg env P N: A and R' form an algorithm-environment sequence until time N.
Main statements #
- isAlgEnvSeq_unique: the law of the sequence of actions and observations generated by an algorithm-environment pair is unique: it does not depend on the probability space used. If A₁, R₁ and A₂, R₂ are two algorithm-environment sequences generated by the same algorithm-environment pair on probability spaces (Ω, P) and (Ω', P'), then P.map (fun ω n ↦ (A₁ n ω, R₁ n ω)) = P'.map (fun ω n ↦ (A₂ n ω, R₂ n ω)).
Notes #
The ANCHOR comments are used to mark code that appears in the tutorials.
A stochastic, sequential algorithm.
- policy (n : ℕ) : ProbabilityTheory.Kernel (↥(Finset.Iic n) → α × R) α
Policy or sampling rule: distribution of the next action given the history up to time n.
- h_policy (n : ℕ) : ProbabilityTheory.IsMarkovKernel (self.policy n)
- p0 : MeasureTheory.Measure α
Distribution of the first action.
- hp0 : MeasureTheory.IsProbabilityMeasure self.p0
Instances For
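As an illustration, a history-independent algorithm can be built from a single probability measure by using a constant kernel at every step. This is only a sketch: the import path for the Learning library is hypothetical, and constAlgorithm is a name introduced here for illustration.

```lean
import Mathlib.Probability.Kernel.Basic
-- import Learning.AlgEnv  -- hypothetical import path for the `Learning` library

open MeasureTheory ProbabilityTheory

variable {α R : Type*} [MeasurableSpace α] [MeasurableSpace R]

/-- A hypothetical "oblivious" algorithm that ignores the history and draws
every action from the same fixed probability measure `μ`. -/
noncomputable def constAlgorithm (μ : Measure α) [IsProbabilityMeasure μ] :
    Learning.Algorithm α R where
  policy _ := Kernel.const _ μ   -- same distribution regardless of history
  h_policy _ := inferInstance    -- a constant kernel of a probability measure is Markov
  p0 := μ
  hp0 := inferInstance
```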
A stochastic environment.
- feedback (n : ℕ) : ProbabilityTheory.Kernel ((↥(Finset.Iic n) → α × R) × α) R
Distribution of the next observation as a function of the past history and the current action.
- h_feedback (n : ℕ) : ProbabilityTheory.IsMarkovKernel (self.feedback n)
- ν0 : ProbabilityTheory.Kernel α R
Distribution of the first observation given the first action.
- hp0 : ProbabilityTheory.IsMarkovKernel self.ν0
Instances For
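Dually, an environment whose feedback is independent of both the history and the action can be built from a single probability measure on the observation space. Again a sketch under the same assumptions as above (hypothetical import path; constEnvironment is a name introduced here for illustration).

```lean
import Mathlib.Probability.Kernel.Basic
-- import Learning.AlgEnv  -- hypothetical import path for the `Learning` library

open MeasureTheory ProbabilityTheory

variable {α R : Type*} [MeasurableSpace α] [MeasurableSpace R]

/-- A hypothetical environment in which every observation is drawn from the
fixed probability measure `ν`, ignoring the history and the action. -/
noncomputable def constEnvironment (ν : Measure R) [IsProbabilityMeasure ν] :
    Learning.Environment α R where
  feedback _ := Kernel.const _ ν
  h_feedback _ := inferInstance
  ν0 := Kernel.const _ ν
  hp0 := inferInstance
```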
Kernel describing the distribution of the next action-reward pair given the history
up to n.
Equations
- Learning.stepKernel alg env n = (alg.policy n).compProd (env.feedback n)
Instances For
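The composition-product above can be unfolded: given a history h, the step kernel first draws an action from the policy, then an observation from the feedback kernel given the history and that action. A sketch of the resulting measure on α × R, writing κₙ = alg.policy n and ηₙ = env.feedback n:

```latex
(\mathrm{stepKernel}\ n)(h)(S)
  = \int_{\alpha} \eta_n\bigl((h, a)\bigr)\bigl(\{r \mid (a, r) \in S\}\bigr)
      \,\mathrm{d}\bigl(\kappa_n(h)\bigr)(a)
```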
Equations
- ⋯ = ⋯
History of the algorithm-environment sequence up to time n.
Equations
- Learning.IsAlgEnvSeq.hist A R' n ω i = (A (↑i) ω, R' (↑i) ω)
Instances For
An algorithm-environment sequence: a sequence of actions and rewards generated by an algorithm interacting with an environment.
- measurable_A (n : ℕ) : Measurable (A n)
- measurable_R (n : ℕ) : Measurable (R' n)
- hasLaw_action_zero : ProbabilityTheory.HasLaw (fun (ω : Ω) => A 0 ω) alg.p0 P
- hasCondDistrib_reward_zero : ProbabilityTheory.HasCondDistrib (R' 0) (A 0) env.ν0 P
- hasCondDistrib_action (n : ℕ) : ProbabilityTheory.HasCondDistrib (A (n + 1)) (hist A R' n) (alg.policy n) P
- hasCondDistrib_reward (n : ℕ) : ProbabilityTheory.HasCondDistrib (R' (n + 1)) (fun (ω : Ω) => (hist A R' n ω, A (n + 1) ω)) (env.feedback n) P
Instances For
An algorithm-environment sequence until time N: a sequence of actions and rewards generated, up to time N, by an algorithm interacting with an environment.
- measurable_A (n : ℕ) : Measurable (A n)
- measurable_R (n : ℕ) : Measurable (R' n)
- hasLaw_action_zero : ProbabilityTheory.HasLaw (fun (ω : Ω) => A 0 ω) alg.p0 P
- hasCondDistrib_reward_zero : ProbabilityTheory.HasCondDistrib (R' 0) (A 0) env.ν0 P
- hasCondDistrib_action (n : ℕ) (hn : n < N) : ProbabilityTheory.HasCondDistrib (A (n + 1)) (IsAlgEnvSeq.hist A R' n) (alg.policy n) P
- hasCondDistrib_reward (n : ℕ) (hn : n < N) : ProbabilityTheory.HasCondDistrib (R' (n + 1)) (fun (ω : Ω) => (IsAlgEnvSeq.hist A R' n ω, A (n + 1) ω)) (env.feedback n) P
Instances For
Filtration generated by the history up to time n.
Equations
- Learning.IsAlgEnvSeq.filtration hA hR' = { seq := fun (i : ℕ) => MeasurableSpace.comap (Learning.IsAlgEnvSeq.hist A R' i) inferInstance, mono' := ⋯, le' := ⋯ }
Instances For
Filtration generated by the history at time n-1 together with the action at time n.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Measure on the sequence of actions and observations generated by the algorithm/environment.
Equations
- Learning.trajMeasure alg env = ProbabilityTheory.Kernel.trajMeasure (alg.p0.compProd env.ν0) (Learning.stepKernel alg env)
Instances For
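By the Ionescu-Tulcea construction underlying ProbabilityTheory.Kernel.trajMeasure, the finite-dimensional laws of trajMeasure are obtained by chaining the initial distribution with the step kernels. A sketch of the law of the first n + 1 action-observation pairs, writing h_k for the history (a_0, r_0), …, (a_k, r_k):

```latex
\mathrm{d}\mathbb{P}(a_0, r_0, \dots, a_n, r_n)
  = p_0(\mathrm{d}a_0)\,\nu_0(a_0)(\mathrm{d}r_0)
    \prod_{k=0}^{n-1} (\mathrm{stepKernel}\ k)(h_k)(\mathrm{d}a_{k+1}, \mathrm{d}r_{k+1})
```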
Equations
- ⋯ = ⋯
The law of the sequence of actions and observations generated by an algorithm-environment pair is unique: it does not depend on the probability space used.