Documentation

TestingLowerBounds.Divergences.KullbackLeibler.CondKL

Kullback-Leibler divergence #

Main definitions #

Main statements #

theorem ProbabilityTheory.kl_ae_ne_top_iff {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} :
(∀ᵐ (a : α) ∂μ, ProbabilityTheory.kl (κ a) (η a) ) (∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) ∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)

Equivalence between two possible versions of the first condition for the finiteness of the conditional KL divergence, the second version is the preferred one.

theorem ProbabilityTheory.integrable_kl_iff {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h_ac : ∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) :
MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ MeasureTheory.Integrable (fun (a : α) => ∫ (x : β), MeasureTheory.llr (κ a) (η a) xκ a) μ

Equivalence between two possible versions of the second condition for the finiteness of the conditional KL divergence, the first version is the preferred one.

noncomputable def ProbabilityTheory.condKL {α : Type u_1} {mα : MeasurableSpace α} {β : Type u_2} {mβ : MeasurableSpace β} (κ : ProbabilityTheory.Kernel α β) (η : ProbabilityTheory.Kernel α β) (μ : MeasureTheory.Measure α) :

Kullback-Leibler divergence between two kernels κ and η conditional to a measure μ. It is defined as KL(κ, η | μ) := ∫ x, KL(κ x, η x) dμ.

Equations
  • One or more equations did not get rendered due to their size.
Instances For
    theorem ProbabilityTheory.condKL_of_ae_ne_top_of_integrable {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h1 : ∀ᵐ (a : α) ∂μ, ProbabilityTheory.kl (κ a) (η a) ) (h2 : MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ) :
    ProbabilityTheory.condKL κ η μ = (∫ (x : α), (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) xμ)
    theorem ProbabilityTheory.condKL_of_ae_ac_of_ae_integrable_of_integrable {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h_ac : ∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) (h_ae_int : ∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)) (h_int : MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ) :
    ProbabilityTheory.condKL κ η μ = (∫ (x : α), (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) xμ)
    theorem ProbabilityTheory.condKL_of_ae_ac_of_ae_integrable_of_integrable' {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h_ac : ∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) (h_ae_int : ∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)) (h_int : MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ) :
    ProbabilityTheory.condKL κ η μ = (∫ (x : α), (fun (a : α) => ∫ (x : β), MeasureTheory.llr (κ a) (η a) xκ a) xμ)
    @[simp]
    theorem ProbabilityTheory.condKL_of_not_ae_ne_top {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h : ¬∀ᵐ (a : α) ∂μ, ProbabilityTheory.kl (κ a) (η a) ) :
    @[simp]
    theorem ProbabilityTheory.condKL_of_not_ae_ac {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h : ¬∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) :
    @[simp]
    theorem ProbabilityTheory.condKL_of_not_ae_integrable {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h : ¬∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)) :
    @[simp]
    theorem ProbabilityTheory.condKL_of_not_integrable {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h : ¬MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ) :
    @[simp]
    theorem ProbabilityTheory.condKL_of_not_integrable' {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h : ¬MeasureTheory.Integrable (fun (a : α) => ∫ (x : β), MeasureTheory.llr (κ a) (η a) xκ a) μ) :
    theorem ProbabilityTheory.condKL_toReal_of_ae_ac_of_ae_integrable {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} (h_ac : ∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) (h_ae_int : ∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)) :
    (ProbabilityTheory.condKL κ η μ).toReal = ∫ (x : α), (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) xμ
    theorem ProbabilityTheory.condKL_eq_top_iff {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} :
    ProbabilityTheory.condKL κ η μ = (¬∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) (¬∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)) ¬MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ
    theorem ProbabilityTheory.condKL_ne_top_iff {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} :
    ProbabilityTheory.condKL κ η μ (∀ᵐ (a : α) ∂μ, (κ a).AbsolutelyContinuous (η a)) (∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr (κ a) (η a)) (κ a)) MeasureTheory.Integrable (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) μ
    theorem ProbabilityTheory.condKL_ne_top_iff' {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} {η : ProbabilityTheory.Kernel α β} :
    ProbabilityTheory.condKL κ η μ ProbabilityTheory.condKL κ η μ = (∫ (x : α), (fun (a : α) => (ProbabilityTheory.kl (κ a) (η a)).toReal) xμ)
    @[simp]
    @[simp]
    theorem ProbabilityTheory.condKL_zero_right {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {mβ : MeasurableSpace β} {κ : ProbabilityTheory.Kernel α β} (h : ∃ᵐ (a : α) ∂μ, κ a 0) :
    @[simp]
    @[simp]
    theorem ProbabilityTheory.condKL_compProd_meas_eq_top {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {γ : Type u_3} {mβ : MeasurableSpace β} {mγ : MeasurableSpace γ} [MeasurableSpace.CountableOrCountablyGenerated (α × β) γ] [MeasureTheory.SFinite μ] {ξ : ProbabilityTheory.Kernel α β} [ProbabilityTheory.IsSFiniteKernel ξ] {κ : ProbabilityTheory.Kernel (α × β) γ} {η : ProbabilityTheory.Kernel (α × β) γ} [ProbabilityTheory.IsMarkovKernel κ] [ProbabilityTheory.IsMarkovKernel η] :
    ProbabilityTheory.condKL κ η (μ.compProd ξ) = (¬∀ᵐ (a : α) ∂μ, ProbabilityTheory.condKL (κ.snd' a) (η.snd' a) (ξ a) ) ¬MeasureTheory.Integrable (fun (x : α) => (ProbabilityTheory.condKL (κ.snd' x) (η.snd' x) (ξ x)).toReal) μ

    This is to handle the case in condKL_compProd_meas when the lhs is ⊤, in this case the rhs is 'morally' also ⊤, so the equality holds, but actually in Lean the equality is not true, because of how we handle the infinities in the integrals, so we have to make a separate lemma for this case.

    theorem ProbabilityTheory.condKL_compProd_meas {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {γ : Type u_3} {mβ : MeasurableSpace β} {mγ : MeasurableSpace γ} [MeasurableSpace.CountableOrCountablyGenerated (α × β) γ] [MeasureTheory.SFinite μ] {ξ : ProbabilityTheory.Kernel α β} [ProbabilityTheory.IsSFiniteKernel ξ] {κ : ProbabilityTheory.Kernel (α × β) γ} {η : ProbabilityTheory.Kernel (α × β) γ} [ProbabilityTheory.IsMarkovKernel κ] [ProbabilityTheory.IsMarkovKernel η] (h : ProbabilityTheory.condKL κ η (μ.compProd ξ) ) :
    ProbabilityTheory.condKL κ η (μ.compProd ξ) = (∫ (x : α), (ProbabilityTheory.condKL (κ.snd' x) (η.snd' x) (ξ x)).toRealμ)

    The chain rule for the KL divergence.

    theorem ProbabilityTheory.Kernel.integrable_llr_compProd_iff' {α : Type u_1} {mα : MeasurableSpace α} {β : Type u_2} {γ : Type u_3} {mβ : MeasurableSpace β} {mγ : MeasurableSpace γ} [MeasurableSpace.CountableOrCountablyGenerated β γ] {κ₁ : ProbabilityTheory.Kernel α β} {η₁ : ProbabilityTheory.Kernel α β} {κ₂ : ProbabilityTheory.Kernel (α × β) γ} {η₂ : ProbabilityTheory.Kernel (α × β) γ} [ProbabilityTheory.IsFiniteKernel κ₁] [ProbabilityTheory.IsFiniteKernel η₁] [ProbabilityTheory.IsMarkovKernel κ₂] [ProbabilityTheory.IsMarkovKernel η₂] (a : α) (h_ac : ((κ₁.compProd κ₂) a).AbsolutelyContinuous ((η₁.compProd η₂) a)) :
    MeasureTheory.Integrable (MeasureTheory.llr ((κ₁.compProd κ₂) a) ((η₁.compProd η₂) a)) ((κ₁.compProd κ₂) a) MeasureTheory.Integrable (MeasureTheory.llr (κ₁ a) (η₁ a)) (κ₁ a) MeasureTheory.Integrable (fun (b : β) => (ProbabilityTheory.kl (κ₂ (a, b)) (η₂ (a, b))).toReal) (κ₁ a) ∀ᵐ (b : β) ∂κ₁ a, MeasureTheory.Integrable (MeasureTheory.llr (κ₂ (a, b)) (η₂ (a, b))) (κ₂ (a, b))
    theorem ProbabilityTheory.kl_compProd_kernel_of_ae_ac_of_ae_integrable {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {γ : Type u_3} {mβ : MeasurableSpace β} {mγ : MeasurableSpace γ} [MeasurableSpace.CountableOrCountablyGenerated β γ] {κ₁ : ProbabilityTheory.Kernel α β} {η₁ : ProbabilityTheory.Kernel α β} {κ₂ : ProbabilityTheory.Kernel (α × β) γ} {η₂ : ProbabilityTheory.Kernel (α × β) γ} [ProbabilityTheory.IsFiniteKernel κ₁] [ProbabilityTheory.IsFiniteKernel η₁] [ProbabilityTheory.IsMarkovKernel κ₂] [ProbabilityTheory.IsMarkovKernel η₂] (h_ac : ∀ᵐ (a : α) ∂μ, ((κ₁.compProd κ₂) a).AbsolutelyContinuous ((η₁.compProd η₂) a)) (h_ae_int : ∀ᵐ (a : α) ∂μ, MeasureTheory.Integrable (MeasureTheory.llr ((κ₁.compProd κ₂) a) ((η₁.compProd η₂) a)) ((κ₁.compProd κ₂) a)) :
    ∀ᵐ (a : α) ∂μ, (ProbabilityTheory.kl ((κ₁.compProd κ₂) a) ((η₁.compProd η₂) a)).toReal = (ProbabilityTheory.kl (κ₁ a) (η₁ a)).toReal + ∫ (b : β), (ProbabilityTheory.kl (κ₂ (a, b)) (η₂ (a, b))).toRealκ₁ a
    theorem ProbabilityTheory.condKL_compProd_kernel {α : Type u_1} {mα : MeasurableSpace α} {μ : MeasureTheory.Measure α} {β : Type u_2} {γ : Type u_3} {mβ : MeasurableSpace β} {mγ : MeasurableSpace γ} [MeasurableSpace.CountableOrCountablyGenerated (α × β) γ] {κ₁ : ProbabilityTheory.Kernel α β} {η₁ : ProbabilityTheory.Kernel α β} {κ₂ : ProbabilityTheory.Kernel (α × β) γ} {η₂ : ProbabilityTheory.Kernel (α × β) γ} [ProbabilityTheory.IsMarkovKernel κ₁] [ProbabilityTheory.IsMarkovKernel η₁] [ProbabilityTheory.IsMarkovKernel κ₂] [ProbabilityTheory.IsMarkovKernel η₂] [MeasureTheory.SFinite μ] :
    ProbabilityTheory.condKL (κ₁.compProd κ₂) (η₁.compProd η₂) μ = ProbabilityTheory.condKL κ₁ η₁ μ + ProbabilityTheory.condKL κ₂ η₂ (μ.compProd κ₁)
    theorem ProbabilityTheory.MeasurableEquiv.piCongrLeft_apply_apply {ι : Type u_3} {ι' : Type u_4} (e : ι ι') {β : ι'Type u_5} [(i' : ι') → MeasurableSpace (β i')] (x : (i : ι) → β (e i)) (i : ι) :
    (MeasurableEquiv.piCongrLeft (fun (i' : ι') => β i') e) x (e i) = x i
    theorem ProbabilityTheory.Measure.pi_map_piCongrLeft {ι : Type u_3} {ι' : Type u_4} [hι : Fintype ι] [hι' : Fintype ι'] (e : ι ι') {β : ι'Type u_5} [(i : ι') → MeasurableSpace (β i)] (μ : (i : ι') → MeasureTheory.Measure (β i)) [∀ (i : ι'), MeasureTheory.SigmaFinite (μ i)] :
    theorem ProbabilityTheory.kl_pi {ι : Type u_3} [hι : Fintype ι] {β : ιType u_4} [(i : ι) → MeasurableSpace (β i)] [∀ (i : ι), MeasurableSpace.CountablyGenerated (β i)] {μ : (i : ι) → MeasureTheory.Measure (β i)} {ν : (i : ι) → MeasureTheory.Measure (β i)} [∀ (i : ι), MeasureTheory.IsProbabilityMeasure (μ i)] [∀ (i : ι), MeasureTheory.IsProbabilityMeasure (ν i)] :