Lower bounds for hypothesis testing based on information theory

B Radon-Nikodym derivatives and disintegration

Definition 49

Let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels, with either \(\mathcal X\) countable or \(\mathcal{Y}\) countably generated. The Radon-Nikodym derivative of \(\kappa \) with respect to \(\eta \), denoted by \(\frac{d \kappa }{d \eta }\), is a measurable function \(\mathcal X \times \mathcal Y \to \mathbb {R}_{+, \infty }\) with \(\kappa = \frac{d \kappa }{d \eta } \cdot \eta + \kappa _{\perp \eta }\), where for all \(x\), \(\kappa _{\perp \eta }(x) \perp \eta (x)\).

Lemma B.0.1

Let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels, with either \(\mathcal X\) countable or \(\mathcal{Y}\) countably generated. If for some \(f\) and \(\xi \), \(\kappa = f \cdot \eta + \xi \) with \(\xi (x) \perp \eta (x)\) for all \(x\), then for all \(x\), \(f(x, y) = \frac{d \kappa (x)}{d \eta (x)}(y)\) for \(\eta (x)\)-almost all \(y \in \mathcal Y\).

Proof

Let \(f\) and \(\xi \) be such that \(\kappa = f \cdot \eta + \xi \) with \(\xi (x) \perp \eta (x)\) for all \(x\). Then for all \(x \in \mathcal X\), \(\kappa (x) = f(x, \cdot ) \cdot \eta (x) + \xi (x)\) with \(\xi (x) \perp \eta (x)\). By the uniqueness result for the Radon-Nikodym derivative of measures, \(f(x, \cdot ) = \frac{d \kappa (x)}{d \eta (x)}\) \(\eta (x)\)-almost everywhere.

Corollary B.0.2

For all \(x \in \mathcal X\), for \(\eta (x)\)-almost all \(y \in \mathcal Y\), \(\frac{d \kappa }{d \eta }(x, y) = \frac{d \kappa (x)}{d \eta (x)}(y)\).

Proof

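Apply Lemma B.0.1 with \(f = \frac{d \kappa }{d \eta }\) and \(\xi = \kappa _{\perp \eta }\), which satisfy its hypotheses by Definition 49.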
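
As an illustration (not part of the original development, with all numerical values chosen here for concreteness), take \(\mathcal X = \mathcal Y = \{0, 1\}\) and suppose \(\eta (0) = \delta _0\), \(\eta (1) = \frac{1}{2}\delta _0 + \frac{1}{2}\delta _1\), \(\kappa (0) = \frac{1}{2}\delta _0 + \frac{1}{2}\delta _1\) and \(\kappa (1) = \frac{1}{4}\delta _0 + \frac{3}{4}\delta _1\). Under these assumptions, a decomposition as in Definition 49 is

\begin{align*} \kappa (0) & = \frac{1}{2} \cdot \eta (0) + \frac{1}{2}\delta _1 \: , & \frac{1}{2}\delta _1 & \perp \eta (0) \: , \\ \kappa (1) & = f(1, \cdot ) \cdot \eta (1) + 0 \: , & f(1, 0) & = \frac{1}{2} , \ f(1, 1) = \frac{3}{2} \: . \end{align*}

Hence \(\frac{d \kappa }{d \eta }(0, 0) = \frac{1}{2}\), \(\frac{d \kappa }{d \eta }(1, 0) = \frac{1}{2}\), \(\frac{d \kappa }{d \eta }(1, 1) = \frac{3}{2}\), with \(\kappa _{\perp \eta }(0) = \frac{1}{2}\delta _1\) and \(\kappa _{\perp \eta }(1) = 0\). The value at \((0, 1)\) is not determined, since \(\eta (0)(\{ 1\} ) = 0\), which is why the identification with \(\frac{d \kappa (x)}{d \eta (x)}\) can only hold for \(\eta (x)\)-almost all \(y\).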

B.1 Composition-product

Lemma B.1.1

Let \(\mu , \nu \) be two measures on \(\mathcal X\) with \(\mu \ll \nu \) and let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels with \(\kappa (x) \ll \eta (x)\) for \(\nu \)-almost all \(x\), with either \(\mathcal X\) countable or \(\mathcal{Y}\) countably generated. Then for \((\nu \otimes \eta )\)-almost all \((x, y)\), \(\frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )}(x,y) = \frac{d\mu }{d\nu }(x)\frac{d \kappa }{d \eta }(x,y)\). This implies that the equality is true for \(\nu \)-almost all \(x\), for \(\eta (x)\)-almost all \(y\).

Proof

It is sufficient to prove that the integrals of the two functions are equal on all product sets \(s \times t\) with \(s\) and \(t\) measurable, since these sets form a \(\pi \)-system generating the product \(\sigma \)-algebra. The computation of the first integral is immediate:

\begin{align*} \int _{p \in s \times t}\frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )}(p) \partial (\nu \otimes \eta ) & = (\mu \otimes \kappa )(s \times t) \: . \end{align*}

The second one uses Corollary B.0.2.

\begin{align*} \int _{p \in s \times t}\frac{d\mu }{d\nu }(p_X) \frac{d \kappa }{d \eta }(p) \partial (\nu \otimes \eta ) & = \int _{x \in s} \int _{y \in t} \frac{d\mu }{d\nu }(x) \frac{d \kappa }{d \eta }(x,y) \partial \eta (x) \partial \nu \\ & = \int _{x \in s} \frac{d\mu }{d\nu }(x) \int _{y \in t} \frac{d \kappa (x)}{d \eta (x)}(y) \partial \eta (x) \partial \nu \\ & = \int _{x \in s} \frac{d\mu }{d\nu }(x) \kappa (x)(t) \partial \nu \\ & = \int _{x \in s} \kappa (x)(t) \partial \mu \\ & = (\mu \otimes \kappa )(s \times t) \: . \end{align*}
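
A quick finite sanity check of this product formula may help (an illustration only, with values chosen here). Take \(\mathcal X = \mathcal Y = \{ 0, 1\} \), \(\nu = \frac{1}{2}\delta _0 + \frac{1}{2}\delta _1\), \(\mu = \frac{1}{4}\delta _0 + \frac{3}{4}\delta _1\), and suppose \(\eta (1) = \frac{1}{2}\delta _0 + \frac{1}{2}\delta _1\) and \(\kappa (1) = \frac{1}{3}\delta _0 + \frac{2}{3}\delta _1\). On a finite space the derivative at a point of positive \((\nu \otimes \eta )\)-mass is the ratio of the point masses, so at \((x, y) = (1, 1)\):

\begin{align*} \frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )}(1,1) & = \frac{\mu (\{ 1\} ) \, \kappa (1)(\{ 1\} )}{\nu (\{ 1\} ) \, \eta (1)(\{ 1\} )} = \frac{\frac{3}{4} \cdot \frac{2}{3}}{\frac{1}{2} \cdot \frac{1}{2}} = 2 \: , \\ \frac{d\mu }{d\nu }(1) \frac{d \kappa }{d \eta }(1,1) & = \frac{3/4}{1/2} \cdot \frac{2/3}{1/2} = \frac{3}{2} \cdot \frac{4}{3} = 2 \: . \end{align*}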
Lemma B.1.2

Let \(\mu , \nu \) be two measures on \(\mathcal X\) and let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels. Let \(\mu ' = \left(\frac{d \mu }{d \nu }\right) \cdot \nu \) and \(\kappa ' = \left(\frac{d \kappa }{d \eta }\right) \cdot \eta \). Then for \((\nu \otimes \eta )\)-almost all \(z\), \(\frac{d (\mu ' \otimes \kappa ')}{d (\nu \otimes \eta )}(z) = \frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )}(z)\).

Proof
Lemma B.1.3

Let \(\mu , \nu \) be two measures on \(\mathcal X\) and let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels, with either \(\mathcal X\) countable or \(\mathcal{Y}\) countably generated. Let \(\mu ' = \left(\frac{d \mu }{d \nu }\right) \cdot \nu \) and \(\kappa ' = \left(\frac{d \kappa }{d \eta }\right) \cdot \eta \). Then for \((\nu \otimes \eta )\)-almost all \((x, y)\), \(\frac{d\mu '}{d\nu }(x)\frac{d \kappa '}{d \eta }(x,y) = \frac{d\mu }{d\nu }(x)\frac{d \kappa }{d \eta }(x,y)\).

Proof

Corollary B.1.4

Let \(\mu , \nu \in \mathcal M(\mathcal X)\) and let \(\kappa : \mathcal X \rightsquigarrow \mathcal Y\) be a finite kernel. Then for \((\nu \otimes \kappa )\)-almost all \((x, y)\), \(\frac{d (\mu \otimes \kappa )}{d (\nu \otimes \kappa )}(x,y) = \frac{d\mu }{d\nu }(x)\).

Proof

TODO: wlog \(\mu \ll \nu \), and we can consider products of sets.

\begin{align*} \int _{p \in s \times t} \frac{d (\mu \otimes \kappa )}{d (\nu \otimes \kappa )}(p) \partial (\nu \otimes \kappa ) & = (\mu \otimes \kappa ) (s \times t) \: . \end{align*}
\begin{align*} \int _{p \in s \times t} \frac{d \mu }{d \nu }(p_1) \partial (\nu \otimes \kappa ) & = \int _{x \in s} \int _{y \in t} \frac{d \mu }{d \nu }(x) \partial \kappa (x) \partial \nu \\ & = \int _{x \in s} \frac{d \mu }{d \nu }(x) \kappa (x)(t) \partial \nu \\ & = \int _{x \in s} \kappa (x)(t) \partial \mu \\ & = (\mu \otimes \kappa ) (s \times t) \: . \end{align*}

Let \(\mu , \nu \) be two measures on \(\mathcal X\) and let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels, with either \(\mathcal X\) countable or \(\mathcal{Y}\) countably generated. Then for \((\nu \otimes \eta )\)-almost all \((x, y)\), \(\frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )}(x,y) = \frac{d\mu }{d\nu }(x)\frac{d \kappa }{d \eta }(x,y)\). This implies that the equality is true for \(\nu \)-almost all \(x\), for \(\eta (x)\)-almost all \(y\).

Proof

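By Lemma B.1.2, \(\frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )} = \frac{d (\mu ' \otimes \kappa ')}{d (\nu \otimes \eta )}\) \((\nu \otimes \eta )\)-almost everywhere, where \(\mu ' = \left(\frac{d \mu }{d \nu }\right) \cdot \nu \) and \(\kappa ' = \left(\frac{d \kappa }{d \eta }\right) \cdot \eta \) are defined as in the previous lemmas. Since \(\mu ' \ll \nu \) and \(\kappa '(x) \ll \eta (x)\) for all \(x\), B.1.1 applies to \(\mu ', \kappa '\) and gives the product \(\frac{d\mu '}{d\nu }(x)\frac{d \kappa '}{d \eta }(x,y)\), which equals \(\frac{d\mu }{d\nu }(x)\frac{d \kappa }{d \eta }(x,y)\) almost everywhere by Lemma B.1.3.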

Corollary B.1.6

Let \(\mu \) be a measure on \(\mathcal X\) and let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be two finite kernels, with either \(\mathcal X\) countable or \(\mathcal{Y}\) countably generated. Then for \((\mu \otimes \eta )\)-almost all \((x, y)\), \(\frac{d (\mu \otimes \kappa )}{d (\mu \otimes \eta )}(x,y) = \frac{d \kappa }{d \eta }(x,y)\).

Proof

B.2 Composition

Lemma B.2.1

Let \(\mu , \nu \in \mathcal M(\mathcal X)\) with \(\mu \ll \nu \), \(g : \mathcal X \to \mathcal Y\) a measurable function and denote by \(g^* \mathcal Y\) the comap of the \(\sigma \)-algebra on \(\mathcal Y\) by \(g\). Then \(\nu \)-almost everywhere,

\begin{align*} \frac{d g_*\mu }{d g_*\nu }(g(x)) = \nu \left[ \frac{d \mu }{d \nu } \mid g^* \mathcal Y\right](x) \: . \end{align*}
Proof

We show that the integrals of the two functions agree on all \(g^* \mathcal Y\)-measurable sets. It suffices to show equality on all sets of \(\mathcal X\) of the form \(g^{-1}(t)\) for \(t\) a measurable set of \(\mathcal Y\). TODO: also show measurability.

\begin{align*} \int _{x \in g^{-1}(t)}\frac{d g_*\mu }{d g_*\nu }(g(x)) \partial \nu & = \int _{y \in t}\frac{d g_*\mu }{d g_*\nu }(y) \partial (g_*\nu ) \\ & = g_*\mu (t) \: . \end{align*}
\begin{align*} \int _{x \in g^{-1}(t)}\nu \left[ \frac{d \mu }{d \nu } \mid g^* \mathcal Y\right](x) \partial \nu & = \int _{x \in g^{-1}(t)}\frac{d \mu }{d \nu }(x) \partial \nu \\ & = \mu (g^{-1}(t)) \\ & = g_*\mu (t) \: . \end{align*}
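
The identity can be checked by hand on a small example (an illustration with values chosen here, not taken from the original). Let \(\mathcal X = \{ 0, 1, 2\} \), \(\mathcal Y = \{ a, b\} \), \(g(0) = g(1) = a\), \(g(2) = b\), with \(\nu \) uniform on \(\mathcal X\) and \(\mu = \frac{1}{6}\delta _0 + \frac{1}{3}\delta _1 + \frac{1}{2}\delta _2\), so that \(\frac{d \mu }{d \nu }\) takes the values \(\frac{1}{2}, 1, \frac{3}{2}\). Under these assumptions,

\begin{align*} \frac{d g_*\mu }{d g_*\nu }(a) & = \frac{1/6 + 1/3}{2/3} = \frac{3}{4} \: , & \frac{d g_*\mu }{d g_*\nu }(b) & = \frac{1/2}{1/3} = \frac{3}{2} \: , \\ \nu \left[ \frac{d \mu }{d \nu } \mid g^* \mathcal Y\right](0) & = \frac{\frac{1}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot 1}{2/3} = \frac{3}{4} \: , & \nu \left[ \frac{d \mu }{d \nu } \mid g^* \mathcal Y\right](2) & = \frac{3}{2} \: . \end{align*}

The conditional expectation is constant on the fibres \(g^{-1}(a) = \{ 0, 1\} \) and \(g^{-1}(b) = \{ 2\} \), and on each fibre it matches the derivative of the pushforwards evaluated at \(g(x)\).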
Lemma B.2.2

Let \(\mu , \nu \in \mathcal M(\mathcal X)\) with \(\mu \ll \nu \) and let \(\kappa , \eta : \mathcal X \rightsquigarrow \mathcal Y\) be finite kernels with \(\kappa (x) \ll \eta (x)\) for \(\nu \)-almost all \(x\). Let \(\mathcal B\) be the \(\sigma \)-algebra on \(\mathcal X \times \mathcal Y\) obtained by taking the comap of the \(\sigma \)-algebra of \(\mathcal Y\) by the projection. Then for \((\nu \otimes \eta )\)-almost every \((x,y)\),

\begin{align*} \frac{d(\kappa \circ \mu )}{d(\eta \circ \nu )}(y) & = (\nu \otimes \eta )\left[ \frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )} \mid \mathcal B \right](x,y) \: . \end{align*}
Proof

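Let \(\pi _Y : \mathcal X \times \mathcal Y \to \mathcal Y\) be the projection \(\pi _Y(x,y) = y\). Note that \(\kappa \circ \mu = \pi _{Y*}(\mu \otimes \kappa )\) and similarly \(\eta \circ \nu = \pi _{Y*}(\nu \otimes \eta )\). The assumptions give \(\mu \otimes \kappa \ll \nu \otimes \eta \), and \(\mathcal B\) is the comap of the \(\sigma \)-algebra on \(\mathcal Y\) by \(\pi _Y\). By Lemma B.2.1 applied to \(g = \pi _Y\), \((\nu \otimes \eta )\)-almost everywhere,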

\begin{align*} \frac{d(\kappa \circ \mu )}{d(\eta \circ \nu )}(y) & = \frac{d \pi _{Y*}(\mu \otimes \kappa )}{d \pi _{Y*}(\nu \otimes \eta )}(\pi _Y((x,y))) \\ & = (\nu \otimes \eta )\left[ \frac{d (\mu \otimes \kappa )}{d (\nu \otimes \eta )} \mid \mathcal B\right](x,y) \: . \end{align*}
Lemma B.2.3 [ Csi63 ]

Let \(\mu , \nu \in \mathcal M(\mathcal X)\) with \(\mu \ll \nu \) and let \(\kappa : \mathcal X \rightsquigarrow \mathcal Y\) be a finite kernel. Let \(\mathcal B\) be the sigma-algebra on \(\mathcal X \times \mathcal Y\) obtained by taking the comap of the sigma-algebra of \(\mathcal Y\) by the projection. Then for \((\nu \otimes \kappa )\)-almost every \((x,y)\),

\begin{align*} \frac{d(\kappa \circ \mu )}{d(\kappa \circ \nu )}(y) & = (\nu \otimes \kappa )\left[ (x, y) \mapsto \frac{d \mu }{d \nu }(x) \mid \mathcal B \right](x,y) \: . \end{align*}
Proof

Apply Lemma B.2.2 to get, \((\nu \otimes \kappa )\)-almost everywhere,

\begin{align*} \frac{d(\kappa \circ \mu )}{d(\kappa \circ \nu )}(y) & = (\nu \otimes \kappa )\left[ \frac{d (\mu \otimes \kappa )}{d (\nu \otimes \kappa )} \mid \mathcal B\right](x,y) \: . \end{align*}

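Finally, by Corollary B.1.4, \(\frac{d (\mu \otimes \kappa )}{d (\nu \otimes \kappa )}(x,y) = \frac{d \mu }{d \nu }(x)\) for \((\nu \otimes \kappa )\)-almost all \((x,y)\), so the two conditional expectations agree almost everywhere.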
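
For a concrete instance (an illustration with values chosen here), take \(\mathcal X = \mathcal Y = \{ 0, 1\} \), \(\nu = \frac{1}{2}\delta _0 + \frac{1}{2}\delta _1\), \(\mu = \frac{1}{4}\delta _0 + \frac{3}{4}\delta _1\), \(\kappa (0) = \frac{3}{4}\delta _0 + \frac{1}{4}\delta _1\) and \(\kappa (1) = \frac{1}{4}\delta _0 + \frac{3}{4}\delta _1\). Then \(\kappa \circ \mu = \frac{3}{8}\delta _0 + \frac{5}{8}\delta _1\), \(\kappa \circ \nu = \frac{1}{2}\delta _0 + \frac{1}{2}\delta _1\) and \(\frac{d \mu }{d \nu }\) takes the values \(\frac{1}{2}, \frac{3}{2}\). Under these assumptions, both sides can be computed directly:

\begin{align*} \frac{d(\kappa \circ \mu )}{d(\kappa \circ \nu )}(0) & = \frac{3/8}{1/2} = \frac{3}{4} \: , & \frac{d(\kappa \circ \mu )}{d(\kappa \circ \nu )}(1) & = \frac{5/8}{1/2} = \frac{5}{4} \: , \\ (\nu \otimes \kappa )\left[ \frac{d \mu }{d \nu } \mid \mathcal B \right](x, 0) & = \frac{\frac{3}{8} \cdot \frac{1}{2} + \frac{1}{8} \cdot \frac{3}{2}}{1/2} = \frac{3}{4} \: , & (\nu \otimes \kappa )\left[ \frac{d \mu }{d \nu } \mid \mathcal B \right](x, 1) & = \frac{\frac{1}{8} \cdot \frac{1}{2} + \frac{3}{8} \cdot \frac{3}{2}}{1/2} = \frac{5}{4} \: . \end{align*}

Here \(\frac{3}{8}, \frac{1}{8}\) and \(\frac{1}{8}, \frac{3}{8}\) are the \((\nu \otimes \kappa )\)-masses of the points \((0, y), (1, y)\) for \(y = 0\) and \(y = 1\) respectively.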

B.3 Sigma-algebras

For \(\mathcal A\) a sub-\(\sigma \)-algebra and \(\mu \) a measure, we write \(\mu _{| \mathcal A}\) for the restriction of \(\mu \) to \(\mathcal A\).

Lemma B.3.1

Let \(\mu , \nu \) be two finite measures on \(\mathcal X\) with \(\mu \ll \nu \) and let \(\mathcal A\) be a sub-\(\sigma \)-algebra of \(\mathcal X\). Then \(\frac{d \mu _{| \mathcal A}}{d \nu _{| \mathcal A}}\) is \(\nu _{| \mathcal A}\)-almost everywhere (hence also \(\nu \)-a.e.) equal to \(\nu \left[ \frac{d \mu }{d \nu } \mid \mathcal A\right]\).

Proof

TODO: the current Lean proof is not the proof presented here (but a direct computation).

The restriction \(\mu _{| \mathcal A}\) is the map of \(\mu \) by the identity, seen as a function from \(\mathcal X\) with its \(\sigma \)-algebra to \(\mathcal X\) with \(\mathcal{A}\). We can thus apply Lemma B.2.1.
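
As a sanity check in the simplest case (an illustration, assuming \(\nu (\mathcal X) > 0\)), let \(\mathcal A = \{ \emptyset , \mathcal X\} \) be the trivial \(\sigma \)-algebra. Every \(\mathcal A\)-measurable function is constant, and the two constants coincide:

\begin{align*} \frac{d \mu _{| \mathcal A}}{d \nu _{| \mathcal A}} & = \frac{\mu (\mathcal X)}{\nu (\mathcal X)} \: , & \nu \left[ \frac{d \mu }{d \nu } \mid \mathcal A\right] & = \frac{1}{\nu (\mathcal X)} \int _{x \in \mathcal X} \frac{d \mu }{d \nu }(x) \partial \nu = \frac{\mu (\mathcal X)}{\nu (\mathcal X)} \: , \end{align*}

where the last equality uses \(\mu \ll \nu \).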

Lemma B.3.2

Let \(\mu , \nu \in \mathcal M(\mathcal X)\) with \(\mu \ll \nu \), \(g : \mathcal X \to \mathcal Y\) a measurable function and denote by \(g^* \mathcal Y\) the comap of the \(\sigma \)-algebra on \(\mathcal Y\) by \(g\). Then \(\nu \)-almost everywhere,

\begin{align*} \frac{d g_*\mu }{d g_*\nu }(g(x)) = \frac{d \mu _{| g^* \mathcal Y}}{d \nu _{| g^* \mathcal Y}}(x) \: . \end{align*}
Proof

Combine Lemma B.2.1 and Lemma B.3.1.