BIOSTAT 830 Problem Set 1 Solutions

Zhenke Wu

zhenkewu@umich.edu

2016-11-27

Theory

Exercise 2.5, Koller and Friedman (2009)

Let \(\mathbf{X}, \mathbf{Y}, \mathbf{Z}\) be three disjoint subsets of variables such that \(\mathcal{X}=\mathbf{X}\cup \mathbf{Y}\cup \mathbf{Z}\). Prove that \(P\) satisfies \(\mathbf{X}\perp \mathbf{Y}\mid \mathbf{Z}\) if and only if we can write \(P\) in the form: \[P(\mathcal{X})=\phi_1(\mathbf{X},\mathbf{Z})\phi_2(\mathbf{Y}, \mathbf{Z}).\]

[Proof] 1. Sufficiency.

\[ \begin{aligned} P(\mathbf{X},\mathbf{Y}\mid \mathbf{Z}) & = \frac{P(\mathcal{X})}{P(\mathbf{Z})}=\frac{P(\mathcal{X})}{\int P(\mathbf{x},\mathbf{y},\mathbf{Z})d\mathbf{x}d\mathbf{y}} \leftarrow (\text{by definition of conditional density})\\ & = \frac{P(\mathcal{X})}{\int \phi_1(\mathbf{x},\mathbf{Z})d\mathbf{x}\int \phi_2(\mathbf{y},\mathbf{Z})d\mathbf{y}}\leftarrow (\text{by the factorization assumption})\\ & = \frac{\phi_1(\mathbf{X},\mathbf{Z})}{\int \phi_1(\mathbf{x},\mathbf{Z})d\mathbf{x}}\cdot \frac{\phi_2(\mathbf{Y}, \mathbf{Z})}{\int \phi_2(\mathbf{y},\mathbf{Z})d\mathbf{y}}, \end{aligned} \] so the conditional density factorizes into a function of \((\mathbf{X},\mathbf{Z})\) times a function of \((\mathbf{Y},\mathbf{Z})\). Each factor integrates to one (over \(\mathbf{x}\) and \(\mathbf{y}\), respectively), so marginalizing identifies them as \(P(\mathbf{X}\mid \mathbf{Z})\) and \(P(\mathbf{Y}\mid \mathbf{Z})\); hence \(\mathbf{X} \perp \mathbf{Y}\mid \mathbf{Z}\).

  2. Necessity.

\[ \begin{aligned} P(\mathcal{X}) & = P(\mathbf{X}\mid \mathbf{Y},\mathbf{Z})P(\mathbf{Y},\mathbf{Z})\\ & = P(\mathbf{X}\mid \mathbf{Z})P(\mathbf{Y},\mathbf{Z}), \end{aligned} \] where the second equality holds by the conditional independence assumption \(\mathbf{X}\perp \mathbf{Y}\mid \mathbf{Z}\). We then take \(\phi_1(\mathbf{X},\mathbf{Z})=P(\mathbf{X}\mid \mathbf{Z})\) and \(\phi_2(\mathbf{Y},\mathbf{Z})=P(\mathbf{Y},\mathbf{Z})\), the first and second factors on the right-hand side.
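To make the sufficiency direction concrete, here is a minimal numerical sketch (in Python with NumPy; the binary variables and the randomly drawn tables `phi1`, `phi2` are illustrative assumptions, not part of the exercise). It normalizes \(\phi_1(X,Z)\phi_2(Y,Z)\) into a joint distribution and checks \(P(X,Y\mid Z=z)=P(X\mid Z=z)\,P(Y\mid Z=z)\) for each \(z\).

```python
# Sketch: a joint proportional to phi1(x, z) * phi2(y, z) satisfies X _|_ Y | Z.
import numpy as np

rng = np.random.default_rng(0)
phi1 = rng.uniform(size=(2, 2))   # phi1[x, z], arbitrary nonnegative factor
phi2 = rng.uniform(size=(2, 2))   # phi2[y, z], arbitrary nonnegative factor

# Joint P(x, y, z) proportional to phi1(x, z) * phi2(y, z), then normalized.
P = np.einsum("xz,yz->xyz", phi1, phi2)
P /= P.sum()

for z in (0, 1):
    cond = P[:, :, z] / P[:, :, z].sum()                   # P(x, y | z)
    prod = np.outer(cond.sum(axis=1), cond.sum(axis=0))    # P(x | z) P(y | z)
    assert np.allclose(cond, prod)
print("X _|_ Y | Z holds for the factorized joint.")
```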

Exercise 2.9, Koller and Friedman (2009)

Prove or disprove (by providing a counterexample) each of the following properties of independence.

a) \(X\perp Y,W\mid Z\implies X\perp Y\mid Z\)

Correct. This is the decomposition property: marginalizing \(W\) out of \(P(X,Y,W\mid Z)=P(X\mid Z)P(Y,W\mid Z)\) gives \(P(X,Y\mid Z)=P(X\mid Z)P(Y\mid Z)\).

b) \((X\perp Y\mid Z) \text{ and } (X,Y\perp W\mid Z) \implies X\perp W\mid Z\)

Correct. Decomposition applied to the second statement alone suffices: marginalizing \(Y\) out of \(P(X,Y,W\mid Z)=P(X,Y\mid Z)P(W\mid Z)\) gives \(P(X,W\mid Z)=P(X\mid Z)P(W\mid Z)\).

c) \((X\perp Y, W\mid Z) \text{ and } (Y\perp W\mid Z) \implies X,W\perp Y\mid Z\)

Correct. From \(X\perp Y,W\mid Z\), weak union gives \(X\perp Y\mid W,Z\); contraction of this with \(Y\perp W\mid Z\) then yields \(Y\perp X,W\mid Z\), which is \(X,W\perp Y\mid Z\) by symmetry.

d) \((X\perp Y\mid Z) \text{ and } (X\perp Y\mid W) \implies X\perp Y\mid Z,W\)

Incorrect. Consider a distribution faithful to the DAG \(X\rightarrow Z \leftarrow U \rightarrow W \leftarrow Y\), in which both \(Z\) and \(W\) are colliders. Conditioning on exactly one of the two colliders leaves the path between \(X\) and \(Y\) blocked at the other, unobserved collider, so \(X\perp Y\mid Z\) and \(X\perp Y\mid W\) both hold. Conditioning on both colliders, however, activates the entire path, so \(X\) and \(Y\) are no longer d-separated and \(X\perp Y\mid Z,W\) fails. A concrete instance takes \(X, Y, U\) to be independent fair coins with \(Z = X\oplus U\) and \(W = U\oplus Y\); the sketch below checks it numerically.
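Below is a minimal Python sketch of this counterexample (the XOR construction and the helper `cond_mutual_info` are assumptions made purely for illustration). It enumerates the exact joint distribution and computes conditional mutual information, which is zero exactly when the corresponding conditional independence holds.

```python
# Part (d) counterexample: X, Y, U independent fair coins, Z = X XOR U, W = U XOR Y.
import itertools
from collections import defaultdict
from math import log

# Exact joint over (x, y, z, w), obtained by summing out u.
joint = defaultdict(float)
for x, y, u in itertools.product((0, 1), repeat=3):
    joint[(x, y, x ^ u, u ^ y)] += 1 / 8

def marginal(keys):
    """Marginal distribution over the coordinates listed in `keys`."""
    m = defaultdict(float)
    for point, p in joint.items():
        m[tuple(point[i] for i in keys)] += p
    return m

def cond_mutual_info(cond):
    """I(X; Y | C), where `cond` is the tuple of conditioning coordinates."""
    pxyc, pxc, pyc, pc = (marginal((0, 1) + cond), marginal((0,) + cond),
                          marginal((1,) + cond), marginal(cond))
    mi = 0.0
    for (x, y, *c), p in pxyc.items():
        c = tuple(c)
        mi += p * log(p * pc[c] / (pxc[(x,) + c] * pyc[(y,) + c]))
    return mi

print(cond_mutual_info((2,)))    # I(X; Y | Z)    -> 0.0
print(cond_mutual_info((3,)))    # I(X; Y | W)    -> 0.0
print(cond_mutual_info((2, 3)))  # I(X; Y | Z, W) -> log(2), so dependence
```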

Exercise 3.13, Koller and Friedman (2009)

Let \(\mathcal{B} = (\mathcal{G},P)\) be a Bayesian network over \(\mathcal{X}\). The Bayesian network is parameterized by a set of CPD parameters of the form \(\theta_{x\mid \mathbf{u}}\) for \(X \in \mathcal{X}\), \(\mathbf{U}=Pa_X^\mathcal{G}\), \(x \in Val(X)\), \(\mathbf{u} \in Val(\mathbf{U})\). Consider any conditional independence statement of the form (\(\mathbf{X}\perp \mathbf{Y}\mid \mathbf{Z}\)). Show how this statement translates into a set of polynomial equalities over the set of CPD parameters \(\theta_{x\mid \mathbf{u}}\). (Note: A polynomial equality is an assertion of the form \(a\theta_1^2 + b\theta_1\theta_2 + c\theta_2^3 + d = 0\).)

[Proof] We consider discrete distributions in this problem. To specify the Bayesian network \(\mathcal{B}\), the parameters \(\theta_{x\mid \mathbf{u}}\) parameterize the child-given-parents conditional distributions \(P(X\mid \mathbf{U})\) for all values \(X=x\) and \(\mathbf{U}=\mathbf{u}\), where \(\mathbf{U}=Pa_X^{\mathcal{G}}\). Any further conditional independence constraint on the joint distribution amounts to a constraint on the \(\theta\) parameters, which we need to show takes the form of a set of polynomial equalities. Because \(P(\mathbf{X},\mathbf{Y},\mathbf{Z})=\sum_{V\notin \mathbf{X}\cup\mathbf{Y}\cup\mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V\mid Pa_{V}^{\mathcal{G}}}\), where the sum ranges over the values of all variables not in \(\mathbf{X}\cup\mathbf{Y}\cup\mathbf{Z}\), every marginal over a subset of nodes is a polynomial in the parameters; for example, \(P(\mathbf{Z})=\sum_{V\notin \mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V \mid Pa_{V}^{\mathcal{G}}}\) and \(P(\mathbf{X},\mathbf{Z})=\sum_{V\notin \mathbf{X}\cup\mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V\mid Pa_{V}^{\mathcal{G}}}\), and similarly for other subsets of nodes in \(\mathcal{B}\). For any \(\mathbf{z}\) with \(P(\mathbf{Z}=\mathbf{z})>0\), the statement \(P(\mathbf{X},\mathbf{Y}\mid \mathbf{Z}=\mathbf{z}) = P(\mathbf{X}\mid \mathbf{Z}=\mathbf{z}) P(\mathbf{Y}\mid \mathbf{Z}=\mathbf{z})\) is then equivalent to

\[ \begin{aligned} \sum_{V\notin \mathbf{X}\cup\mathbf{Y}\cup\mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V\mid Pa_{V}^{\mathcal{G}}} \cdot \sum_{V\notin \mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V \mid Pa_{V}^{\mathcal{G}}} &= \sum_{V\notin \mathbf{X}\cup\mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V\mid Pa_{V}^{\mathcal{G}}} \cdot \sum_{V\notin \mathbf{Y}\cup\mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V\mid Pa_{V}^{\mathcal{G}}}, \end{aligned} \] which is one polynomial equality in the \(\theta\) parameters for each configuration \((\mathbf{x},\mathbf{y},\mathbf{z})\) with \(\mathbf{z}\in \{\mathbf{z}: \sum_{V\notin \mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V \mid Pa_{V}^{\mathcal{G}}}\neq 0\}\). If \(P(\mathbf{Z}=\mathbf{z})=0\), then \(\sum_{V\notin \mathbf{Z}}\prod_{V\in\mathcal{X}}\theta_{V \mid Pa_{V}^{\mathcal{G}}}\big|_{\mathbf{Z}=\mathbf{z}}= 0\), both sides of the displayed equality vanish, and the independence statement imposes no additional constraint for that \(\mathbf{z}\). Hence \(\mathbf{X}\perp \mathbf{Y}\mid \mathbf{Z}\) translates into a set of polynomial equalities over the CPD parameters.
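As an illustration, the following sketch (assuming binary variables and the chain \(X\rightarrow Z\rightarrow Y\), which is not part of the exercise, with sympy for the symbolic algebra) spells out the translation: \(X\perp Y\mid Z\) becomes one polynomial equality \(P(x,y,z)P(z)-P(x,z)P(y,z)=0\) per configuration \((x,y,z)\). For this chain the factorization already entails the statement, so each equality expands identically to zero.

```python
# Sketch: the CI statement X _|_ Y | Z as polynomial equalities in the CPD
# parameters theta_x, theta_{z|x}, theta_{y|z} of the chain X -> Z -> Y.
import itertools
import sympy as sp

vals = (0, 1)

# Free CPD parameters; no sum-to-one constraints are needed for the identity.
th_x = {x: sp.Symbol(f"th_x{x}") for x in vals}
th_zx = {(z, x): sp.Symbol(f"th_z{z}_x{x}") for z in vals for x in vals}
th_yz = {(y, z): sp.Symbol(f"th_y{y}_z{z}") for y in vals for z in vals}

def joint(x, y, z):
    """P(x, y, z) as a product of child-given-parent parameters."""
    return th_x[x] * th_zx[(z, x)] * th_yz[(y, z)]

def marg(**fixed):
    """Marginal obtained by summing the joint over the unfixed variables."""
    total = 0
    for x, y, z in itertools.product(vals, repeat=3):
        here = {"x": x, "y": y, "z": z}
        if all(here[k] == v for k, v in fixed.items()):
            total += joint(x, y, z)
    return total

# One polynomial equality per configuration (x, y, z); each expands to zero.
for x, y, z in itertools.product(vals, repeat=3):
    diff = marg(x=x, y=y, z=z) * marg(z=z) - marg(x=x, z=z) * marg(y=y, z=z)
    assert sp.expand(diff) == 0
print("X _|_ Y | Z holds as polynomial identities for the chain X -> Z -> Y.")
```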

Examples and Implementations

Check out one possible set of solutions here.