# Notes on The Consistency of Inverse Probability Weighting

#### 2016-11-22

Goal: Prove that under conditional exchangeability, the inverse probability weighting (IPW) estimand recovers the mean potential outcome $$E[Y_i(z)]$$, so that the IPW estimate is unbiased for the population average treatment effect.

Let $$(Y_i(1), Y_i(0), Z_i, X_i)$$ collect the potential outcomes under treatment ($$z=1$$) and control ($$z=0$$), the observed binary treatment assignment ($$1$$ or $$0$$), and the vector of measured covariates $$X_i$$. Let $$Y^{obs}_i=Y_i(1)Z_i+Y_i(0)(1-Z_i)$$ denote the observed outcome for individual $$i$$. Assume conditional exchangeability holds, i.e., $$Y_i(z)\perp Z_i \mid X_i$$ for all $$z$$. Because this means $$P(Y_i(z)=1\mid Z_i=z', X_i)$$ is the same for every $$z'$$, regardless of the assignment actually received, individuals with identical $$X_i$$ share the same distribution of $$Y_i(z)\mid X_i$$. Let us prove that

$$E\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}Y_i^{obs}\right\} \Big/ E\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}\right\}=E[Y_i(z)],$$

where $$I(A)$$ equals one if the statement $$A$$ is true and zero otherwise, and $$f(Z_i\mid X_i)$$ denotes the conditional probability density function, regarded here as a random variable. (We write $$f(Z_i\mid X_i)$$ rather than $$P(Z_i\mid X_i)$$ because with random $$Z_i$$ and $$X_i$$ we would otherwise face the awkward notation $$P(Z_i=Z_i\mid X_i=X_i)$$.)
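To make the setup concrete, here is a minimal simulation sketch (all distributions and parameter values below are hypothetical choices for illustration, not part of the notes): a binary covariate $$X_i$$ confounds treatment and outcome, but conditional on $$X_i$$ the treatment is randomized, so conditional exchangeability holds by construction.

```python
import numpy as np

# Hypothetical data-generating process in which conditional exchangeability
# holds by construction: given X, treatment Z is randomized, so
# Y(z) is independent of Z given X for z in {0, 1}.
rng = np.random.default_rng(0)
n = 200_000

X = rng.binomial(1, 0.5, size=n)                  # binary measured covariate
p_treat = np.where(X == 1, 0.8, 0.2)              # f(Z = 1 | X), bounded away from 0 and 1
Z = rng.binomial(1, p_treat)                      # treatment depends on X only
Y1 = rng.binomial(1, np.where(X == 1, 0.7, 0.3))  # potential outcome under z = 1
Y0 = rng.binomial(1, np.where(X == 1, 0.4, 0.1))  # potential outcome under z = 0
Y_obs = Y1 * Z + Y0 * (1 - Z)                     # observed outcome, as defined above
```

Under this assumed design, $$E[Y_i(1)] = 0.5 \cdot 0.7 + 0.5 \cdot 0.3 = 0.5$$, a value the IPW ratio should recover.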

[Proof]

\begin{aligned} \text{Numerator of Left-Hand Side} & = E\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}Y_i^{obs}\right\} \\ & = E\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}[Y_i(1)Z_i+Y_i(0)(1-Z_i)]\right\} \leftarrow (\text{by definition of observed values})\\ & = E\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}Y_i(z)\right\} \leftarrow (\text{the indicator forces } Z_i=z)\\ & = E_{X_i}E_{Y_i(z), Z_i\mid X_i}\left[\frac{I(Z_i=z)}{f(Z_i\mid X_i)}Y_i(z)\mid X_i\right] \leftarrow (\text{by iterated expectations})\\ & = E_{X_i}\left\{E_{ Z_i\mid X_i}\left[\frac{I(Z_i=z)}{f(Z_i\mid X_i)}\mid X_i\right]E_{Y_i(z)\mid X_i}\left[Y_i(z)\mid X_i\right] \right\}\leftarrow (\text{by conditional exchangeability: }Y_i(z)\perp Z_i\mid X_i)\\ & = E_{X_i}\left\{1\cdot E_{Y_i(z)\mid X_i}\left[Y_i(z)\mid X_i\right] \right\} \leftarrow \left(E_{Z_i\mid X_i}\left[\frac{I(Z_i=z)}{f(Z_i\mid X_i)}\mid X_i\right]=\frac{f(z\mid X_i)}{f(z\mid X_i)}=1\right)\\ &=E[Y_i(z)] \end{aligned}

\begin{aligned} \text{Denominator of Left-Hand Side} & = E_{X_i}\left[E_{Z_i\mid X_i}\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}\mid X_i\right\}\right] \leftarrow (\text{by iterated expectations})\\ & = E_{X_i}\left[\frac{f(z\mid X_i)}{f(z\mid X_i)}\right]\\ &= 1, \end{aligned} which completes the proof.
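The identity just proved can be checked numerically. The sketch below uses an assumed simulated design (binary confounder, treatment randomized given the confounder, propensities known exactly rather than estimated): the empirical numerator approaches $$E[Y_i(z)]$$ and the empirical denominator approaches $$1$$.

```python
import numpy as np

# Monte Carlo check of the IPW identity, with the true propensity f(Z | X)
# known. The design below is an assumed example, not from the notes.
rng = np.random.default_rng(1)
n = 500_000

X = rng.binomial(1, 0.5, size=n)
p_treat = np.where(X == 1, 0.8, 0.2)              # f(Z = 1 | X)
Z = rng.binomial(1, p_treat)
Y1 = rng.binomial(1, np.where(X == 1, 0.7, 0.3))
Y0 = rng.binomial(1, np.where(X == 1, 0.4, 0.1))
Y_obs = Y1 * Z + Y0 * (1 - Z)

# f(Z_i | X_i): density of the treatment actually received, a random variable
f_Z_given_X = np.where(Z == 1, p_treat, 1 - p_treat)

z = 1
w = (Z == z) / f_Z_given_X       # I(Z_i = z) / f(Z_i | X_i)
numerator = np.mean(w * Y_obs)   # should be close to E[Y(1)] = 0.5 in this design
denominator = np.mean(w)         # should be close to 1
print(numerator, denominator)
```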

# Remarks

• Although the denominator has unit expectation and may look redundant, in practice the sample mean of the inverse probability weights will not be exactly $$1$$. The ratio $$\hat{E}\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}Y_i^{obs}\right\} \Big/ \hat{E}\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}\right\}$$ is generally used for estimating parameters in marginal structural models, where weighted regressions are common. We call it the modified Horvitz-Thompson estimator; compare it with the original Horvitz-Thompson estimator $$\hat{E}\left\{\frac{I(Z_i=z)}{f(Z_i\mid X_i)}Y_i^{obs}\right\}$$.
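The two estimators differ only in the normalizing constant, which a short sketch makes explicit (the simulated design and its parameter values are assumptions for illustration):

```python
import numpy as np

def horvitz_thompson(Y_obs, Z, f_Z_given_X, z):
    """Original and modified Horvitz-Thompson estimates of E[Y(z)].

    The original divides the weighted outcome sum by the sample size n;
    the modified version divides by the realized sum of the weights,
    which is close to n but not exactly equal to it.
    """
    w = (Z == z) / f_Z_given_X
    original = np.mean(w * Y_obs)
    modified = np.sum(w * Y_obs) / np.sum(w)
    return original, modified

# Assumed simulated design with known propensities, for illustration only.
rng = np.random.default_rng(2)
n = 100_000
X = rng.binomial(1, 0.5, size=n)
p_treat = np.where(X == 1, 0.8, 0.2)
Z = rng.binomial(1, p_treat)
Y1 = rng.binomial(1, np.where(X == 1, 0.7, 0.3))
Y0 = rng.binomial(1, np.where(X == 1, 0.4, 0.1))
Y_obs = Y1 * Z + Y0 * (1 - Z)
f_Z_given_X = np.where(Z == 1, p_treat, 1 - p_treat)

original, modified = horvitz_thompson(Y_obs, Z, f_Z_given_X, z=1)
print(original, modified)  # both near E[Y(1)] = 0.5 here, but not identical
```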

• Because $$f(Z_i\mid X_i)$$ is regarded as a random variable, the expression on the left-hand side is well defined. But when $$f(Z_i=z\mid X_i)=0$$ with positive probability for some $$z$$ (a violation of positivity), the left-hand side no longer has a meaningful causal interpretation.

• The weight $$1/f(Z_i\mid X_i)$$ used here is referred to as the unstabilized weight. A more complete account of inverse probability weighting with stabilized weights $$f(Z_i)/f(Z_i\mid X_i)$$, which may yield smaller variance for the causal effect estimates, can be found in Hernán and Robins (2006) (Hernán, Miguel A., and James M. Robins. 2006. "Estimating Causal Effects from Epidemiological Data." *Journal of Epidemiology and Community Health* 60 (7): 578-86.).
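A brief sketch of the two weight constructions (again on an assumed simulated design; the marginal $$f(Z_i)$$ is estimated by the empirical treatment frequency). Each stabilized weight equals $$f(z)/f(z\mid X_i)$$ rather than $$1/f(z\mid X_i)$$, so the stabilized weights are typically less variable:

```python
import numpy as np

# Assumed design: binary X, treatment randomized given X with known propensity.
rng = np.random.default_rng(3)
n = 100_000
X = rng.binomial(1, 0.5, size=n)
p_treat = np.where(X == 1, 0.8, 0.2)          # f(Z = 1 | X), assumed known
Z = rng.binomial(1, p_treat)

f_Z_given_X = np.where(Z == 1, p_treat, 1 - p_treat)

# Unstabilized weight: 1 / f(Z | X).
w_unstab = 1.0 / f_Z_given_X

# Stabilized weight: f(Z) / f(Z | X), with the marginal f(Z) estimated
# by the empirical treatment frequency.
p_marginal = Z.mean()
f_Z = np.where(Z == 1, p_marginal, 1 - p_marginal)
w_stab = f_Z / f_Z_given_X

print(w_unstab.var(), w_stab.var())  # stabilized weights vary less here
```

Note that the stabilized weights average to roughly $$1$$, while the unstabilized weights average to roughly $$2$$ (one for each treatment level), consistent with the unit-expectation argument in the proof.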