STATS 413

Outcome regression

In this post, we consider the task of estimating treatment effects. This is the basic problem in causal inference, and it arises in many areas of science and engineering. As a running example, we consider the task of estimating the efficacy of a vaccine booster. We begin by mathematically defining treatment effects using the potential outcomes framework.

To keep things simple, we focus on estimating the effect of a binary treatment (e.g. booster vs no booster). We define two potential outcomes $Y_i(1)$ and $Y_i(0)$ for each subject in the study. In the running example, $Y_i(1)$ is the viral load in the $i$-th subject if the subject got the booster, and $Y_i(0)$ is the viral load if the subject did not get the booster. The effect of the treatment on the $i$-th subject is

$$\Delta_i \triangleq Y_i(1) - Y_i(0).$$

The fundamental challenge in causal inference is that only one treatment can be assigned to a subject, so only one of $Y_i(1)$ and $Y_i(0)$ can be observed. Thus $\Delta_i$ is never observed. Nevertheless, it is possible (as we shall see) to estimate the average treatment effect (ATE)

$$\tau \triangleq \mathbb{E}[\Delta_i] = \mathbb{E}[Y_i(1)] - \mathbb{E}[Y_i(0)]$$
by performing randomized experiments.

In a randomized experiment, we randomly assign treatments to the subjects and record the outcomes. Let $W_i \in \{0,1\}$ and $Y_i$ be the treatment assignment and observed outcome of the $i$-th subject. In the running example, $W_i$ indicates whether the $i$-th subject got the booster and $Y_i$ is the (observed) viral load in the $i$-th subject. Mathematically, in a randomized experiment, we have

$$Y_i = Y_i(W_i) \quad \text{(SUTVA)}, \qquad (Y_i(1), Y_i(0)) \perp\!\!\!\perp W_i \quad \text{(random treatment assignment)}.$$

The first condition (SUTVA) relates the observed outcomes to the potential outcomes: the observed outcome of the $i$-th subject $Y_i$ is $Y_i(1)$ (resp. $Y_i(0)$) if $W_i = 1$ (resp. $W_i = 0$). The second condition says treatments are assigned in a way that does not depend on the potential outcomes. It implies the distributions of potential outcomes in the treated and untreated groups are identical:

$$(Y_i(1), Y_i(0)) \mid \{W_i = 1\} \overset{d}{=} (Y_i(1), Y_i(0)) \mid \{W_i = 0\}.$$

In practice, treatments are often assigned randomly (e.g. by flipping a coin) to satisfy this condition.
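To make this concrete, here is a minimal simulation of such an experiment in Python/NumPy, using the booster example. The outcome distributions (and the implied true ATE of -2) are made-up assumptions, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # number of subjects (hypothetical)

# Made-up potential outcomes: viral load with and without the booster.
y1 = rng.normal(loc=3.0, scale=1.0, size=n)  # Y_i(1): outcome if boosted
y0 = rng.normal(loc=5.0, scale=1.0, size=n)  # Y_i(0): outcome if not boosted

# Random treatment assignment: a fair coin flip, independent of (Y_i(1), Y_i(0)).
w = rng.binomial(1, 0.5, size=n)

# SUTVA: we observe only the potential outcome under the assigned treatment.
y = np.where(w == 1, y1, y0)
```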

Difference-in-means

A simple estimate of the ATE in a randomized experiment is the difference between the (sample) mean outcomes in treated and untreated subjects:

$$\hat{\tau}_{\mathrm{DM}} = \frac{1}{n_1} \sum_{i=1}^n Y_i \mathbf{1}\{W_i = 1\} - \frac{1}{n_0} \sum_{i=1}^n Y_i \mathbf{1}\{W_i = 0\},$$

where $n_w \triangleq \sum_{i=1}^n \mathbf{1}\{W_i = w\}$ is the number of subjects assigned treatment $w \in \{0,1\}$. This is called the difference-in-means estimator, and it is motivated by the observation that the (sample) mean outcome in a treatment group is an unbiased estimate of the expected potential outcome in a randomized experiment:

$$\begin{aligned}
\mathbb{E}\left[\frac{1}{n_w} \sum_{i=1}^n Y_i \mathbf{1}\{W_i = w\}\right] &= \mathbb{E}[Y_i \mid W_i = w] \\
&= \mathbb{E}[Y_i(w) \mid W_i = w] && \text{(SUTVA)} \\
&= \mathbb{E}[Y_i(w)] && \text{(random treatment assignment)}.
\end{aligned}$$

In light of this observation, it is not hard to see that the difference-in-means estimator is unbiased:

$$\begin{aligned}
\mathbb{E}[\hat{\tau}_{\mathrm{DM}}] &= \mathbb{E}\left[\frac{1}{n_1} \sum_{i=1}^n Y_i \mathbf{1}\{W_i = 1\}\right] - \mathbb{E}\left[\frac{1}{n_0} \sum_{i=1}^n Y_i \mathbf{1}\{W_i = 0\}\right] \\
&= \mathbb{E}[Y_i(1)] - \mathbb{E}[Y_i(0)] = \tau.
\end{aligned}$$
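Continuing the hypothetical simulation above, the estimator takes a couple of lines of NumPy; with 10,000 subjects the estimate should land close to the simulated ATE of -2.

```python
# Difference-in-means estimator on the simulated experiment above.
n1, n0 = np.sum(w == 1), np.sum(w == 0)
tau_dm = y[w == 1].mean() - y[w == 0].mean()
print(tau_dm)  # should be close to the simulated ATE of -2
```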

We leave it as an exercise to show that $\hat{\tau}_{\mathrm{DM}}$ is asymptotically normal:

$$\sqrt{n_1 + n_0}\,(\hat{\tau}_{\mathrm{DM}} - \tau) \overset{d}{\to} N\!\left(0, \frac{\sigma_1^2}{\pi_1} + \frac{\sigma_0^2}{\pi_0}\right),$$

where $\sigma_w^2 \triangleq \operatorname{var}[Y_i(w)]$ and $\pi_w \triangleq \mathbb{P}\{W_i = w\}$ for $w \in \{0,1\}$. This result allows us to form confidence intervals and test hypotheses regarding the ATE.
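As a sketch of how the normal approximation is used in practice (continuing the simulation above, and using sample variances as plug-in estimates of $\sigma_w^2$), one can form a 95% confidence interval for the ATE:

```python
# Plug-in estimate of the standard error: sigma_1^2 / n_1 + sigma_0^2 / n_0,
# using the sample variances of the observed outcomes in each group.
s1_sq = y[w == 1].var(ddof=1)
s0_sq = y[w == 0].var(ddof=1)
se = np.sqrt(s1_sq / n1 + s0_sq / n0)

# Normal-approximation 95% confidence interval for the ATE.
ci = (tau_dm - 1.96 * se, tau_dm + 1.96 * se)
print(f"tau_hat = {tau_dm:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```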

Posted on December 11, 2021 from Ann Arbor, MI