Bias-variance decomposition
This post supplements the supervised learning slides. Please see the slides for the setup.
We wish to derive the bias-variance decomposition on p.21 (of the slides):
$$\mathbb{E}\big[(Y - \hat{f}(x))^2 \mid X = x\big] = \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{squared bias}} + \underbrace{\mathrm{Var}\big(\hat{f}(x)\big)}_{\text{variance}} + \underbrace{\sigma^2(x)}_{\text{irreducible error}}.$$
All the expectations (unless otherwise stated) are with respect to the training data and the noise $\epsilon$. Note that the irreducible error $\sigma^2(x) = \mathrm{Var}(Y \mid X = x)$ depends on $x$. This is the more general form of the irreducible error for heteroscedastic problems, in which the (conditional) variance of $Y$ depends on $X$.
First, we decompose the MSE of a fixed $g$ into reducible and irreducible parts (see p.5):
$$\begin{aligned}
\mathbb{E}\big[(Y - g(x))^2 \mid X = x\big]
&= \mathbb{E}\big[\big(f(x) - g(x) + Y - f(x)\big)^2 \mid X = x\big] \\
&= \big(f(x) - g(x)\big)^2 + \mathbb{E}\big[(Y - f(x))^2 \mid X = x\big] + 2\big(f(x) - g(x)\big)\,\mathbb{E}\big[Y - f(x) \mid X = x\big],
\end{aligned}$$
where $f(x) = \mathbb{E}[Y \mid X = x]$ is the regression function. It is not hard to check that the conditional mean of $\epsilon = Y - f(X)$ is zero:
$$\mathbb{E}[\epsilon \mid X = x] = \mathbb{E}[Y - f(X) \mid X = x] = \mathbb{E}[Y \mid X = x] - f(x) = 0.$$
Thus the second term in the decomposition of $\mathbb{E}\big[(Y - g(x))^2 \mid X = x\big]$ is the irreducible error $\sigma^2(x) = \mathrm{Var}(Y \mid X = x)$, and the third term is zero. Note that this decomposition remains valid for a (random) fit $\hat{f}$ to training data because $(X, Y)$ is a test sample that is independent of the training data. In other words, we can average/integrate the decomposition with respect to the training data to obtain
$$\mathbb{E}\big[(Y - \hat{f}(x))^2 \mid X = x\big] = \mathbb{E}\big[(f(x) - \hat{f}(x))^2\big] + \sigma^2(x).$$
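This first step can be checked numerically. Below is a minimal Monte Carlo sketch (not from the slides): the regression function $f(x) = \sin x$, the heteroscedastic noise scale $\sigma(x) = 0.1 + 0.2|x|$, and the fixed fit $g(x) = x$ are all illustrative choices. At a fixed test input, the simulated MSE of $g$ should match the reducible part plus the irreducible part.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):      # assumed regression function E[Y | X = x] (illustrative)
    return np.sin(x)

def sigma(x):  # assumed conditional noise sd; depends on x (heteroscedastic)
    return 0.1 + 0.2 * np.abs(x)

def g(x):      # a fixed (non-random) fit, here a crude linear one
    return x

x0 = 1.5                                   # fixed test input
y = f(x0) + sigma(x0) * rng.standard_normal(1_000_000)

mse = np.mean((y - g(x0)) ** 2)            # E[(Y - g(x0))^2 | X = x0], simulated
reducible = (f(x0) - g(x0)) ** 2           # squared approximation error
irreducible = sigma(x0) ** 2               # sigma^2(x0) = Var(Y | X = x0)

print(mse, reducible + irreducible)        # the two should agree closely
```

With a million noise draws the Monte Carlo error is small, so the two printed numbers agree to a few decimal places.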
Second, we decompose the reducible part of the MSE into (squared) bias and variance:
$$\begin{aligned}
\mathbb{E}\big[(f(x) - \hat{f}(x))^2\big]
&= \mathbb{E}\big[\big(f(x) - \mathbb{E}[\hat{f}(x)] + \mathbb{E}[\hat{f}(x)] - \hat{f}(x)\big)^2\big] \\
&= \big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2 + \mathrm{Var}\big(\hat{f}(x)\big) + 2\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)\,\mathbb{E}\big[\mathbb{E}[\hat{f}(x)] - \hat{f}(x)\big].
\end{aligned}$$
The third term is zero because
$$\mathbb{E}\big[\mathbb{E}[\hat{f}(x)] - \hat{f}(x)\big] = \mathbb{E}[\hat{f}(x)] - \mathbb{E}[\hat{f}(x)] = 0.$$
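The second step can also be sketched numerically by redrawing the training data many times. The setup below is illustrative (not from the slides): training sets of size 30 drawn with $f(x) = \sin x$ and constant noise, fit by a degree-1 polynomial via `np.polyfit`. Note that the sample versions of bias and variance satisfy the split exactly, since it is the same algebraic identity applied to the empirical distribution of $\hat{f}(x_0)$.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):                                     # assumed regression function (illustrative)
    return np.sin(x)

x0 = 1.5                                      # fixed test input
n, reps = 30, 5_000
preds = np.empty(reps)
for r in range(reps):
    x = rng.uniform(-2, 2, n)                 # fresh training set each repetition
    y = f(x) + 0.3 * rng.standard_normal(n)
    coef = np.polyfit(x, y, 1)                # random fit: least-squares line
    preds[r] = np.polyval(coef, x0)           # \hat{f}(x0) for this training set

reducible = np.mean((f(x0) - preds) ** 2)     # E[(f(x0) - \hat{f}(x0))^2]
bias_sq = (f(x0) - preds.mean()) ** 2         # squared bias
variance = preds.var()                        # Var(\hat{f}(x0)), ddof=0

print(reducible, bias_sq + variance)          # identical up to rounding
```

Using the default `ddof=0` in `preds.var()` is what makes the sample identity exact; with `ddof=1` the two sides would differ by a small finite-sample correction.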
Posted on September 01, 2021
from Ann Arbor, MI