Conformal Prediction

Demonstration

Coverage ⊥ log-score

You can pin marginal coverage at exactly 90% and slide the forecast’s log-likelihood up or down at will.

Here is the paper’s orthogonality proposition, made literal. Draw a large fixed sample of outcomes \(y \sim N(0,1)\). Take the point forecast to be \(\hat\mu \equiv 0\) for everyone, so the nonconformity score is just \(|y|\). The split-conformal band is

$$C = [\,-q,\ +q\,], \qquad q = \lceil (n+1)(1-\alpha)\rceil\text{-th smallest } |y_i|.$$

Notice what \(q\) depends on: only the outcomes. Conformal ranks the residuals; the forecaster’s claimed spread never enters. So as we slide the claimed predictive density \(N(0,s)\), the conformal set and its marginal coverage are frozen, while the predictive log-score moves strongly, peaking at the truth \(s=1\).
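The construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the interactive demo: the function names `conformal_radius` and `mean_log_score`, the sample size, the seed, and the reading of \(s\) as a standard deviation are all assumptions made for the example.

```python
import numpy as np

def conformal_radius(y, alpha=0.1):
    """Radius q of the band [-q, +q] for the constant point forecast 0.

    q is the ceil((n+1)(1-alpha))-th smallest absolute residual |y_i|.
    """
    scores = np.sort(np.abs(y))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few points for the requested level: the band is the whole line.
        return float("inf")
    return float(scores[k - 1])  # k is a 1-based rank

def mean_log_score(y, s):
    """Mean log-density of the claimed forecast at the outcomes y.

    Assumption: s is the standard deviation of the claimed density.
    """
    return float(np.mean(-0.5 * np.log(2 * np.pi * s**2) - y**2 / (2 * s**2)))

rng = np.random.default_rng(0)      # hypothetical seed
y = rng.standard_normal(10_000)     # the fixed sample of outcomes

# q and the coverage are computed once: s is not an input to either.
q = conformal_radius(y, alpha=0.1)
coverage = float(np.mean(np.abs(y) <= q))

# Only the log-score depends on the claimed spread.
spreads = [0.5, 1.0, 2.0]
scores = {s: mean_log_score(y, s) for s in spreads}
for s in spreads:
    print(f"s={s:.1f}  q={q:.3f}  coverage={coverage:.4f}  mean log-score={scores[s]:.3f}")
```

The design point is visible in the signatures: `conformal_radius` takes no `s` argument, so the printed `q` and `coverage` columns cannot change from row to row, while the log-score column does.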

The grey histogram is the immovable sample of outcomes. The blue curve is the forecaster’s claimed density \(N(0,s)\), the only thing the slider touches. The shaded band \([-q,+q]\) and its coverage do not budge as you change \(s\).

Mean predictive log-score as a function of the claimed spread \(s\). It peaks at the truth \(s=1\) (green) and falls away on both sides; the orange marker is your current \(s\). Meanwhile, coverage is a flat line at \(1-\alpha\) for every \(s\): there is nothing to plot but a constant.
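A short derivation makes the location of the peak explicit. Assume \(s\) denotes the standard deviation of the claimed density, and write \(p_s\) for that density (a symbol introduced here for illustration). The text does not say whether \(s\) is a standard deviation or a variance, but the maximizer is \(s=1\) under either reading. With \(y \sim N(0,1)\), so that \(\mathbb{E}[y^2]=1\), the expected log-score and its derivative are

$$\mathbb{E}\,[\log p_s(y)] = -\tfrac12\log(2\pi) - \log s - \frac{1}{2s^2}, \qquad \frac{d}{ds}\,\mathbb{E}\,[\log p_s(y)] = -\frac{1}{s} + \frac{1}{s^3}.$$

The derivative is positive for \(s<1\), zero at \(s=1\), and negative for \(s>1\), so the expected log-score rises to a single maximum at the truth and falls on both sides. It tends to \(-\infty\) as \(s \to 0\) or \(s \to \infty\), which is why the score can be pushed down without limit while the band stays put.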

Takeaway. Identical conformal set, arbitrary log-score. Coverage is not a measure of forecast quality: you can hold marginal coverage fixed at 90% and move the log-likelihood up or down at will. The two live on independent axes, which is exactly the orthogonality proposition. A method that certifies the former tells you nothing about the latter. Next, the section on exchangeability and time series asks what happens to the coverage guarantee itself when exchangeability breaks.