Demonstration

Coverage ⊥ log-score

You can pin marginal coverage at exactly 90% and slide the forecast’s log-likelihood up or down at will.

Here is the paper’s orthogonality proposition, in one picture. Draw a large fixed sample of outcomes $y \sim N(0,1)$. Take the point forecast to be $\hat\mu \equiv 0$ for everyone, so the nonconformity score is just $|y|$. The split-conformal band is $$C = [\,-q,\ +q\,], \qquad q = \lceil (n+1)(1-\alpha)\rceil\text{-th smallest } |y_i|.$$ Notice what $q$ depends on: only the outcomes. Conformal ranks the residuals; the forecaster’s claimed spread never enters. So as we slide the claimed predictive density $N(0,s)$, the conformal set and its marginal coverage are frozen, while the predictive log-score moves strongly, peaking at the truth $s=1$.
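The experiment above is small enough to sketch in a few lines of NumPy. This is a minimal illustration rather than the code behind the page: the function names `conformal_radius` and `mean_log_score`, the sample size, the seed, and the grid of spreads are all hypothetical, and the sketch assumes that $s$ in $N(0,s)$ is a standard deviation.

```python
import math
import numpy as np

def conformal_radius(y, alpha=0.1):
    """Half-width q: the ceil((n+1)(1-alpha))-th smallest |y_i| (point forecast 0)."""
    scores = np.sort(np.abs(y))
    k = math.ceil((len(y) + 1) * (1 - alpha))
    if k > len(y):
        return float("inf")  # sample too small for a finite band at this alpha
    return float(scores[k - 1])

def mean_log_score(y, s):
    """Average log-density of y under the claimed N(0, s).

    Assumption: s is read as a standard deviation.
    """
    return float(np.mean(-0.5 * np.log(2 * np.pi * s**2) - y**2 / (2 * s**2)))

rng = np.random.default_rng(0)       # arbitrary seed
y = rng.standard_normal(100_000)     # the fixed sample of outcomes

q = conformal_radius(y)              # computed once; s is not an argument
coverage = float(np.mean(np.abs(y) <= q))

spreads = [0.5, 1.0, 2.0]            # illustrative slider positions
scores = {s: mean_log_score(y, s) for s in spreads}
for s in spreads:
    print(f"s={s:<4} coverage={coverage:.4f} mean log-score={scores[s]:.4f}")
```

Because `conformal_radius` never receives `s`, the printed coverage is the same on every row by construction; only the log-score column changes.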

The grey histogram is the immovable sample of outcomes. The blue curve is the forecaster’s claimed density $N(0,s)$, the only thing the slider touches. The shaded band $[-q,+q]$ and its coverage do not budge as you change $s$.

Mean predictive log-score as a function of the claimed spread $s$. It peaks at the truth $s=1$ (green) and falls away on both sides; the orange marker is your current $s$. Meanwhile, coverage is a flat line at $1-\alpha$ for every $s$: there is nothing to plot but a constant.
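The location of the peak can be checked by hand. As a sketch, assume $s$ is the standard deviation of the claimed density $p_s$ and replace the sample mean by its large-sample limit, using $\mathbb{E}[y^2]=1$ for $y\sim N(0,1)$:

$$\mathbb{E}\big[\log p_s(y)\big] = -\tfrac12\log(2\pi) - \log s - \frac{1}{2s^2}, \qquad \frac{d}{ds}\,\mathbb{E}\big[\log p_s(y)\big] = -\frac{1}{s} + \frac{1}{s^3} = \frac{1-s^2}{s^3}.$$

The derivative is positive for $s<1$ and negative for $s>1$, so the expected log-score has its unique maximum at $s=1$, where it equals $-\tfrac12\log(2\pi)-\tfrac12 \approx -1.42$. If $s$ is read as a variance instead, the same calculation gives $-\tfrac12\log(2\pi s) - \tfrac{1}{2s}$, which also peaks at $s=1$.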

Takeaway. Identical conformal set, arbitrary log-score. Coverage is not a measure of forecast quality: you can hold marginal coverage fixed at exactly 90% and move the log-score up or down by changing $s$ alone. The two live on independent axes, which is exactly the orthogonality proposition. A method that certifies the former tells you nothing about the latter. The next section, on exchangeability and time series, asks what happens to even the coverage guarantee when exchangeability breaks.

