Demonstration
A Thurstone contest at the −1/n floor
Where the previous demo showed the mixing measure go negative, this one shows what that does to conformal prediction: the marginal coverage holds, the per-case coverage does not.
A field of \(M\) competitors with latent strengths is the most natural place to hit the negative-association floor. Exactly one wins, the ranking is a permutation, and the outcomes of any two competitors carry pairwise correlation \(\rho=-\tfrac{1}{M-1}\), the most negative value an exchangeable family of \(M\) variables allows. This is the signed corner of the previous demo, now with competitors.
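One way to see where the floor comes from, sketched under the assumption that the \(M\) win indicators \(I_1,\dots,I_M\) are exchangeable (true when all strengths are equal; the symbols \(I_i\), \(v\) and \(c\) are introduced here only for illustration). Exactly one competitor wins, so \(\sum_i I_i = 1\) is constant and has zero variance. Writing \(v=\operatorname{Var}(I_i)\) and \(c=\operatorname{Cov}(I_i,I_j)\) for \(i\neq j\):
\[
0=\operatorname{Var}\Big(\sum_{i=1}^{M} I_i\Big)=M\,v+M(M-1)\,c
\quad\Longrightarrow\quad
\rho=\frac{c}{v}=-\frac{1}{M-1}.
\]
For any exchangeable family the same variance can only be nonnegative, which gives \(\rho\ge-\tfrac{1}{M-1}\) in general; the contest sits exactly on the bound.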
Take a relative score across the field and conformalize it: draw a calibration sample of competitors, set the interval from their scores, and test it on another. Because the field is fixed, a strong calibration sample leaves a weaker test competitor behind, and vice versa. The marginal coverage lands on target anyway. The per-case coverage does not.
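The procedure above can be sketched by brute force on a small field. This is a minimal illustration, not the demo's own code: the strengths in `field`, the function name `coverage_by_calibration`, the calibration size of 4, the 80% target, and the use of a one-sided upper threshold in place of a two-sided interval are all assumptions made for the example. It enumerates every calibration subset of the fixed field, sets the conformal threshold from that subset, and measures coverage on the competitors left over.

```python
from itertools import combinations
from math import ceil
from statistics import fmean

def coverage_by_calibration(scores, n_cal, level):
    """Enumerate every calibration subset of a fixed field.

    Returns (mean calibration score, per-case coverage) pairs, where
    per-case coverage is the fraction of the remaining competitors whose
    score is at or below the conformal threshold for that subset.
    """
    k = ceil((n_cal + 1) * level)  # rank of the split-conformal quantile
    if k > n_cal:
        raise ValueError("calibration sample too small for this level")
    results = []
    for cal_idx in combinations(range(len(scores)), n_cal):
        cal = sorted(scores[i] for i in cal_idx)
        threshold = cal[k - 1]
        # Test competitors come from whatever the calibration draw left behind.
        rest = [s for i, s in enumerate(scores) if i not in cal_idx]
        covered = fmean(1.0 if s <= threshold else 0.0 for s in rest)
        results.append((fmean(cal), covered))
    return results

# Hypothetical latent strengths for a field of M = 8 competitors.
field = [-1.6, -0.9, -0.4, 0.0, 0.3, 0.7, 1.1, 1.8]
cases = coverage_by_calibration(field, n_cal=4, level=0.8)

marginal = fmean(c for _, c in cases)
weakest = min(cases)   # calibration subset with the lowest mean strength
strongest = max(cases) # calibration subset with the highest mean strength
print("marginal coverage:", marginal)
print("weakest calibration (mean, coverage):", weakest)
print("strongest calibration (mean, coverage):", strongest)
```

Averaged over all calibration subsets the coverage matches the 0.8 target, while the strongest calibration subset covers every remaining competitor and the weakest covers none of them.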
The field: \(M\) competitors by latent strength, the strong and the weak. These are the “implied strengths” you would recover from how often each one wins — the modelling a coverage certificate cannot do for you.
Coverage by how strong the calibration sample happened to be. The dashed line is the marginal coverage, near target whatever the field size. The curve is the per-case coverage: conformal over-covers when calibration was strong and under-covers when it was weak. Shrink \(M\), pushing the correlation \(-\tfrac{1}{M-1}\) further below zero, and the curve fans wider.
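The effect of field size on the spread can be checked with a short enumeration. This is a sketch under stated assumptions rather than the code behind the plot: competitors are represented only by their rank (distinct scores assumed), the calibration sample has 4 competitors, the threshold is the top-ranked calibration competitor (an 80% one-sided target), and the function name `per_case_coverages` is hypothetical.

```python
from itertools import combinations
from statistics import fmean, pstdev

def per_case_coverages(M, n_cal):
    """Per-case coverage for every calibration subset of an M-competitor field.

    Competitors are identified by rank 0..M-1, and the threshold is the
    highest-ranked competitor in the calibration subset.
    """
    coverages = []
    for cal in combinations(range(M), n_cal):
        threshold = max(cal)
        rest = [r for r in range(M) if r not in cal]
        coverages.append(fmean(1.0 if r <= threshold else 0.0 for r in rest))
    return coverages

# Mean and spread of per-case coverage for a few field sizes.
summary = {}
for M in (6, 8, 16):
    cov = per_case_coverages(M, n_cal=4)
    summary[M] = (fmean(cov), pstdev(cov))
    print(f"M={M:2d}  marginal={summary[M][0]:.3f}  spread={summary[M][1]:.3f}")
```

The marginal column stays at the target for every field size, while the spread of per-case coverage is largest for the smallest field and narrows as \(M\) grows.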
The certificate is honest on average and quietly wrong case by case, and the size of the error is set by how negative the dependence is — here \(-\tfrac{1}{M-1}\). The strength you would read off the field, not the conformal wrapper, is what tells you who wins. The companion note works this out.
Using conformal prediction in your own project? Tell Claude: “Read https://conformalprediction.net/SKILL.md and create a project skill from it.” It adds a check for whether your coverage is conditionally trustworthy.