Literature
A working bibliography of conformal prediction, by topic. For papers that put it to work where coverage is genuinely the objective, see the applications page.
Curated with one filter, in keeping with the rest of the guide: keep careful methodology, foundational theory, and work that is precise about what the guarantee does and does not give; skip material that pitches conformal prediction as improving a forecast, or as delivering “reliable uncertainty” while leaving out the marginal-coverage caveat. Descriptions are our own one-line readings, not the papers’ abstracts. Corrections welcome.
Start here: surveys, books, foundations
- Angelopoulos & Bates (2023). A gentle introduction to conformal prediction and distribution-free uncertainty quantification. FnT in ML. arXiv:2107.07511 — the friendliest entry point, with code.
- Shafer & Vovk (2008). A tutorial on conformal prediction. JMLR. jmlr.org — self-contained, by a co-inventor.
- Vovk, Gammerman & Shafer (2005; 2nd ed. 2022). Algorithmic Learning in a Random World. Springer — the founding monograph.
- Angelopoulos, Barber & Bates (2024). Theoretical foundations of conformal prediction. arXiv:2411.11824 — the exchangeability/permutation machinery in full.
- Fontana, Zeni & Vantini (2023). Conformal prediction: a unified review of theory and new challenges. Bernoulli. arXiv:2005.07972
- Campos, Farinhas, Zerva, Figueiredo & Martins (2024). Conformal prediction for natural language processing: a survey. TACL. arXiv:2405.01976
- Manokhin (2023). Practical Guide to Applied Conformal Prediction in Python. Packt (book).
Origins and the core method
- Vovk, Gammerman & Saunders (1999). Machine-learning applications of algorithmic randomness. ICML — the precursor: confidence from algorithmic randomness.
- Saunders, Gammerman & Vovk (1999). Transduction with confidence and credibility. IJCAI — transductive (full) conformal in embryo.
- Papadopoulos, Proedrou, Vovk & Gammerman (2002). Inductive confidence machines for regression. ECML — the split conformal predictor, the cheap variant now standard.
- Vovk (2015). Cross-conformal predictors. Ann. Math. Artif. Intell. — a CV/inductive hybrid.
- Lei, G’Sell, Rinaldo, Tibshirani & Wasserman (2018). Distribution-free predictive inference for regression. JASA. arXiv:1604.04173 — the canonical full/split/locally-weighted regression treatment.
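For orientation, the split (inductive) variant named in the entries above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from any of the cited papers: it uses the absolute residual as the nonconformity score, the helper name `split_conformal_interval` and the toy linear data are hypothetical, and it assumes the calibration and test points are exchangeable.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_y, test_pred, alpha=0.1):
    """Split conformal intervals from absolute residuals.

    Hypothetical helper for illustration: cal_pred and cal_y come from a
    calibration set the model was NOT trained on.
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_y = np.asarray(cal_y, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)

    # Nonconformity scores on the calibration set, sorted ascending.
    scores = np.sort(np.abs(cal_y - cal_pred))
    n = scores.size

    # Finite-sample rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))

    # If that rank exceeds n, only the trivial infinite interval is valid.
    q = scores[k - 1] if k <= n else np.inf
    return test_pred - q, test_pred + q


# Toy data (an assumption for the demo): linear signal plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=3000)
y = 2.0 * x + rng.normal(size=3000)

# Disjoint training / calibration / test splits.
x_tr, y_tr = x[:1000], y[:1000]
x_cal, y_cal = x[1000:2000], y[1000:2000]
x_te, y_te = x[2000:], y[2000:]

# Any point predictor works; here a least-squares line.
slope, intercept = np.polyfit(x_tr, y_tr, deg=1)

def predict(v):
    return slope * v + intercept

lo, hi = split_conformal_interval(predict(x_cal), y_cal, predict(x_te), alpha=0.1)
coverage = np.mean((y_te >= lo) & (y_te <= hi))
print(f"empirical marginal coverage on the test split: {coverage:.3f}")
```

The printed figure is an average over test points, so it speaks only to marginal coverage; it says nothing about coverage conditional on the features, which is the distinction the sections below are organized around.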
Validity, exchangeability, and the limits
- Lei & Wasserman (2014). Distribution-free prediction bands for non-parametric regression. JRSS-B — the first no-go: finite-length conditional coverage is impossible distribution-free.
- Foygel Barber, Candès, Ramdas & Tibshirani (2021). The limits of distribution-free conditional predictive inference. Information and Inference. arXiv:1903.04684 — even approximate subgroup coverage is essentially unattainable.
- Vovk (2012). Conditional validity of inductive conformal predictors. ACML. arXiv:1209.2673 — object/label-conditional validity, and the calibration-conditional Beta law.
- Bian & Barber (2023). Training-conditional coverage for distribution-free predictive inference. EJS. arXiv:2205.03647 — split has it; full and jackknife+ do not, without assumptions.
- Liang & Barber (2025). Algorithmic stability implies training-conditional coverage. Annals of Statistics. arXiv:2311.04295
- Kuchibhotla (2020). Exchangeability, conformal prediction, and rank tests. arXiv:2005.06095 — conformal as a classical rank test.
- Marques F. (2025). Universal distribution of the empirical coverage in split conformal prediction. Stat. & Prob. Letters. arXiv:2303.02770 — the exact Beta law of calibration-conditional coverage.
Game-theoretic foundations, e-values, exchangeability testing
- Shafer & Vovk (2019). Game-Theoretic Foundations for Probability and Finance. Wiley — the martingale-betting backdrop for conformal testing.
- Vovk, Nouretdinov & Gammerman (2003). Testing exchangeability on-line. ICML — conformal test martingales for monitoring the assumption.
- Fedorova, Gammerman, Nouretdinov & Vovk (2012). Plug-in martingales for testing exchangeability on-line. ICML. arXiv:1204.3251
- Vovk & Wang (2021). E-values: calibration, combination, and applications. Annals of Statistics. arXiv:1912.06116
- Vovk (2025). Randomness, exchangeability, and conformal prediction. arXiv:2501.11689
Regression and interval methods
- Romano, Patterson & Candès (2019). Conformalized quantile regression. NeurIPS. arXiv:1905.03222 — conformalize a quantile model for heteroscedastic-adaptive intervals.
- Foygel Barber, Candès, Ramdas & Tibshirani (2021). Predictive inference with the jackknife+. Annals of Statistics. arXiv:1905.02928 — jackknife+ and CV+, leave-one-out with provable coverage.
- Gupta, Kuchibhotla & Ramdas (2022). Nested conformal prediction and quantile out-of-bag ensembles. Pattern Recognition. arXiv:1910.10562 — conformal as calibrating nested sets.
- Papadopoulos, Gammerman & Vovk (2008). Normalized nonconformity measures for regression conformal prediction. IASTED AIA — locally-weighted, variable-width intervals.
- Sesia & Candès (2020). A comparison of some conformal quantile regression methods. Stat. arXiv:1909.05433
- Kivaranovic, Johnson & Leeb (2020). Adaptive, distribution-free prediction intervals for deep networks. AISTATS. PMLR
- Sesia & Romano (2021). Conformal prediction using conditional histograms. NeurIPS. arXiv:2105.08747 — shortest intervals from a conditional histogram.
Conditional coverage: methods and guarantees
- Romano, Barber, Sabatti & Candès (2020). With malice toward none: assessing uncertainty via equalized coverage. HDSR. arXiv:1908.05428 — equal coverage across protected groups.
- Guan (2023). Localized conformal prediction. Biometrika — reweight calibration toward a local neighbourhood.
- Hore & Barber (2025). Conformal prediction with local weights: randomization enables local guarantees. JRSS-B. arXiv:2310.07850 — randomized local weights buy a genuine local guarantee.
- Jung, Noarov, Ramalingam & Roth (2023). Batch multivalid conformal prediction. ICLR. arXiv:2209.15145 — the multicalibration bridge.
- Cauchois, Gupta & Duchi (2021). Knowing what you know: valid and validated confidence sets. JMLR. arXiv:2004.10181 — the worst-slab coverage diagnostic.
- Gibbs, Cherian & Candès (2025). Conformal prediction with conditional guarantees. JRSS-B. arXiv:2305.12616 — exact coverage over a finite-dimensional class of shifts.
Conditional coverage: diagnostics and calibration tests
- Diebold, Gunther & Tay (1998). Evaluating density forecasts. Int. Economic Review — the PIT: a calibrated forecast gives i.i.d. uniform PIT values.
- Gneiting, Balabdaoui & Raftery (2007). Probabilistic forecasts, calibration and sharpness. JRSS-B — the maximize-sharpness-subject-to-calibration frame.
- Widmann, Lindsten & Zachariah (2019). Calibration tests in multi-class classification. NeurIPS. arXiv:1910.11385 — kernel calibration error with a hypothesis test.
- Widmann, Lindsten & Zachariah (2021). Calibration tests beyond classification. ICLR. arXiv:2210.13355
- Feldman, Bates & Romano (2021). Improving conditional coverage via orthogonal quantile regression. NeurIPS. arXiv:2106.00394 — HSIC between the miscoverage indicator and interval length.
- Braun, Holzmüller, Jordan & Bach (2025). Conditional coverage diagnostics for conformal prediction (ERT). arXiv:2512.11779 — classify the coverage indicator on the features.
- Cotton (2026). A Feynman–Wigner diagnostic for conformal prediction. this site — distance covariance between the conformal rank and the features, the runnable, level-free check.
Classification
- Sadinle, Lei & Wasserman (2019). Least ambiguous set-valued classifiers with bounded error levels. JASA. arXiv:1609.00451 — smallest sets at a fixed error level, marginal or per-class.
- Romano, Sesia & Candès (2020). Classification with valid and adaptive coverage (APS). NeurIPS. arXiv:2006.02544
- Angelopoulos, Bates, Jordan & Malik (2021). Uncertainty sets for image classifiers using conformal prediction (RAPS). ICLR. arXiv:2009.14193
- Ding, Angelopoulos, Bates, Jordan & Tibshirani (2023). Class-conditional conformal prediction with many classes (clustered). NeurIPS. arXiv:2306.09335
- Vovk & Petej (2014). Venn–Abers predictors. UAI. arXiv:1211.0025 — calibrated probability intervals via isotonic regression.
- Vovk, Lindsay, Nouretdinov & Gammerman (2003). Mondrian confidence machine. RHUL tech report — per-category (label-conditional) validity.
Risk control and beyond coverage
- Bates, Angelopoulos, Lei, Malik & Jordan (2021). Distribution-free, risk-controlling prediction sets (RCPS). JACM. arXiv:2101.02703 — control any bounded loss with high probability.
- Angelopoulos, Bates, Fisch, Lei & Schuster (2024). Conformal risk control. ICLR. arXiv:2208.02814 — control the expectation of any monotone loss; coverage is the 0–1 case.
- Angelopoulos, Bates, Candès, Jordan & Lei (2021). Learn then test: calibrating predictive algorithms to achieve risk control. arXiv:2110.01052 — risk control as multiple testing over configurations.
- Laufer-Goldshtein, Fisch, Barzilay & Jaakkola (2023). Efficiently controlling multiple risks with Pareto testing. ICLR. arXiv:2210.07913
Selection, outliers, and novelty
- Bates, Candès, Lei, Romano & Sesia (2023). Testing for outliers with conformal p-values. Annals of Statistics. arXiv:2104.08279 — conformal prediction’s most natural home: a distribution-free test.
- Marandon, Lei, Mary & Roquain (2024). Adaptive novelty detection with false discovery rate guarantee (AdaDetect). Annals of Statistics. arXiv:2208.06685
- Guan & Tibshirani (2022). Prediction and outlier detection in classification problems (BCOPS). JRSS-B. arXiv:1905.04396
- Jin & Candès (2023). Selection by prediction with conformal p-values. JMLR. arXiv:2210.01408 — shortlist with FDR control.
- Laxhammar & Falkman (2015). Inductive conformal anomaly detection for sequential trajectories. Ann. Math. Artif. Intell. — calibrated alarm rates online.
Conformal training and efficiency
- Stutz, Dvijotham, Cemgil & Doucet (2022). Learning optimal conformal classifiers (ConfTr). ICLR. arXiv:2110.09192 — differentiate through conformalization to train for small sets.
- Einbinder, Romano, Sesia & Zhou (2022). Training uncertainty-aware classifiers with conformalized deep learning. NeurIPS. arXiv:2205.05878
- Yang & Kuchibhotla (2025). Selection and aggregation of conformal prediction sets. JASA. arXiv:2104.13871 — pick the score that gives the smallest valid set.
- Kiyani, Pappas & Hassani (2024). Length optimization in conformal prediction. NeurIPS. arXiv:2406.18814
- Einbinder, Feldman, Bates, Angelopoulos, Gendler & Romano (2022). Label noise robustness of conformal prediction. arXiv:2209.14295
- Gendler, Weng, Daniel & Romano (2022). Adversarially robust conformal prediction (RSCP). ICLR. openreview
Distribution shift and weighted conformal
- Tibshirani, Foygel Barber, Candès & Ramdas (2019). Conformal prediction under covariate shift. NeurIPS. arXiv:1904.06019 — weighted exchangeability when the likelihood ratio is known.
- Podkopaev & Ramdas (2021). Distribution-free uncertainty quantification for classification under label shift. UAI. arXiv:2103.03323
- Park, Dobriban, Lee & Bastani (2022). PAC prediction sets under covariate shift. ICLR. arXiv:2106.09848
- Prinster, Liu & Saria (2022). JAWS: auditing predictive uncertainty under covariate shift. NeurIPS. arXiv:2207.10716 — weighted jackknife+.
Time series and non-exchangeability
- Chernozhukov, Wüthrich & Zhu (2018). Exact and robust conformal inference for dependent data. COLT. arXiv:1802.06300 — block permutations.
- Stankevičiūtė, Alaa & van der Schaar (2021). Conformal time-series forecasting. NeurIPS. proceedings
- Xu & Xie (2021). Conformal prediction interval for dynamic time-series (EnbPI). ICML. arXiv:2010.09107
- Xu & Xie (2023). Sequential predictive conformal inference for time series (SPCI). ICML. arXiv:2212.03463
- Lin, Trivedi & Sun (2022). Conformal prediction with temporal quantile adjustments (TQA). NeurIPS. arXiv:2205.09940
- Sun & Yu (2024). Copula conformal prediction for multi-step time series (CopulaCPTS). ICLR. arXiv:2212.03281
- Auer, Gauch, Klotz & Hochreiter (2023). Conformal prediction for time series with modern Hopfield networks (HopCPT). NeurIPS. arXiv:2303.12783
- Barber, Candès, Ramdas & Tibshirani (2023). Conformal prediction beyond exchangeability. Annals of Statistics. arXiv:2202.13415 — the coverage gap bounded by a total-variation term.
- Oliveira, Orenstein, Ramos & Romano (2024). Split conformal prediction and non-exchangeable data. JMLR. arXiv:2203.15885
- Barber & Pananjady (2026). Predictive inference for time series: why is split conformal effective despite temporal dependence? ALT. arXiv:2510.02471
Online and adaptive
- Gibbs & Candès (2021). Adaptive conformal inference under distribution shift (ACI). NeurIPS. arXiv:2106.00170 — long-run time-average coverage, no distributional assumptions.
- Gibbs & Candès (2024). Conformal inference for online prediction with arbitrary distribution shifts. JMLR. arXiv:2208.08401
- Zaffran, Féron, Goude, Josse & Dieuleveut (2022). Adaptive conformal predictions for time series (AgACI). ICML. arXiv:2202.07282
- Bhatnagar, Wang, Xiong & Bai (2023). Improved online conformal prediction via strongly adaptive online learning. ICML. arXiv:2302.07869
- Angelopoulos, Candès & Tibshirani (2023). Conformal PID control for time series prediction. NeurIPS. arXiv:2307.16895
Conformal predictive distributions and systems
- Vovk, Shen, Manokhin & Xie (2019). Nonparametric predictive distributions based on conformal prediction. Machine Learning — a full predictive distribution with finite-sample validity.
- Vovk, Nouretdinov, Manokhin & Gammerman (2018). Cross-conformal predictive distributions. COPA. PMLR
- Vovk, Petej, Nouretdinov, Manokhin & Gammerman (2019). Computationally efficient versions of conformal predictive distributions. Neurocomputing. arXiv:1911.00941
- Boström, Johansson & Löfström (2021). Mondrian conformal predictive distributions. COPA. PMLR — shape and location vary per region.
- Boström (2022). crepes: a Python package for conformal regressors and predictive systems. COPA. PMLR
Connections: information theory, proper scores, Bayes, fiducial
- Correia, Massoli, Louizos & Behboodi (2024). An information theoretic perspective on conformal prediction. NeurIPS. arXiv:2405.02140 — set size bounds the conditional entropy.
- Hoff (2023). Bayes-optimal prediction with frequentist coverage control. Bernoulli. arXiv:2105.14045
- Fong & Holmes (2021). Conformal Bayesian computation. NeurIPS. arXiv:2106.06137
- Fong, Holmes & Walker (2023). Martingale posteriors. JRSS-B (with discussion) — Bayesian uncertainty as uncertainty over future data.
- Cella & Martin (2022). Validity, consonant plausibility measures, and conformal prediction. Int. J. Approx. Reasoning. arXiv:2001.09225 — the fiducial / imprecise-probability reading.
- Gneiting & Raftery (2007). Strictly proper scoring rules, prediction, and estimation. JASA — the log score and CRPS as the right graders.
- Cotton. Marginally Useful, the Feynman–Wigner note, Betting Against a Conformal Predictor, and The Width of the Conformal Fan. this site — respectively: the coverage–sharpness gap as the mutual information I(R;X); the sign of the de Finetti measure; the gap as the parimutuel rent an X-informed bettor extracts; and the sign of dependence as the width of the realized-coverage fan.
Causal inference, treatment effects, and survival
- Lei & Candès (2021). Conformal inference of counterfactuals and individual treatment effects. JRSS-B. arXiv:2006.06138
- Candès, Lei & Ren (2023). Conformalized survival analysis. JRSS-B. arXiv:2103.09763 — calibrated lower bounds on survival time under censoring.
- Gui, Hore, Ren & Barber (2024). Conformalized survival analysis with adaptive cut-offs. Biometrika. arXiv:2211.01227
- Jin, Ren & Candès (2023). Sensitivity analysis of individual treatment effects: a robust conformal inference approach. PNAS. arXiv:2111.12161
- Yin, Shi, Wang & Blei (2024). Conformal sensitivity analysis for individual treatment effects. JASA. arXiv:2112.03493
Language models and generation
- Quach, Fisch, Schuster, Yala, Sohn, Jaakkola & Barzilay (2024). Conformal language modeling. ICLR. arXiv:2306.10193 — a calibrated stopping rule so a generated set contains an acceptable answer.
- Mohri & Hashimoto (2024). Language models with conformal factuality guarantees. ICML. arXiv:2402.10978 — back off to less specific claims until factuality holds.
- Kumar, Lu, Gupta, et al. (2023). Conformal prediction with large language models for multi-choice question answering. arXiv:2305.18404
- Su, Luo, Wang & Cheng (2024). API is enough: conformal prediction for LLMs without logit-access. EMNLP Findings. arXiv:2403.01216
- Ren et al. (2023). Robots that ask for help: uncertainty alignment for LLM planners (KnowNo). CoRL. arXiv:2307.01928
Software
- Python: MAPIE, crepes, TorchCP, PUNCC, Fortuna, conformal-prediction (examples), nonconformist (legacy).
- R: conformalInference, probably (tidymodels), AdaptiveConformal.
- Julia: ConformalPrediction.jl.
Talks
- Candès. Conformal prediction in 2022 (NeurIPS keynote). neurips.cc
- Jordan, Vovk & Wasserman, moderated by Ramdas. Panel (ICML 2021). slideslive
Applications
Papers that use conformal prediction where coverage or containment is genuinely the objective (selective prediction, anomaly detection, retrieval, language models, robotics and control, risk control, scientific discovery, causal inference, survival analysis, and medical imaging) are listed and discussed, by domain, on the applications page.
Using conformal prediction in your own project? Tell Claude: “Read https://conformalprediction.net/SKILL.md and create a project skill from it.” It adds a check for whether your coverage is conditionally trustworthy.