PyData Amsterdam 2024

Boosting AI Reliability: Uncertainty Quantification with MAPIE
09-20, 10:05–10:55 (Europe/Amsterdam), Escher

MAPIE (Model Agnostic Prediction Interval Estimator) is your go-to solution for managing uncertainties and risks in machine learning models. This Python library, part of scikit-learn-contrib, computes prediction intervals with controlled coverage rates for regression, classification, and time series analysis. But it doesn't stop there: MAPIE can also handle more complex tasks such as multi-label classification and semantic segmentation in computer vision, providing probabilistic guarantees on crucial metrics like recall and precision. MAPIE can be integrated with any model, whether it is built with scikit-learn, TensorFlow, or PyTorch. Join us as we delve into the world of conformal prediction and show how to quickly manage your uncertainties using MAPIE.

Link to GitHub: https://github.com/scikit-learn-contrib/MAPIE


This talk introduces MAPIE, an open-source Python library designed to quantify uncertainties and control risks in machine learning models. Learn how to compute conformal prediction intervals and control risks in various tasks such as regression, classification, time series, and even complex tasks like multi-label classification and semantic segmentation.

We will begin by discussing the importance of uncertainty quantification and risk control in machine learning models. Then, we will dive into the key features of MAPIE, including:

  1. Computing conformal prediction intervals for regression, classification, and time series tasks with guaranteed marginal coverage rates.
  2. Controlling risks for complex tasks such as multi-label classification and semantic segmentation in computer vision.
  3. Wrapping any machine learning model (scikit-learn, TensorFlow, PyTorch, etc.) with a scikit-learn-compatible wrapper for uncertainty quantification and risk control.
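
To make the first item concrete, here is a minimal sketch of split conformal prediction for regression, written in plain Python. It is not MAPIE's implementation and does not use MAPIE's API: the helper names (fit_line, conformal_quantile, split_conformal_interval), the synthetic data, and the least-squares line standing in for "any model" are all assumptions made for illustration. The idea is to fit a model on one split, measure absolute residuals on a held-out calibration split, and use a finite-sample corrected quantile of those residuals as the interval half-width.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a stand-in for any model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def conformal_quantile(scores, alpha):
    """Finite-sample corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, x_calib, y_calib, alpha):
    """Return a function mapping x to a (lower, upper) prediction interval."""
    scores = [abs(y - predict(x)) for x, y in zip(x_calib, y_calib)]
    half_width = conformal_quantile(scores, alpha)
    return lambda x: (predict(x) - half_width, predict(x) + half_width)


# Hypothetical synthetic data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)


def sample(n):
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


x_train, y_train = sample(200)
x_calib, y_calib = sample(200)
x_test, y_test = sample(1000)

predict = fit_line(x_train, y_train)
interval = split_conformal_interval(predict, x_calib, y_calib, alpha=0.1)

# Empirical coverage on fresh test points; the target is about 1 - alpha.
covered = 0
for x, y in zip(x_test, y_test):
    lower, upper = interval(x)
    if lower <= y <= upper:
        covered += 1
coverage = covered / len(y_test)
print(f"empirical coverage: {coverage:.3f}")
```

The coverage guarantee here is marginal: it holds on average over test points drawn from the same distribution as the calibration data, not for each individual input. The calibration split must not be used to fit the model.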

Throughout the talk, we will demonstrate MAPIE's capabilities with practical examples and code snippets. Attendees will learn how to apply MAPIE to their own models, ensuring more reliable and robust predictions.

This talk targets data scientists, machine learning engineers, and researchers with a basic understanding of machine learning concepts. Familiarity with scikit-learn and other popular machine learning libraries is helpful but not required.

By the end of the talk, attendees will understand the basics of uncertainty quantification and risk control and will have seen how to apply MAPIE to their own machine learning models.

I was an econometrician at the Erasmus School of Economics and later a data scientist at Bocconi. My journey in uncertainty quantification began during my internship at Quantmetry, where I contributed to the development of MAPIE. I implemented conformalized quantile regression and became a core developer of the library, collaborating closely with Thibault Cordier, Vincent Blot, and Candice Moyet.

Currently, at Capgemini Invent's R&I lab, we continue to advance MAPIE while also exploring cutting-edge topics such as hallucinations in large language models (LLMs) and combining knowledge graphs with LLMs.

Thibault Cordier is a Data and Research Scientist at Capgemini Invent, where he is a member of the Lab Invent team in France and serves as the technical leader of the MAPIE project.

Prior to joining the research team at Capgemini Invent, he earned his PhD in Computer Science in 2023 at Avignon University.

Up to now, his research has focused on distribution-free inference and conformal prediction, with applications in computer vision, natural language processing, and time series analysis.