Commit d487406

Rework risk control theoretical description (#743)
1 parent 26cf8bc commit d487406

6 files changed

+113
-30
lines changed

HISTORY.rst

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ History
66
------------------
77

88
* Fix warnings when running tests
9+
* Rework risk control documentation
910
* Fix incorrect URL in PrecisionRecallController docstring
1011
* Delete redundant risk control notebooks
1112
* Add link to Thibault Cordier's repository on risk control
124 KB
84.9 KB

doc/images/plot_alpha.png

35.9 KB

doc/images/risk_distribution.png

71.9 KB

doc/theoretical_description_risk_control.rst

Lines changed: 112 additions & 30 deletions
@@ -1,25 +1,105 @@
1-
.. title:: Theoretical Description Recall and Precision Control for Multi label Classification : contents
1+
.. title:: Getting started with risk control in MAPIE : contents
22

33
.. _theoretical_description_risk_control:
44

5-
#######################
6-
Theoretical Description
7-
#######################
5+
############################################
6+
Getting started with risk control in MAPIE
7+
############################################
88

9-
Note: in theoretical parts of this documentation, we use the terms *calibrate* and *calibration* employed in the scientific literature, that are equivalent to *conformalize* and *conformalization*.
9+
.. contents:: Table of contents
10+
:depth: 2
11+
:local:
12+
13+
Overview
14+
========
15+
16+
Three methods of risk control have been implemented in MAPIE so far:
17+
**Risk-Controlling Prediction Sets** (RCPS) [1], **Conformal Risk Control** (CRC) [2] and **Learn Then Test** (LTT) [3].
18+
The difference between these methods is the way the conformity scores are computed.
19+
20+
As of now, MAPIE supports risk control for two machine learning tasks: **binary classification**, as well as **multi-label classification** (including applications like image segmentation).
21+
The table below details the available methods for each task:
22+
23+
.. list-table:: Available risk control methods in MAPIE for each ML task
24+
:header-rows: 1
25+
26+
* - Risk control method
27+
- Binary classification
28+
- Multi-label classification (image segmentation)
29+
* - RCPS
30+
- ❌
31+
- ✅
32+
* - CRC
33+
- ❌
34+
- ✅
35+
* - LTT
36+
- ✅
37+
- ✅
38+
39+
In MAPIE for multi-label classification, CRC and RCPS are used for recall control, while LTT is used for precision control.
40+
41+
1. What is risk control?
42+
========================
43+
44+
Before diving into risk control, let's take the simple example of a binary classification model, which separates the incoming data into the two classes thanks to its threshold: predictions above it are classified as 1, and those below as 0. Suppose we want to find a threshold that guarantees that our model achieves a certain level of precision. A naive, yet straightforward approach to do this is to evaluate how precision varies with different threshold values on a validation dataset. By plotting this relationship (see plot below), we can identify the range of thresholds that meet our desired precision requirement (green zone on the graph).
45+
46+
.. image:: images/example_without_risk_control.png
47+
:width: 600
48+
:align: center
49+
50+
So far, so good. But here is the catch: while the chosen threshold effectively keeps precision above the desired level on the validation data, it offers no guarantee on the precision of the model when faced with new, unseen data. That is where risk control comes into play.
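The naive approach described above can be sketched in a few lines. This is only an illustration on synthetic data: the scores, the helper ``precision_at_threshold`` and the 0.9 target are assumptions made for the example, not part of MAPIE.

```python
import numpy as np


def precision_at_threshold(y_true, y_score, threshold):
    """Precision of the rule `y_score >= threshold` (NaN if nothing is predicted positive)."""
    y_pred = y_score >= threshold
    n_predicted_positive = y_pred.sum()
    if n_predicted_positive == 0:
        return float("nan")
    return float((y_true[y_pred] == 1).sum() / n_predicted_positive)


# Synthetic validation set (hypothetical): positives tend to receive higher scores.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, size=2000)
score_val = np.clip(0.35 * y_val + rng.normal(0.4, 0.2, size=2000), 0, 1)

# Evaluate precision on a grid of thresholds and keep those meeting the target.
thresholds = np.linspace(0.05, 0.95, 19)
target_precision = 0.9
precisions = np.array(
    [precision_at_threshold(y_val, score_val, t) for t in thresholds]
)
naive_valid = thresholds[precisions >= target_precision]
```

The thresholds in ``naive_valid`` meet the target on the validation sample only; nothing in this procedure accounts for sampling variability on new data.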
51+
52+
53+
54+
Risk control is the science of adjusting a model's parameter, typically denoted :math:`\lambda`, so that a given risk stays below a desired level with high probability on unseen data.
55+
Note that here, the term *risk* is used to describe an undesirable outcome of the model (e.g., type I error): therefore, it is a value we want to minimize, and in our case, keep under a certain level. Also note that risk control can easily be applied to metrics we want to maximize (e.g., precision), simply by controlling the complement (e.g., 1-precision).
56+
57+
The strength of risk control lies in the statistical guarantees it provides on unseen data. Unlike the naive method presented earlier, it determines a value of :math:`\lambda` that ensures the risk is controlled *beyond* the training data.
58+
59+
Applying risk control to the previous example would allow us to get a new — albeit narrower — range of thresholds (blue zone on the graph) that are **statistically guaranteed**.
60+
61+
.. image:: images/example_with_risk_control.png
62+
:width: 600
63+
:align: center
64+
65+
This guarantee is critical in a wide range of use cases, especially in high-stakes applications. Take, for example, medical diagnosis: here, the parameter :math:`\lambda` is the binarization threshold that determines whether a patient is classified as sick. We aim to minimize false negatives (i.e., cases where sick patients are incorrectly diagnosed as healthy), which corresponds to controlling the type II error. In this setting, risk control allows us to find a :math:`\lambda` such that, on future patients, the model’s type II error does not exceed, say, 5%, with high confidence.
1066

1167
1268

13-
Three methods for multi-label uncertainty quantification have been implemented in MAPIE so far :
14-
Risk-Controlling Prediction Sets (RCPS) [1], Conformal Risk Control (CRC) [2] and Learn Then Test (LTT) [3].
15-
The difference between these methods is the way the conformity scores are computed.
69+
To express risk control in mathematical terms, we denote by :math:`R` the risk we want to control, and introduce the following two parameters:
70+
71+
- :math:`\alpha`: the target level below which we want the risk to remain, as shown in the figure below;
72+
73+
.. image:: images/plot_alpha.png
74+
:width: 600
75+
:align: center
76+
77+
- :math:`\delta`: the confidence level associated with the risk control.
78+
79+
In other words, the risk is said to be controlled if :math:`R \leq \alpha` with probability at least :math:`1 - \delta`.
80+
81+
Furthermore, there are two types of risk control, which differ in the guarantee they provide.
82+
83+
- Guarantee on the expectation of the risk: :math:`\mathbb{E}(R) \leq \alpha` → CRC;
1684

17-
For a multi-label classification problem in a standard independent and identically distributed (i.i.d) case,
85+
- Guarantee on the probability that the risk does not exceed :math:`\alpha`: :math:`\mathbb{P}(R \leq \alpha) \geq 1 - \delta` → RCPS/LTT.
86+
87+
.. image:: images/risk_distribution.png
88+
:width: 600
89+
:align: center
90+
91+
The plot above gives a visual representation of the difference between the two types of guarantees:
92+
93+
- The risk is controlled in expectation (CRC) if the mean of its distribution over unseen data is below :math:`\alpha`;
94+
95+
- The risk is controlled in probability (RCPS/LTT) if at least a fraction :math:`1 - \delta` of its distribution over unseen data is below :math:`\alpha`.
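To make the difference between the two guarantees concrete, here is a small sketch that checks both conditions on a simulated distribution of the risk. The Beta distribution and the function names are assumptions chosen for illustration only.

```python
import numpy as np


def controlled_in_expectation(risks, alpha):
    """True if the mean of the simulated risks is at most alpha."""
    return float(np.mean(risks)) <= alpha


def controlled_in_probability(risks, alpha, delta):
    """True if at least a fraction 1 - delta of the simulated risks is at most alpha."""
    return float(np.mean(risks <= alpha)) >= 1 - delta


# Hypothetical risk distribution with mean 2 / 32 = 0.0625.
rng = np.random.default_rng(0)
risks = rng.beta(2, 30, size=10_000)

alpha, delta = 0.1, 0.1
in_expectation = controlled_in_expectation(risks, alpha)
in_probability = controlled_in_probability(risks, alpha, delta)
```

In this example the mean is below :math:`\alpha`, but only about 83% of the mass lies below :math:`\alpha`, so the expectation guarantee holds while the probability guarantee with :math:`\delta = 0.1` does not.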
96+
97+
For a classification problem in a standard independent and identically distributed (i.i.d) case,
1898
our training data :math:`(X, Y) = \{(x_1, y_1), \ldots, (x_n, y_n)\}` has an unknown distribution :math:`P_{X, Y}`.
1999

20-
For any risk level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user to construct a prediction
100+
For any target level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user to construct a prediction
21101
set :math:`\hat{C}_{n, \alpha}(X_{n+1})` for a new observation :math:`\left( X_{n+1},Y_{n+1} \right)` with a guarantee
22-
on the recall. RCPS, LTT, and CRC give three slightly different guarantees:
102+
on the specified risk. As mentioned above, RCPS, LTT, and CRC give three slightly different guarantees:
23103

24104
- RCPS:
25105

@@ -37,14 +117,16 @@ on the recall. RCPS, LTT, and CRC give three slightly different guarantees:
37117
\mathbb{P}(R(\mathcal{T}_{\hat{\lambda}}) \leq \alpha ) \geq 1 - \delta \quad \texttt{with} \quad p_{\hat{\lambda}} \leq \frac{\delta}{\lvert \Lambda \rvert}
38118
39119
40-
Notice that at the opposite of the other two methods, LTT allows to control any non-monotone loss. In MAPIE for multilabel classification,
41-
we use CRC and RCPS for recall control and LTT for precision control.
120+
Note that, unlike the other two methods, LTT can control any non-monotonic risk.
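The condition :math:`p_{\hat{\lambda}} \leq \frac{\delta}{\lvert \Lambda \rvert}` in the LTT guarantee is a Bonferroni correction over the candidate set :math:`\Lambda`. The sketch below illustrates it, assuming a loss bounded in :math:`[0, 1]` and a simple Hoeffding p-value; the p-value actually used by a given implementation may differ, and the function names are hypothetical.

```python
import numpy as np


def hoeffding_p_value(r_hat, n, alpha):
    """Hoeffding p-value for the null hypothesis R(lambda) > alpha (loss in [0, 1])."""
    return float(np.exp(-2 * n * max(alpha - r_hat, 0.0) ** 2))


def ltt_bonferroni(r_hats, n, alpha, delta):
    """Indices of the candidate lambdas whose p-value is at most delta / |Lambda|."""
    p_values = np.array([hoeffding_p_value(r, n, alpha) for r in r_hats])
    return np.flatnonzero(p_values <= delta / len(r_hats))


# Empirical risks for four candidate lambdas; note that they are not monotonic.
r_hats = [0.30, 0.05, 0.12, 0.02]
selected = ltt_bonferroni(r_hats, n=1000, alpha=0.1, delta=0.1)
```

Because each candidate is tested separately, no ordering of the risks across :math:`\lambda` is required, which is why non-monotonic risks can be handled.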
42121

43-
1. Risk-Controlling Prediction Sets
44-
===================================
45-
1.1. General settings
46-
---------------------
122+
The following section provides a detailed overview of each method.
47123

124+
2. Theoretical description
125+
==========================
126+
2.1 Risk-Controlling Prediction Sets
127+
------------------------------------
128+
2.1.1 General settings
129+
^^^^^^^^^^^^^^^^^^^^^^
48130

49131
Let's first give the settings and the notations of the method:
50132

@@ -81,8 +163,8 @@ Following those settings, the RCPS method gives the following guarantee on the r
81163
\mathbb{P}(R(\mathcal{T}_{\hat{\lambda}}) \leq \alpha ) \geq 1 - \delta
82164
83165
84-
1.2. Bounds calculation
85-
-----------------------
166+
2.1.2 Bounds calculation
167+
^^^^^^^^^^^^^^^^^^^^^^^^
86168

87169
In this section, we consider only bounded losses (as of now, only the :math:`1-recall` loss is implemented).
88170
We will show three different Upper Calibration Bounds (UCB) (Hoeffding, Bernstein, and Waudby-Smith–Ramdas) of :math:`R(\lambda)`
@@ -92,8 +174,8 @@ based on the empirical risk which is defined as follows:
92174
\hat{R}(\lambda) = \frac{1}{n}\sum_{i=1}^n L(Y_i, T_{\lambda}(X_i))
93175
94176
95-
1.2.1. Hoeffding Bound
96-
----------------------
177+
2.1.2.1 Hoeffding Bound
178+
"""""""""""""""""""""""
97179

98180
Suppose the loss is bounded above by one, then we have by the Hoeffding inequality that:
99181

@@ -106,8 +188,8 @@ Which implies the following UCB:
106188
\hat{R}_{Hoeffding}^+(\lambda) = \hat{R}(\lambda) + \sqrt{\frac{1}{2n}\log\frac{1}{\delta}}
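As an illustration, the empirical risk and the Hoeffding UCB above translate directly into code. This is a minimal sketch assuming the per-sample losses lie in :math:`[0, 1]`; the function names are hypothetical.

```python
import numpy as np


def empirical_risk(losses):
    """Mean of the per-sample losses L(Y_i, T_lambda(X_i)) for a fixed lambda."""
    return float(np.mean(losses))


def hoeffding_ucb(losses, delta):
    """Hoeffding upper confidence bound on the risk, for losses in [0, 1]."""
    n = len(losses)
    return empirical_risk(losses) + float(np.sqrt(np.log(1 / delta) / (2 * n)))


losses = np.array([0.0, 1.0, 0.0, 0.0])
ucb = hoeffding_ucb(losses, delta=0.1)
```

The margin added to the empirical risk depends only on :math:`n` and :math:`\delta`, not on the observed losses.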
107189
108190
109-
1.2.2. Bernstein Bound
110-
----------------------
191+
2.1.2.2 Bernstein Bound
192+
"""""""""""""""""""""""
111193

112194
Contrary to the Hoeffding bound, which ignores the variance of the loss and can therefore be loose, the Bernstein UCB takes into account the variance
113195
and gives a smaller prediction set size:
@@ -121,8 +203,8 @@ Where:
121203
\hat{\sigma}^2(\lambda) = \frac{1}{n-1}\sum_{i=1}^n(L(Y_i, T_{\lambda}(X_i)) - \hat{R}(\lambda))^2
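A possible sketch of this bound is given below. It assumes the empirical Bernstein form given in [1], namely :math:`\hat{R}(\lambda) + \hat{\sigma}(\lambda)\sqrt{\frac{2\log(2/\delta)}{n}} + \frac{7\log(2/\delta)}{3(n-1)}`, with :math:`\hat{\sigma}(\lambda)` taken as the sample standard deviation of the losses (with the :math:`n - 1` normalization). The function name is hypothetical.

```python
import numpy as np


def bernstein_ucb(losses, delta):
    """Empirical Bernstein upper confidence bound for losses in [0, 1] (form assumed from [1])."""
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    r_hat = losses.mean()
    sigma_hat = losses.std(ddof=1)  # sample standard deviation (n - 1 normalization)
    log_term = np.log(2 / delta)
    return float(
        r_hat
        + sigma_hat * np.sqrt(2 * log_term / n)
        + 7 * log_term / (3 * (n - 1))
    )


# Low-variance example: 1% of the calibration samples incur a loss of 1.
losses = np.zeros(1000)
losses[:10] = 1.0
ucb = bernstein_ucb(losses, delta=0.1)
```

On this low-variance example the bound is about 0.025, whereas the Hoeffding margin alone would already be about 0.034.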
122204
123205
124-
1.2.3. Waudby-Smith–Ramdas
125-
--------------------------
206+
2.1.2.3 Waudby-Smith–Ramdas
207+
"""""""""""""""""""""""""""
126208

127209
This last UCB is the one recommended by the authors of [1] for bounded losses, as it gives
128210
the smallest prediction sets while providing the same risk guarantee. This UCB is defined as follows:
@@ -145,8 +227,8 @@ Then:
145227
\hat{R}_{WSR}^+(\lambda) = \inf \{ R \geq 0 : \max_{i=1,\ldots,n} K_i(R, \lambda) > \frac{1}{\delta}\}
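The infimum above can be approximated with a grid search over :math:`R`. The sketch below assumes the definitions of [1] for the running mean, running variance, bet size :math:`\nu_i` and capital process :math:`K_i`, with losses in :math:`[0, 1]`; the function name and the grid resolution are choices made for this example.

```python
import numpy as np


def wsr_ucb(losses, delta, grid_size=1000):
    """Approximate Waudby-Smith-Ramdas UCB for losses in [0, 1], following [1]."""
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    i = np.arange(1, n + 1)
    # Running mean and variance estimates, regularized as in [1].
    mu_hat = (0.5 + np.cumsum(losses)) / (1 + i)
    sigma2_hat = (0.25 + np.cumsum((losses - mu_hat) ** 2)) / (1 + i)
    # nu_i depends on the variance estimate at step i - 1 (equal to 1/4 at step 0).
    sigma2_prev = np.concatenate(([0.25], sigma2_hat[:-1]))
    nu = np.minimum(1.0, np.sqrt(2 * np.log(1 / delta) / (n * sigma2_prev)))
    threshold = np.log(1 / delta)
    for r in np.linspace(0.0, 1.0, grid_size + 1):
        # log K_i(R) = sum_{j <= i} log(1 - nu_j * (L_j - R)); a factor may be exactly 0.
        with np.errstate(divide="ignore"):
            log_k = np.cumsum(np.log1p(-nu * (losses - r)))
        if log_k.max() > threshold:
            return float(r)  # smallest grid value of R satisfying the condition
    return 1.0  # no grid value satisfies the condition: fall back to the trivial bound


ucb_all_zero = wsr_ucb(np.zeros(1000), delta=0.1)
```

Since each factor of :math:`K_i` increases with :math:`R`, the first grid value that satisfies the condition approximates the infimum up to the grid step.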
146228
147229
148-
2. Conformal Risk Control
149-
=========================
230+
2.2 Conformal Risk Control
231+
--------------------------
150232

151233
The goal of this method is to control any monotone and bounded loss. The result of this method can be expressed as follows:
152234

@@ -168,8 +250,8 @@ With :
168250
\hat{R}_n (\lambda) = (L_{1}(\lambda) + ... + L_{n}(\lambda)) / n
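As an illustration, selecting :math:`\hat{\lambda}` from a grid can be sketched as follows. The sketch assumes the convention of [2]: the loss is non-increasing in :math:`\lambda`, bounded above by :math:`B`, and :math:`\hat{\lambda}` is the smallest :math:`\lambda` such that :math:`\frac{n}{n+1}\hat{R}_n(\lambda) + \frac{B}{n+1} \leq \alpha`. The function name and the toy losses are hypothetical.

```python
import numpy as np


def crc_lambda(losses, lambdas, alpha, B=1.0):
    """Smallest lambda on the grid whose adjusted empirical risk is at most alpha.

    `losses` has shape (n, len(lambdas)); column k holds L_i(lambdas[k]) and
    `lambdas` is sorted in increasing order. Returns None if no lambda qualifies.
    """
    losses = np.asarray(losses, dtype=float)
    n = losses.shape[0]
    r_hat = losses.mean(axis=0)  # empirical risk for each lambda
    adjusted = n / (n + 1) * r_hat + B / (n + 1)
    valid = np.flatnonzero(adjusted <= alpha)
    if valid.size == 0:
        return None
    return float(lambdas[valid[0]])


lambdas = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
# Toy losses: 9 calibration samples sharing the same monotone loss curve.
losses = np.tile([0.8, 0.4, 0.2, 0.1, 0.0], (9, 1))
lambda_hat = crc_lambda(losses, lambdas, alpha=0.2)
```

Here the adjusted risks are approximately 0.82, 0.46, 0.28, 0.19 and 0.10, so the first value of :math:`\lambda` meeting :math:`\alpha = 0.2` is 0.75.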
169251
170252
171-
3. Learn Then Test
172-
==================
253+
2.3 Learn Then Test
254+
-------------------
173255

174256
This section presents the Learn Then Test framework, which allows the user to control non-monotonic risks, for example the risk associated with precision (1 - precision).
175257
This method was introduced in [3].
@@ -206,7 +288,7 @@ References
206288

207289
[1] Stephen Bates, Anastasios Angelopoulos, Lihua Lei, Jitendra Malik,
208290
and Michael I. Jordan. Distribution-free, risk-controlling prediction
209-
sets. CoRR, abs/2101.02703, 2021. URL https://arxiv.org/abs/2101.02703.39
291+
sets. CoRR, abs/2101.02703, 2021. URL https://arxiv.org/abs/2101.02703
210292

211293
[2] Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua
212294
Lei, and Tal Schuster. "Conformal Risk Control." (2022).
