.. title:: Getting started with risk control in MAPIE : contents

.. _theoretical_description_risk_control:

############################################
Getting started with risk control in MAPIE
############################################

Note: in the theoretical parts of this documentation, we use the terms *calibrate* and *calibration*, as employed in the scientific literature; they are equivalent to *conformalize* and *conformalization*.

.. contents:: Table of contents
    :depth: 2
    :local:

Overview
========

Three methods of risk control have been implemented in MAPIE so far:
**Risk-Controlling Prediction Sets** (RCPS) [1], **Conformal Risk Control** (CRC) [2] and **Learn Then Test** (LTT) [3].
The difference between these methods is the way the conformity scores are computed.

As of now, MAPIE supports risk control for two machine learning tasks: **binary classification** and **multi-label classification** (including applications like image segmentation).
The table below details the available methods for each task:

.. list-table:: Available risk control methods in MAPIE for each ML task
    :header-rows: 1

    * - Risk control method
      - Binary classification
      - Multi-label classification (image segmentation)
    * - RCPS
      - ❌
      - ✅
    * - CRC
      - ❌
      - ✅
    * - LTT
      - ✅
      - ✅

In MAPIE, for multi-label classification, CRC and RCPS are used for recall control, while LTT is used for precision control.

1. What is risk control?
========================

Before diving into risk control, let's take the simple example of a binary classification model, which separates incoming data into two classes according to a threshold: predictions above it are classified as 1, and those below as 0. Suppose we want to find a threshold that guarantees that our model achieves a certain level of precision. A naive, yet straightforward, approach is to evaluate how precision varies with different threshold values on a validation dataset. By plotting this relationship (see plot below), we can identify the range of thresholds that meet our desired precision requirement (green zone on the graph).

So far, so good. But here is the catch: while the chosen threshold effectively keeps precision above the desired level on the validation data, it offers no guarantee on the precision of the model when faced with new, unseen data. That is where risk control comes into play.
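
To make this naive approach concrete, here is a minimal sketch (purely illustrative and independent of the MAPIE API), using scikit-learn on synthetic data: it simply sweeps candidate thresholds and keeps those whose precision on the validation set reaches the target.

.. code-block:: python

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_score
    from sklearn.model_selection import train_test_split

    # Toy setup: a logistic regression fitted on synthetic data,
    # with a held-out validation set.
    X, y = make_classification(n_samples=2000, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
    clf = LogisticRegression().fit(X_train, y_train)

    target_precision = 0.8
    probas_val = clf.predict_proba(X_val)[:, 1]

    # Naive approach: keep every threshold whose precision, measured on the
    # validation set, reaches the target (the "green zone" of the plot).
    thresholds = np.linspace(0.01, 0.99, 99)
    valid = [
        t for t in thresholds
        if precision_score(y_val, (probas_val >= t).astype(int),
                           zero_division=0) >= target_precision
    ]
    print(f"{len(valid)} thresholds reach the target precision on the validation set")

Every threshold selected this way is only validated on this particular sample; nothing is said about its behaviour on future data.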

Risk control is the science of adjusting a model's parameter, typically denoted :math:`\lambda`, so that a given risk stays below a desired level with high probability on unseen data.
Note that here, the term *risk* is used to describe an undesirable outcome of the model (e.g., type I error): therefore, it is a value we want to minimize, and in our case, keep under a certain level. Also note that risk control can easily be applied to metrics we want to maximize (e.g., precision), simply by controlling the complement (e.g., 1-precision).

The strength of risk control lies in the statistical guarantees it provides on unseen data. Unlike the naive method presented earlier, it determines a value of :math:`\lambda` that ensures the risk is controlled *beyond* the training data.

Applying risk control to the previous example would allow us to get a new — albeit narrower — range of thresholds (blue zone on the graph) that are **statistically guaranteed**.

.. image:: images/example_with_risk_control.png
    :width: 600
    :align: center

This guarantee is critical in a wide range of use cases, especially in high-stakes applications. Take, for example, medical diagnosis: here, the parameter :math:`\lambda` is the binarization threshold that determines whether a patient is classified as sick. We aim to minimize false negatives (i.e., cases where sick patients are incorrectly diagnosed as healthy), which corresponds to controlling the type II error. In this setting, risk control allows us to find a :math:`\lambda` such that, on future patients, the model’s type II error does not exceed, say, 5%, with high confidence.
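
To give an idea of how such a :math:`\lambda` can be found, here is a simplified, illustration-only sketch in the spirit of RCPS/LTT (not the exact algorithm implemented in MAPIE, and ignoring the multiple-testing correction that LTT applies): for each candidate threshold, it computes a Hoeffding-type upper confidence bound on the type II error estimated on calibration data, and keeps only the thresholds whose bound stays below the target level.

.. code-block:: python

    import numpy as np

    def hoeffding_upper_bound(empirical_risk, n, delta):
        # Upper confidence bound on a risk bounded in [0, 1], valid with
        # probability at least 1 - delta (Hoeffding's inequality).
        return empirical_risk + np.sqrt(np.log(1 / delta) / (2 * n))

    def thresholds_with_controlled_type_ii_error(probas_cal, y_cal, alpha=0.05, delta=0.1):
        # Keep the candidate thresholds whose upper bound on the type II error
        # (sick patients predicted as healthy) stays below alpha.
        sick = (y_cal == 1)
        n_sick = int(sick.sum())
        kept = []
        for t in np.linspace(0.01, 0.99, 99):
            empirical_type_ii = np.mean(probas_cal[sick] < t)
            if hoeffding_upper_bound(empirical_type_ii, n_sick, delta) <= alpha:
                kept.append(t)
        return kept

    # Illustration on synthetic calibration data.
    rng = np.random.default_rng(0)
    y_cal = rng.integers(0, 2, size=1000)
    probas_cal = 0.7 * y_cal + 0.3 * rng.random(1000)
    print(thresholds_with_controlled_type_ii_error(probas_cal, y_cal))

For a fixed threshold, the bound holds with probability at least :math:`1 - \delta`; handling many candidate thresholds simultaneously is precisely what LTT's multiple-testing machinery is designed for.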

To express risk control in mathematical terms, we denote by :math:`R` the risk we want to control, and introduce the following two parameters:

- :math:`\alpha`: the target level below which we want the risk to remain, as shown in the figure below;

.. image:: images/plot_alpha.png
    :width: 600
    :align: center

- :math:`\delta`: the confidence level associated with the risk control.

In other words, the risk is said to be controlled if :math:`R \leq \alpha` with probability at least :math:`1 - \delta`.

Furthermore, there exist two types of risk control, in terms of the guarantees they give:

- Guarantee on the expectation of the risk: :math:`\mathbb{E}(R) \leq \alpha` → CRC;

- Guarantee on the probability that the risk does not exceed :math:`\alpha`: :math:`\mathbb{P}(R \leq \alpha) \geq 1 - \delta` → RCPS/LTT.

.. image:: images/risk_distribution.png
    :width: 600
    :align: center

The plot above gives a visual representation of the difference between the two types of guarantees:

- The risk is controlled in expectation (CRC) if the mean of its distribution over unseen data is below :math:`\alpha`;

- The risk is controlled in probability (RCPS/LTT) if at least a proportion :math:`1 - \delta` of its distribution over unseen data is below :math:`\alpha`.
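
To make the distinction more tangible, the small simulation below (purely illustrative, with an arbitrary Beta distribution standing in for the distribution of the risk over unseen data) checks both conditions numerically:

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(0)
    alpha, delta = 0.1, 0.1

    # Simulated risk values over many draws of unseen data (arbitrary choice,
    # used only to illustrate the two notions of control).
    risks = rng.beta(2, 30, size=100_000)

    controlled_in_expectation = risks.mean() <= alpha                  # CRC-style
    controlled_in_probability = np.mean(risks <= alpha) >= 1 - delta   # RCPS/LTT-style

    print(f"E[R] = {risks.mean():.3f} -> controlled in expectation: {controlled_in_expectation}")
    print(f"P(R <= alpha) = {np.mean(risks <= alpha):.3f} -> controlled in probability: {controlled_in_probability}")

With these arbitrary numbers, the risk happens to be controlled in expectation (its mean is around 0.06) but not in probability (only roughly 83% of its mass lies below :math:`\alpha`), which shows that the two guarantees are genuinely different.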

For a classification problem in a standard independent and identically distributed (i.i.d.) case,
our training data :math:`(X, Y) = \{(x_1, y_1), \ldots, (x_n, y_n)\}` has an unknown distribution :math:`P_{X, Y}`.

For any target level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user to construct a prediction
set :math:`\hat{C}_{n, \alpha}(X_{n+1})` for a new observation :math:`\left( X_{n+1}, Y_{n+1} \right)` with a guarantee
on the specified risk. As mentioned above, RCPS, LTT, and CRC give three slightly different guarantees:

- RCPS: