@@ -10,12 +10,28 @@ Theoretical Description
Three methods for multi-class uncertainty quantification have been implemented in MAPIE so far:
LABEL [1], Adaptive Prediction Sets [2, 3] and Top-K [3].
The difference between these methods is the way the conformity scores are computed.
- The figure below illustrates the three methods implmented in MAPIE:
+ The figure below illustrates the three methods implemented in MAPIE:

.. image:: images/classification_methods.png
    :width: 600
    :align: center

+ For a classification problem in a standard independent and identically distributed (i.i.d.) case,
+ our training data :math:`(X, Y) = \{(x_1, y_1), \ldots, (x_n, y_n)\}` has an unknown distribution :math:`P_{X, Y}`.
+
+ For any risk level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user to construct a prediction
+ set :math:`\hat{C}_{n, \alpha}(X_{n+1})` for a new observation :math:`\left( X_{n+1}, Y_{n+1} \right)` with a guarantee
+ on the marginal coverage such that:
+
+ .. math::
+     P \{ Y_{n+1} \in \hat{C}_{n, \alpha}(X_{n+1}) \} \geq 1 - \alpha
+
+
+ In words, for a typical risk level :math:`\alpha` of :math:`10 \%`, we want to construct prediction sets that contain the true label
+ for at least :math:`90 \%` of the new test data points.
+ Note that the guarantee is possible only on the marginal coverage, and not on the conditional coverage
+ :math:`P \{ Y_{n+1} \in \hat{C}_{n, \alpha}(X_{n+1}) | X_{n+1} = x_{n+1} \}`, which depends on the location of the new test point in the distribution.
+
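+ To make this guarantee concrete, the following minimal sketch (not MAPIE's implementation) runs the split-conformal
+ recipe shared by the three methods: conformity scores are computed on a held-out calibration set, their
+ :math:`(1-\alpha)` quantile (with the standard :math:`\lceil (n+1)(1-\alpha) \rceil / n` finite-sample correction) gives a
+ threshold, and the empirical coverage of the resulting prediction sets is checked on test data. The score used here is
+ the LABEL score described in the next section; the dataset and classifier are arbitrary placeholders.
+
+ .. code-block:: python
+
+     import numpy as np
+     from sklearn.datasets import make_classification
+     from sklearn.linear_model import LogisticRegression
+     from sklearn.model_selection import train_test_split
+
+     alpha = 0.1                                       # risk level: target coverage of 1 - alpha = 90 %
+     X, y = make_classification(n_samples=3000, n_informative=10, n_classes=4, random_state=0)
+     X_fit, X_hold, y_fit, y_hold = train_test_split(X, y, test_size=0.5, random_state=0)
+     X_cal, X_test, y_cal, y_test = train_test_split(X_hold, y_hold, test_size=0.5, random_state=0)
+
+     clf = LogisticRegression(max_iter=1000).fit(X_fit, y_fit)
+
+     # Conformity scores on the calibration set (LABEL score: one minus the estimated probability
+     # of the true class), then their finite-sample corrected (1 - alpha) quantile.
+     cal_scores = 1.0 - clf.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]
+     n = len(cal_scores)
+     k = int(np.ceil((n + 1) * (1 - alpha)))           # corrected rank
+     q_hat = np.sort(cal_scores)[k - 1]                # k-th smallest calibration score
+
+     # Prediction sets on the test set and their empirical marginal coverage.
+     pred_sets = clf.predict_proba(X_test) >= 1 - q_hat           # boolean array (n_test, n_classes)
+     coverage = pred_sets[np.arange(len(y_test)), y_test].mean()
+     print(f"empirical marginal coverage: {coverage:.3f} (target: >= {1 - alpha})")
+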
1. LABEL
--------

@@ -37,7 +53,7 @@ Finally, we construct a prediction set by including all labels with a score high
    \hat{C}(X_{test}) = \{ y : \hat{\mu}(X_{test})_y \geq 1 - \hat{q}\}


- This simple approach allows us to construct prediction sets coming with a theoretical guarantee on the marginal coverage.
+ This simple approach allows us to construct prediction sets which have a theoretical guarantee on the marginal coverage.
However, although this method generally results in small prediction sets, it tends to produce empty ones when the model is uncertain,
for example at the border between two classes.

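+ As a quick illustration of this failure mode, the toy sketch below applies the LABEL thresholding rule to two
+ hypothetical softmax outputs with an assumed calibration quantile :math:`\hat{q} = 0.05`: a confident prediction keeps a
+ single label, while a prediction spread over several classes near a decision border yields an empty set.
+
+ .. code-block:: python
+
+     import numpy as np
+
+     q_hat = 0.05                                    # hypothetical calibration quantile, for illustration only
+     proba_confident = np.array([0.97, 0.02, 0.01])  # confident prediction -> one label kept
+     proba_uncertain = np.array([0.40, 0.35, 0.25])  # near a class border -> no class reaches 1 - q_hat
+
+     for proba in (proba_confident, proba_uncertain):
+         pred_set = np.where(proba >= 1 - q_hat)[0]  # labels kept by the LABEL rule
+         print(proba, "->", pred_set)
+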
@@ -54,7 +70,7 @@ label of the observation :
    s_i(X_i, Y_i) = \sum^k_{j=1} \hat{\mu}(X_i)_{\pi_j} \quad \text{where} \quad Y_i = \pi_j


- The quantile :math:`\hat{q}` is then computed the same way as the score method.
+ The quantile :math:`\hat{q}` is then computed in the same way as for the LABEL method.
For the construction of the prediction set for a new test point, the same ranked-summing procedure is applied until the quantile is reached,
as described in the following equation:

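+ The sketch below (a rough NumPy illustration, not MAPIE's code) mirrors this two-step procedure on synthetic softmax
+ outputs drawn from a Dirichlet distribution: the APS conformity score is the cumulative sum of the sorted probabilities
+ down to the true label, its corrected :math:`(1-\alpha)` quantile gives :math:`\hat{q}`, and a test point's set is filled
+ with labels in decreasing probability order until that quantile is reached.
+
+ .. code-block:: python
+
+     import numpy as np
+
+     rng = np.random.default_rng(0)
+     n_cal, n_classes, alpha = 1000, 5, 0.1
+     proba_cal = rng.dirichlet(np.ones(n_classes), size=n_cal)   # synthetic softmax outputs
+     y_cal = rng.integers(0, n_classes, size=n_cal)
+
+     # APS conformity score: cumulative sum of the sorted probabilities down to the true label.
+     order = np.argsort(-proba_cal, axis=1)                      # classes sorted by decreasing probability
+     sorted_proba = np.take_along_axis(proba_cal, order, axis=1)
+     cumsum = np.cumsum(sorted_proba, axis=1)
+     rank_of_true = np.argmax(order == y_cal[:, None], axis=1)   # position of the true label in that order
+     cal_scores = cumsum[np.arange(n_cal), rank_of_true]
+
+     k = int(np.ceil((n_cal + 1) * (1 - alpha)))                 # finite-sample corrected rank
+     q_hat = np.sort(cal_scores)[k - 1]                          # quantile of the calibration scores
+
+     # Prediction set for one test point: add labels by decreasing probability until q_hat is reached.
+     proba_test = rng.dirichlet(np.ones(n_classes))
+     test_order = np.argsort(-proba_test)
+     test_cumsum = np.cumsum(proba_test[test_order])
+     set_size = int(np.searchsorted(test_cumsum, q_hat)) + 1     # smallest count whose cumulative sum reaches q_hat
+     pred_set = test_order[:set_size]
+     print("prediction set:", pred_set)
+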
@@ -86,6 +102,7 @@ The prediction sets are build by taking the :math:`\hat{q}^{th}` higher scores.
.. math::
    \hat{C}(X_{test}) = \{\pi_1, ..., \pi_{\hat{q}}\}

+ As with other methods, this procedure allows the user to build prediction sets with guarantees on the marginal coverage.


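+ A rough NumPy sketch of this construction (again on synthetic softmax outputs, not MAPIE's code) is given below: the
+ conformity score is the rank of the true label among the sorted probabilities, :math:`\hat{q}` is the corrected
+ :math:`(1-\alpha)` quantile of these ranks, and every test point then receives the same-sized set of its :math:`\hat{q}`
+ most probable labels.
+
+ .. code-block:: python
+
+     import numpy as np
+
+     rng = np.random.default_rng(0)
+     n_cal, n_classes, alpha = 1000, 5, 0.1
+     proba_cal = rng.dirichlet(np.ones(n_classes), size=n_cal)   # synthetic softmax outputs
+     y_cal = rng.integers(0, n_classes, size=n_cal)
+
+     # Conformity score: 1-based rank of the true label when classes are sorted by decreasing probability.
+     order = np.argsort(-proba_cal, axis=1)
+     ranks = np.argmax(order == y_cal[:, None], axis=1) + 1
+
+     k = int(np.ceil((n_cal + 1) * (1 - alpha)))                 # finite-sample corrected rank
+     q_hat = int(np.sort(ranks)[k - 1])                          # quantile of the ranks
+
+     # Every test point receives a prediction set of the same size: its q_hat most probable labels.
+     proba_test = rng.dirichlet(np.ones(n_classes), size=3)
+     pred_sets = np.argsort(-proba_test, axis=1)[:, :q_hat]
+     print("set size:", q_hat)
+     print(pred_sets)
+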
4. Split- and cross-conformal methods
@@ -168,8 +185,8 @@ where :

.. TO BE CONTINUED

- References
- ==========
+ 5. References
+ -------------

[1] Mauricio Sadinle, Jing Lei, & Larry Wasserman.
"Least Ambiguous Set-Valued Classifiers With Bounded Error Levels."