
Commit 2b16e09

Merge pull request #187 from scikit-learn-contrib/fix-typos-in-doc
Fix typos in doc
2 parents 4c1caf0 + 1efd52e

File tree: 5 files changed, +21 −14 lines changed


HISTORY.rst

Lines changed: 4 additions & 0 deletions
@@ -4,13 +4,17 @@ History

 0.4.0 (2022-06-24)
 ------------------
+
 * Relax and fix typing
 * Add Split Conformal Quantile Regression
 * Add EnbPI method for Time Series Regression
 * Add EnbPI Documentation
+* Add example with heteroscedastic data
+* Add `ConformityScore` class that allows the user to define custom conformity scores

 0.3.2 (2022-03-11)
 ------------------
+
 * Refactorize unit tests
 * Add "naive" and "top-k" methods in MapieClassifier
 * Include J+aB method in regression tutorial
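Editor's note: the `ConformityScore` entry added above is the extension point for user-defined scores. As a minimal sketch only, assuming the interface mirrors the built-in absolute score with `get_signed_conformity_scores` and `get_estimation_distribution` as the two abstract methods (both names are assumptions about this-era MAPIE), a relative-residual score could look like this:

import numpy as np

from mapie.conformity_scores import ConformityScore


class ScaledConformityScore(ConformityScore):
    """Hypothetical custom score: residuals scaled by the prediction,
    for noise that grows with the predicted value (assumes y_pred > 0)."""

    def __init__(self):
        # sym=True: the score distribution is treated as symmetric
        super().__init__(sym=True)

    def get_signed_conformity_scores(self, y, y_pred):
        # Signed relative residuals: (y - y_pred) / y_pred
        return np.divide(np.subtract(y, y_pred), y_pred)

    def get_estimation_distribution(self, y_pred, conformity_scores):
        # Inverse transform back to the observation scale, so that
        # get_estimation_distribution(y_pred, scores) recovers y
        return np.multiply(y_pred, np.add(1, conformity_scores))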

README.rst

Lines changed: 2 additions & 2 deletions
@@ -44,7 +44,7 @@ single-output regression or multi-class classification settings.

 Prediction intervals output by **MAPIE** encompass both aleatoric and epistemic
 uncertainties and are backed by strong theoretical guarantees thanks to conformal
-prediction methods [1-5].
+prediction methods [1-7].


 🔗 Requirements
@@ -76,7 +76,7 @@ To install directly from the github repository :

 .. code:: sh

-    $ pip install git+https://github.com/simai-ml/MAPIE
+    $ pip install git+https://github.com/scikit-learn-contrib/MAPIE


 ⚡️ Quickstart

doc/quick_start.rst

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ To install directly from the github repository :

 .. code:: python

-    pip install git+https://github.com/simai-ml/MAPIE
+    pip install git+https://github.com/scikit-learn-contrib/MAPIE


 2. Run MapieRegressor
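Editor's note: the "Run MapieRegressor" step that follows this hunk in quick_start.rst amounts to a few lines. A sketch in the spirit of that Quickstart (the data and base estimator here are illustrative):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

from mapie.regression import MapieRegressor

X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=59)

# Jackknife+ ("plus") resampling over 5 cross-validation folds
mapie = MapieRegressor(LinearRegression(), method="plus", cv=5)
mapie.fit(X, y)

# y_pis has shape (n_samples, 2, n_alpha): lower/upper bound per alpha
y_pred, y_pis = mapie.predict(X, alpha=[0.05, 0.32])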

doc/theoretical_description_classification.rst

Lines changed: 3 additions & 3 deletions
@@ -19,16 +19,16 @@ The figure below illustrates the three methods implemented in MAPIE:
 For a classification problem in a standard independent and identically distributed (i.i.d) case,
 our training data :math:`(X, Y) = \{(x_1, y_1), \ldots, (x_n, y_n)\}` has an unknown distribution :math:`P_{X, Y}`.

-For any risk level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user construct a prediction
+For any risk level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user to construct a prediction
 set :math:`\hat{C}_{n, \alpha}(X_{n+1})` for a new observation :math:`\left( X_{n+1},Y_{n+1} \right)` with a guarantee
 on the marginal coverage such that :

 .. math::
     P \{Y_{n+1} \in \hat{C}_{n, \alpha}(X_{n+1}) \} \geq 1 - \alpha


-In words, for a typical risk level $\alpha$ of $10 \%$, we want to construct prediction sets that contain the true observations
-for at least $90 \%$ of the new test data points.
+In words, for a typical risk level :math:`\alpha` of :math:`10 \%`, we want to construct prediction sets that contain the true observations
+for at least :math:`90 \%` of the new test data points.
 Note that the guarantee is possible only on the marginal coverage, and not on the conditional coverage
 :math:`P \{Y_{n+1} \in \hat{C}_{n, \alpha}(X_{n+1}) | X_{n+1} = x_{n+1} \}` which depends on the location of the new test point in the distribution.
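Editor's note: to make the marginal coverage guarantee above concrete, here is a hedged sketch of checking empirical coverage with MapieClassifier; it assumes the "score" method and the `classification_coverage_score` helper behave as in the MAPIE docs of this era:

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from mapie.classification import MapieClassifier
from mapie.metrics import classification_coverage_score

X, y = make_blobs(n_samples=2000, centers=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mapie = MapieClassifier(LogisticRegression(), method="score", cv=5)
mapie.fit(X_train, y_train)

# alpha = 0.1: target marginal coverage of at least 90%
y_pred, y_ps = mapie.predict(X_test, alpha=0.1)

# y_ps is a boolean array (n_samples, n_classes, n_alpha); the fraction of
# test points whose true label lands in the set should be >= 1 - alpha
coverage = classification_coverage_score(y_test, y_ps[:, :, 0])
print(coverage)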

doc/theoretical_description_regression.rst

Lines changed: 11 additions & 8 deletions
@@ -48,7 +48,7 @@ optimistic and under-estimates the width of prediction intervals because of a po
 As a result, the probability that a new point lies in the interval given by the
 naive method would be lower than the target level :math:`(1-\alpha)`.

-The figure below illustrates the Naive method.
+The figure below illustrates the naive method.

 .. image:: images/jackknife_naive.png
     :width: 200
@@ -237,8 +237,8 @@ residuals of the estimator fitted on the calibration set. Note that in the symme
 As justified by [3], this method offers a theoretical guarantee of the target coverage
 level :math:`1-\alpha`.

-Note that this means that using the split method will require to run three separate regressions
-to estimate the prediction intervals.
+Note that only the split method has been implemented and that it will run three separate
+regressions when using :class:`mapie.quantile_regression.MapieQuantileRegressor`.


 9. The ensemble batch prediction intervals (EnbPI) method
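Editor's note: the three regressions mentioned in this hunk are the lower-quantile, upper-quantile and median fits of conformalized quantile regression. As a hedged sketch, assuming `MapieQuantileRegressor` takes the target `alpha` at construction, clones a quantile-capable base estimator for the three fits, and accepts an explicit calibration set (the `X_calib`/`y_calib` keywords are an assumption about this-era API):

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

from mapie.quantile_regression import MapieQuantileRegressor

X, y = make_regression(n_samples=1000, n_features=1, noise=20, random_state=59)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=59)
X_calib, X_test, y_calib, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=59)

# A base estimator that supports quantile loss
base = GradientBoostingRegressor(loss="quantile", alpha=0.5)

# alpha=0.1: the three internal fits target quantiles 0.05, 0.95 and the median
mapie_cqr = MapieQuantileRegressor(base, alpha=0.1)
mapie_cqr.fit(X_train, y_train, X_calib=X_calib, y_calib=y_calib)
y_pred, y_pis = mapie_cqr.predict(X_test)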
@@ -261,7 +261,7 @@ However the confidence intervals are like those of the jackknife method.
 where :math:`\hat{\mu}_{agg}(X_{n+1})` is the aggregation of the predictions of
 the LOO estimators (mean or median), and
 :math:`R_i^{\rm LOO} = |Y_i - \hat{\mu}_{-i}(X_{i})|`
-is the residual of the LOO estimator :math:`\hat{\mu}_{-i}` at :math:`X_{i}`.
+is the residual of the LOO estimator :math:`\hat{\mu}_{-i}` at :math:`X_{i}` [4].

 The residuals are no longer considered in absolute values but in relative
 values and the width of the confidence intervals are minimized, up to a given gap
@@ -277,7 +277,7 @@ hypotheses:
 1. Errors are short-term independent and identically distributed (i.i.d)

 2. Estimation quality: there exists a real sequence :math:`(\delta_T)_{T > 0}`
-   that converges to zero such that 
+   that converges to zero such that

 .. math::
     \frac{1}{T}\sum_1^T(\hat{\mu}_{-t}(x_t) - \mu(x_t))^2 < \delta_T^2
@@ -288,8 +288,8 @@ The coverage level depends on the size of the training set and on
 Be careful: the bigger the training set, the better the covering guarantee
 for the point following the training set. However, if the residuals are
 updated gradually, but the model is not refitted, the bigger the training set
-is, the slower the update of the residuals is effective. Therefore there is a 
-compromise to take on the number of training samples to fit the model and
+is, the slower the update of the residuals is effective. Therefore there is a
+compromise to make on the number of training samples to fit the model and
 update the prediction intervals.
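Editor's note: this gradual-update regime is what the EnbPI entry point is for. A hedged sketch, assuming, as the changelog suggests, that EnbPI is exposed as `MapieTimeSeriesRegressor` (import path is an assumption for this MAPIE version) with a `BlockBootstrap` resampler and a `partial_fit` method that refreshes the residuals without refitting:

import numpy as np
from sklearn.ensemble import RandomForestRegressor

from mapie.subsample import BlockBootstrap
from mapie.time_series_regression import MapieTimeSeriesRegressor

# Toy autocorrelated series: predict y_t from its lagged value
rng = np.random.default_rng(59)
series = np.cumsum(rng.normal(size=600))
X, y = series[:-1].reshape(-1, 1), series[1:]
X_train, y_train, X_new, y_new = X[:500], y[:500], X[500:], y[500:]

cv = BlockBootstrap(n_resamplings=30, length=10, random_state=59)
mapie_enbpi = MapieTimeSeriesRegressor(
    RandomForestRegressor(n_estimators=50), method="enbpi", cv=cv
)
mapie_enbpi.fit(X_train, y_train)
y_pred, y_pis = mapie_enbpi.predict(X_new, alpha=0.05, ensemble=True)

# As new observations arrive, update the residuals without refitting the model
mapie_enbpi.partial_fit(X_new, y_new)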

@@ -318,6 +318,9 @@ Key takeaways
   theoretical and practical coverages due to the larger widths of the prediction intervals.
   It is therefore advised to use them when conservative estimates are needed.

+- The conformalized quantile regression method allows for more adaptiveness on the prediction
+  intervals which becomes key when faced with heteroscedastic data.
+
 - If the "exchangeability hypothesis" is not valid, typically for time series,
   use EnbPI, and update the residuals each time new observations are available.
@@ -345,6 +348,6 @@ References
 [3] Yaniv Romano, Evan Patterson, Emmanuel J. Candès.
 "Conformalized Quantile Regression." Advances in neural information processing systems 32 (2019).

-[7] Chen Xu and Yao Xie.
+[4] Chen Xu and Yao Xie.
 "Conformal Prediction Interval for Dynamic Time-Series."
 International Conference on Machine Learning (ICML, 2021).
