@@ -48,7 +48,7 @@ optimistic and under-estimates the width of prediction intervals because of a po
 As a result, the probability that a new point lies in the interval given by the
 naive method would be lower than the target level :math:`(1-\alpha)`.

-The figure below illustrates the Naive method.
+The figure below illustrates the naive method.

 .. image:: images/jackknife_naive.png
     :width: 200
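For concreteness, here is a minimal sketch of the naive strategy, assuming the scikit-learn-style :class:`MapieRegressor` API and its ``method="naive"`` option:

.. code-block:: python

    # A minimal sketch of the naive strategy, assuming MAPIE's
    # scikit-learn-style MapieRegressor API with method="naive".
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from mapie.regression import MapieRegressor

    rng = np.random.default_rng(42)
    X = rng.uniform(-1, 1, size=(500, 1))
    y = 2 * X.ravel() + rng.normal(scale=0.5, size=500)

    mapie = MapieRegressor(estimator=LinearRegression(), method="naive")
    mapie.fit(X, y)

    # y_pis has shape (n_samples, 2, n_alpha): lower and upper bounds.
    y_pred, y_pis = mapie.predict(X, alpha=0.1)  # target coverage 1 - alpha = 90%

Because the residuals are computed on the training data themselves, the empirical coverage of these intervals can fall below the 90% target, as noted above.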
@@ -237,8 +237,8 @@ residuals of the estimator fitted on the calibration set. Note that in the symme
 As justified by [3], this method offers a theoretical guarantee of the target coverage
 level :math:`1-\alpha`.

-Note that this means that using the split method will require to run three separate regressions
-to estimate the prediction intervals.
+Note that only the split method has been implemented and that it will run three separate
+regressions when using :class:`mapie.quantile_regression.MapieQuantileRegressor`.
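As an illustration, here is a minimal sketch of the split CQR workflow; the ``fit(X_calib=..., y_calib=...)`` signature and the use of a quantile-capable base estimator are assumptions about the API rather than a definitive recipe:

.. code-block:: python

    # A minimal sketch of split conformalized quantile regression; the
    # fit(X_calib=..., y_calib=...) signature is an assumption here.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split
    from mapie.quantile_regression import MapieQuantileRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(1000, 1))
    y = X.ravel() * rng.normal(loc=1.0, scale=0.3, size=1000)  # heteroscedastic noise

    X_train, X_rest, y_train, y_rest = train_test_split(X, y, random_state=0)
    X_calib, X_test, y_calib, y_test = train_test_split(X_rest, y_rest, random_state=0)

    # Three regressions are fitted internally: the alpha/2 and 1 - alpha/2
    # conditional quantiles plus the median, then conformalized on the
    # calibration set.
    est = GradientBoostingRegressor(loss="quantile")
    mapie_qr = MapieQuantileRegressor(estimator=est, alpha=0.1)
    mapie_qr.fit(X_train, y_train, X_calib=X_calib, y_calib=y_calib)
    y_pred, y_pis = mapie_qr.predict(X_test)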
 9. The ensemble batch prediction intervals (EnbPI) method
@@ -261,7 +261,7 @@ However the confidence intervals are like those of the jackknife method.
 where :math:`\hat{\mu}_{agg}(X_{n+1})` is the aggregation of the predictions of
 the LOO estimators (mean or median), and
 :math:`R_i^{\rm LOO} = |Y_i - \hat{\mu}_{-i}(X_{i})|`
-is the residual of the LOO estimator :math:`\hat{\mu}_{-i}` at :math:`X_{i}`.
+is the residual of the LOO estimator :math:`\hat{\mu}_{-i}` at :math:`X_{i}` [4].

 The residuals are no longer considered in absolute values but in relative
 values and the width of the confidence intervals is minimized, up to a given gap
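To make the interval construction concrete, here is a toy NumPy sketch of the jackknife-style aggregation above (before the relative-residual refinement); the residuals and LOO predictions are stand-in values, not the output of a fitted model:

.. code-block:: python

    # Toy sketch: aggregated LOO prediction +/- the (1 - alpha)-quantile
    # of the absolute LOO residuals. All values are stand-ins.
    import numpy as np

    rng = np.random.default_rng(1)
    loo_residuals = np.abs(rng.normal(scale=0.5, size=200))  # R_i^LOO
    loo_preds_new = 3.0 + rng.normal(scale=0.1, size=200)    # mu_hat_{-i}(X_{n+1})

    alpha = 0.1
    mu_agg = np.median(loo_preds_new)              # aggregation (median or mean)
    width = np.quantile(loo_residuals, 1 - alpha)  # (1 - alpha)-quantile of residuals
    lower, upper = mu_agg - width, mu_agg + width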
@@ -277,7 +277,7 @@ hypotheses:
 1. Errors are short-term independent and identically distributed (i.i.d)

 2. Estimation quality: there exists a real sequence :math:`(\delta_T)_{T > 0}`
-that converges to zero such that
+that converges to zero such that

 .. math::
     \frac{1}{T}\sum_{t=1}^T (\hat{\mu}_{-t}(x_t) - \mu(x_t))^2 < \delta_T^2
@@ -288,8 +288,8 @@ The coverage level depends on the size of the training set and on
 Be careful: the bigger the training set, the better the coverage guarantee
 for the point following the training set. However, if the residuals are
 updated gradually, but the model is not refitted, the bigger the training set
-is, the slower the update of the residuals is effective. Therefore there is a
-compromise to take on the number of training samples to fit the model and
+is, the slower the update of the residuals takes effect. Therefore there is a
+compromise to make on the number of training samples used to fit the model and
 update the prediction intervals.
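In practice, this fit-once / update-residuals workflow can be sketched as follows, assuming MAPIE's :class:`MapieTimeSeriesRegressor` with ``method="enbpi"``, a ``BlockBootstrap`` resampler, and ``partial_fit`` to refresh the residuals without refitting:

.. code-block:: python

    # A sketch of EnbPI with gradual residual updates; the exact
    # MapieTimeSeriesRegressor/BlockBootstrap signatures are assumptions.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from mapie.subsample import BlockBootstrap
    from mapie.time_series_regression import MapieTimeSeriesRegressor

    rng = np.random.default_rng(2)
    t = np.arange(600, dtype=float)
    y = np.sin(t / 20) + rng.normal(scale=0.2, size=600)
    X = t.reshape(-1, 1)
    X_train, y_train = X[:500], y[:500]
    X_new, y_new = X[500:], y[500:]

    cv = BlockBootstrap(n_resamplings=30, length=50, overlapping=True, random_state=2)
    mapie_ts = MapieTimeSeriesRegressor(
        estimator=RandomForestRegressor(n_estimators=50, random_state=2),
        method="enbpi",
        cv=cv,
        agg_function="mean",
    )
    mapie_ts.fit(X_train, y_train)
    y_pred, y_pis = mapie_ts.predict(X_new, alpha=0.05, ensemble=True)

    # Update the residuals with the new observations, without refitting.
    mapie_ts.partial_fit(X_new, y_new)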
@@ -318,6 +318,9 @@ Key takeaways
   theoretical and practical coverages due to the larger widths of the prediction intervals.
   It is therefore advised to use them when conservative estimates are needed.

+- The conformalized quantile regression method produces more adaptive prediction
+  intervals, which is key in the presence of heteroscedastic data.
+
 - If the "exchangeability hypothesis" is not valid, typically for time series,
   use EnbPI, and update the residuals each time new observations are available.
@@ -345,6 +348,6 @@ References
 [3] Yaniv Romano, Evan Patterson, Emmanuel J. Candès.
 "Conformalized Quantile Regression." Advances in Neural Information Processing Systems 32 (2019).

-[7] Chen Xu and Yao Xie.
+[4] Chen Xu and Yao Xie.
 "Conformal Prediction Interval for Dynamic Time-Series."
 International Conference on Machine Learning (ICML, 2021).