scikit-learn-contrib
diff --git a/AUTHORS.rst b/AUTHORS.rst
Lines changed: 1 addition & 0 deletions
diff --git a/README.rst b/README.rst
Lines changed: 1 addition & 1 deletion
diff --git a/doc/theoretical_description_regression.rst b/doc/theoretical_description_regression.rst
Lines changed: 7 additions & 7 deletions
diff --git a/examples/regression/1-quickstart/plot_compare_conformity_scores.py b/examples/regression/1-quickstart/plot_compare_conformity_scores.py
Lines changed: 189 additions & 0 deletions
diff --git a/examples/regression/1-quickstart/plot_prefit_nn.py b/examples/regression/1-quickstart/plot_prefit_nn.py
Lines changed: 1 addition & 1 deletion
diff --git a/examples/regression/1-quickstart/plot_timeseries_example.py b/examples/regression/1-quickstart/plot_timeseries_example.py
Lines changed: 2 additions & 2 deletions
diff --git a/examples/regression/2-advanced-analysis/plot_nested-cv.py b/examples/regression/2-advanced-analysis/plot_nested-cv.py
Lines changed: 2 additions & 2 deletions
diff --git a/examples/regression/3-scientific-articles/plot_kim2020_simulations.py b/examples/regression/3-scientific-articles/plot_kim2020_simulations.py
Lines changed: 1 addition & 1 deletion
@@ -22,4 +22,5 @@ Contributors
 * Julien Roussel <[email protected]>
 * Vincent Blot <[email protected]>
 * Louis Lacombe <[email protected]>
+* Arnaud Capitaine <[email protected]>
 To be continued ...
@@ -167,7 +167,7 @@ The full documentation can be found `on this link <https://mapie.readthedocs.io/
 
 **How does MAPIE work on regression?** It is based on cross-validation and relies on:
 
-- Residuals on the whole training set obtained by cross-validation,
+- Conformity scores on the whole training set obtained by cross-validation,
 - Perturbed models generated during the cross-validation.
 
 **MAPIE** then combines all these elements in a way that provides prediction intervals on new data with strong theoretical guarantees [1-2].
 
@@ -33,7 +33,7 @@ The so-called naive method computes the residuals of the training data to estima
 typical error obtained on a new test data point. 
 The prediction interval is therefore given by the prediction obtained by the 
 model trained on the entire training set :math:`\pm` the quantiles of the 
-residuals of the same training set:
+conformity scores of the same training set:
 
 .. math:: \hat{\mu}(X_{n+1}) \pm ((1-\alpha) \textrm{quantile of} |Y_1-\hat{\mu}(X_1)|, ..., |Y_n-\hat{\mu}(X_n)|)
 
@@ -43,7 +43,7 @@ or
 
 where :math:`\hat{q}_{n, \alpha}^+` is the :math:`(1-\alpha)` quantile of the distribution.
 
-Since this method estimates the residuals only on the training set, it tends to be too 
+Since this method estimates the conformity scores only on the training set, it tends to be too 
 optimistic and under-estimates the width of prediction intervals because of a potential overfit. 
 As a result, the probability that a new point lies in the interval given by the 
 naive method would be lower than the target level :math:`(1-\alpha)`.
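A minimal sketch of the naive method described in this hunk, written in plain Python rather than with MAPIE. The helper names (`fit_line`, `empirical_quantile`, `naive_interval`) and the synthetic data are hypothetical, and a one-dimensional least-squares line stands in for the regression function :math:`\hat{\mu}`. The conformity scores are the absolute residuals of the same points used for fitting, which is why the resulting interval tends to be too narrow.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in for mu-hat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def empirical_quantile(values, level):
    """Smallest order statistic whose rank is at least level * n."""
    ordered = sorted(values)
    rank = min(len(ordered), max(1, math.ceil(level * len(ordered))))
    return ordered[rank - 1]


def naive_interval(xs, ys, x_new, alpha):
    """Prediction +/- the (1 - alpha) quantile of the training residuals."""
    slope, intercept = fit_line(xs, ys)
    # Conformity scores computed on the very points used for fitting.
    scores = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
    q = empirical_quantile(scores, 1 - alpha)
    pred = slope * x_new + intercept
    return pred - q, pred + q


# Synthetic data, for illustration only.
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(100)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
lower, upper = naive_interval(xs, ys, x_new=0.5, alpha=0.1)
print(f"naive interval at x=0.5: [{lower:.3f}, {upper:.3f}]")
```

By construction, at least a fraction :math:`1-\alpha` of the training points fall inside their own interval. That says nothing about new points, which is the limitation discussed above.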
@@ -65,11 +65,11 @@ Estimating the prediction intervals is carried out in three main steps:
   :math:`\hat{\mu}_{-i}` on the entire training set with the :math:`i^{th}` point removed,
   resulting in *n* leave-one-out models.
 
-- The corresponding leave-one-out residual is computed for each :math:`i^{th}` point
+- The corresponding leave-one-out conformity score is computed for each :math:`i^{th}` point
   :math:`|Y_i - \hat{\mu}_{-i}(X_i)|`.
 
 - We fit the regression function :math:`\hat{\mu}` on the entire training set and we compute
-  the prediction interval using the computed leave-one-out residuals:
+  the prediction interval using the computed leave-one-out conformity scores:
 
 .. math:: \hat{\mu}(X_{n+1}) \pm ((1-\alpha) \textrm{ quantile of } |Y_1-\hat{\mu}_{-1}(X_1)|, ..., |Y_n-\hat{\mu}_{-n}(X_n)|)
 
@@ -81,7 +81,7 @@ where
 
 .. math:: R_i^{\rm LOO} = |Y_i - \hat{\mu}_{-i}(X_i)|
 
-is the *leave-one-out* residual.
+is the *leave-one-out* conformity score.
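The three jackknife steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not MAPIE's implementation: `fit_line` and `jackknife_interval` are hypothetical names, a least-squares line stands in for :math:`\hat{\mu}`, and the quantile is taken as the :math:`\lceil (1-\alpha)(n+1) \rceil`-th smallest score, the usual finite-sample convention for :math:`\hat{q}_{n, \alpha}^+`.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in for mu-hat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def jackknife_interval(xs, ys, x_new, alpha):
    """Return (lower, upper, leave-one-out conformity scores)."""
    n = len(xs)
    loo_scores = []
    for i in range(n):
        # Steps 1 and 2: refit without point i, then score point i.
        slope_i, intercept_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        loo_scores.append(abs(ys[i] - (slope_i * xs[i] + intercept_i)))
    # Assumed convention: ceil((1 - alpha) * (n + 1))-th smallest score.
    rank = min(n, math.ceil((1 - alpha) * (n + 1)))
    q = sorted(loo_scores)[rank - 1]
    # Step 3: the point prediction comes from the model fitted on all points.
    slope, intercept = fit_line(xs, ys)
    pred = slope * x_new + intercept
    return pred - q, pred + q, loo_scores


# Synthetic data, for illustration only.
rng = random.Random(1)
xs = [i / 10 for i in range(30)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.2) for x in xs]
lower, upper, loo_scores = jackknife_interval(xs, ys, x_new=1.5, alpha=0.1)
print(f"jackknife interval at x=1.5: [{lower:.3f}, {upper:.3f}]")
```

For a least-squares fit, each leave-one-out score is at least as large as the corresponding in-sample residual, so these intervals are wider than the naive ones.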
 
 This method avoids the overfitting problem but can lose its predictive 
 coverage when :math:`\hat{\mu}` becomes unstable, for example when the 
@@ -146,7 +146,7 @@ is performed in four main steps:
 - *K* regression functions :math:`\hat{\mu}_{-S_k}` are fitted on the training set with the 
   corresponding :math:`k^{th}` fold removed.
 
-- The corresponding *out-of-fold* residual is computed for each :math:`i^{th}` point 
+- The corresponding *out-of-fold* conformity score is computed for each :math:`i^{th}` point 
   :math:`|Y_i - \hat{\mu}_{-S_{k(i)}}(X_i)|` where *k(i)* is the fold containing *i*.
 
 - Similar to the jackknife+, the regression functions :math:`\hat{\mu}_{-S_{k(i)}}(X_i)` 
@@ -198,7 +198,7 @@ jackknife+-after-bootstrap is performed in four main steps:
 
 
 - These predictions are aggregated according to a given aggregation function 
-  :math:`{\rm agg}`, typically :math:`{\rm mean}` or :math:`{\rm median}`, and the residuals 
+  :math:`{\rm agg}`, typically :math:`{\rm mean}` or :math:`{\rm median}`, and the conformity scores 
   :math:`|Y_j - {\rm agg}(\hat{\mu}(B_{K(j)}(X_j)))|` are computed for each :math:`X_j`
   (with :math:`K(j)` the bootstraps not containing :math:`X_j`).
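The aggregation step can be sketched in plain Python as below. The names `fit_line` and `jab_conformity_scores` are hypothetical, a least-squares line stands in for each bootstrapped model, and points that happen to be drawn in every bootstrap are simply skipped. That last choice is an assumption of this sketch, not something stated in the text above.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in for mu-hat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def jab_conformity_scores(xs, ys, n_bootstraps, agg=statistics.mean, seed=0):
    """Map each index j to |Y_j - agg(predictions of bootstraps without j)|."""
    rng = random.Random(seed)
    n = len(xs)
    fitted = []
    for _ in range(n_bootstraps):
        # Resample n indices with replacement and fit one model per bootstrap.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        fitted.append((set(idx), slope, intercept))
    scores = {}
    for j in range(n):
        # K(j): only the bootstraps whose sample does not contain point j.
        preds = [s * xs[j] + b for members, s, b in fitted if j not in members]
        if preds:  # assumption: skip points present in every bootstrap
            scores[j] = abs(ys[j] - agg(preds))
    return scores


# Synthetic data, for illustration only.
rng = random.Random(2)
xs = [i / 10 for i in range(30)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.2) for x in xs]
scores = jab_conformity_scores(xs, ys, n_bootstraps=20, agg=statistics.median)
print(f"{len(scores)} conformity scores computed from 20 bootstraps")
```

Passing `statistics.mean` or `statistics.median` as `agg` mirrors the two aggregation functions mentioned above.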
 
 
@@ -0,0 +1,189 @@
+"""
+===========================================================
+Estimating prediction intervals of Gamma distributed target
+===========================================================
+This example uses :class:`mapie.regression.MapieRegressor` to estimate
+prediction intervals associated with a Gamma-distributed target.
+It illustrates the limitation of the absolute residual conformity score.
+
+We use here the OpenML house_prices dataset:
+https://www.openml.org/search?type=data&sort=runs&id=42165&status=active.
+
+The data is modelled by a Random Forest model
+:class:`sklearn.ensemble.RandomForestRegressor` with a fixed parameter set.
+The prediction intervals are determined by means of the MAPIE regressor
+:class:`mapie.regression.MapieRegressor` considering two conformity scores:
+:class:`mapie.conformity_scores.AbsoluteConformityScore` which
+considers the absolute residuals as the conformity scores and
+:class:`mapie.conformity_scores.GammaConformityScore` which
+considers the residuals divided by the predicted means as conformity scores.
+We consider the standard CV+ resampling method.
+
+We would like to emphasize one main limitation with this example.
+With the default conformity score, the prediction intervals
+have approximately the same width over the range of house prices, which may
+be inappropriate when the price range is wide. The Gamma conformity score
+overcomes this issue by considering prediction intervals with width
+proportional to the predicted mean. For low prices, the Gamma prediction
+intervals are narrower than the default ones, whereas for high prices
+they are wider but visually more relevant.
+The empirical coverage is similar between the two conformity scores.
+"""
+import matplotlib.pyplot as plt
+import numpy as np
+
+from sklearn.datasets import fetch_openml
+from sklearn.ensemble import RandomForestRegressor
+from sklearn.model_selection import train_test_split
+
+from mapie.conformity_scores import GammaConformityScore
+from mapie.metrics import regression_coverage_score
+from mapie.regression import MapieRegressor
+
+np.random.seed(0)
+
+# Parameters
+features = [
+    "MSSubClass",
+    "LotArea",
+    "OverallQual",
+    "OverallCond",
+    "GarageArea",
+]
+alpha = 0.05
+rf_kwargs = {"n_estimators": 10, "random_state": 0}
+model = RandomForestRegressor(**rf_kwargs)
+
+##############################################################################
+# 1. Load dataset with a target following approximately a Gamma distribution
+# -----------------------------------------------------------------------------
+#
+# We start by loading a dataset with a target following approximately
+# a Gamma distribution. The GammaConformityScore is relevant in such cases.
+# Two sub datasets are extracted: the training and test ones.
+
+X, y = fetch_openml(name="house_prices", return_X_y=True)
+
+X_train, X_test, y_train, y_test = train_test_split(
+    X[features], y, test_size=0.2
+)
+
+##############################################################################
+# 2. Train model with two conformity scores
+# -----------------------------------------
+#
+# Two models are trained with two different conformity scores:
+#
+# - :class:`mapie.conformity_scores.AbsoluteConformityScore` (default
+#   conformity score), relevant for targets that can be positive as well as
+#   negative. The prediction interval widths are, in this case, approximately
+#   the same over the range of predictions.
+#
+# - :class:`mapie.conformity_scores.GammaConformityScore`, relevant for
+#   targets following roughly a Gamma distribution. The prediction interval
+#   widths scale with the predicted value.
+
+##############################################################################
+# First, train the model with
+# :class:`mapie.conformity_scores.AbsoluteConformityScore`.
+mapie = MapieRegressor(model)
+mapie.fit(X_train, y_train)
+y_pred_absconfscore, y_pis_absconfscore = mapie.predict(X_test, alpha=alpha)
+
+coverage_absconfscore = regression_coverage_score(
+    y_test, y_pis_absconfscore[:, 0, 0], y_pis_absconfscore[:, 1, 0]
+)
+
+##############################################################################
+# Prepare the results for matplotlib. Get the prediction intervals and their
+# corresponding widths.
+
+
+def get_yerr(y_pred, y_pis):
+    return np.concatenate(
+        [
+            np.expand_dims(y_pred, 0) - y_pis[:, 0, 0].T,
+            y_pis[:, 1, 0].T - np.expand_dims(y_pred, 0),
+        ],
+        axis=0,
+    )
+
+
+yerr_absconfscore = get_yerr(y_pred_absconfscore, y_pis_absconfscore)
+pred_int_width_absconfscore = (
+    y_pis_absconfscore[:, 1, 0] - y_pis_absconfscore[:, 0, 0]
+)
+
+##############################################################################
+# Then, train the model with
+# :class:`mapie.conformity_scores.GammaConformityScore`.
+mapie = MapieRegressor(model, conformity_score=GammaConformityScore())
+mapie.fit(X_train, y_train)
+y_pred_gammaconfscore, y_pis_gammaconfscore = mapie.predict(
+    X_test, alpha=[alpha]
+)
+
+coverage_gammaconfscore = regression_coverage_score(
+    y_test, y_pis_gammaconfscore[:, 0, 0], y_pis_gammaconfscore[:, 1, 0]
+)
+
+yerr_gammaconfscore = get_yerr(y_pred_gammaconfscore, y_pis_gammaconfscore)
+pred_int_width_gammaconfscore = (
+    y_pis_gammaconfscore[:, 1, 0] - y_pis_gammaconfscore[:, 0, 0]
+)
+
+
+##############################################################################
+# 3. Compare the prediction intervals
+# -----------------------------------
+#
+# Once the models have been trained, we now compare the prediction intervals
+# obtained from the two conformity scores. We can see that the
+# :class:`mapie.conformity_scores.AbsoluteConformityScore` generates
+# prediction intervals with almost the same width for all the predicted
+# values. Conversely, the
+# :class:`mapie.conformity_scores.GammaConformityScore` yields prediction
+# intervals with widths scaling with the predicted values.
+#
+# The choice of the conformity score depends on the problem we face.
+
+fig, axs = plt.subplots(2, 2, figsize=(10, 10))
+
+for img_id, y_pred, y_err, cov, class_name, int_width in zip(
+    [0, 1],
+    [y_pred_absconfscore, y_pred_gammaconfscore],
+    [yerr_absconfscore, yerr_gammaconfscore],
+    [coverage_absconfscore, coverage_gammaconfscore],
+    ["AbsoluteConformityScore", "GammaConformityScore"],
+    [pred_int_width_absconfscore, pred_int_width_gammaconfscore],
+):
+    axs[0, img_id].errorbar(
+        y_test,
+        y_pred,
+        yerr=y_err,
+        alpha=0.5,
+        linestyle="None",
+    )
+    axs[0, img_id].scatter(y_test, y_pred, s=1, color="black")
+    axs[0, img_id].plot(
+        [0, max(max(y_test), max(y_pred))],
+        [0, max(max(y_test), max(y_pred))],
+        "-r",
+    )
+    axs[0, img_id].set_xlabel("Actual price [$]")
+    axs[0, img_id].set_ylabel("Predicted price [$]")
+    axs[0, img_id].grid()
+    axs[0, img_id].set_title(f"{class_name} - coverage={cov:.0%}")
+
+    xmin, xmax = axs[0, img_id].get_xlim()
+    ymin, ymax = axs[0, img_id].get_ylim()
+    axs[1, img_id].scatter(y_test, int_width, marker="+")
+    axs[1, img_id].set_xlabel("Actual price [$]")
+    axs[1, img_id].set_ylabel("Prediction interval width [$]")
+    axs[1, img_id].grid()
+    axs[1, img_id].set_xlim([xmin, xmax])
+    axs[1, img_id].set_ylim([ymin, ymax])
+
+fig.suptitle(
+    f"Predicted values with prediction intervals of level {1 - alpha:.0%}"
+)
+plt.subplots_adjust(wspace=0.3, hspace=0.3)
+plt.show()
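The difference between the two conformity scores compared in this example can be sketched without MAPIE. The code below is a simplified, symmetric illustration and not the library's implementation: the function names and the toy calibration data are hypothetical, and the relative score assumes strictly positive predictions. The absolute score gives an interval of constant width, whereas the score divided by the predicted mean gives a width proportional to the prediction.

```python
import math


def empirical_quantile(values, level):
    """Smallest order statistic whose rank is at least level * n."""
    ordered = sorted(values)
    rank = min(len(ordered), max(1, math.ceil(level * len(ordered))))
    return ordered[rank - 1]


def absolute_interval(y_cal, pred_cal, pred_new, alpha):
    """Interval from absolute residuals |y - pred|: constant width."""
    scores = [abs(y - p) for y, p in zip(y_cal, pred_cal)]
    q = empirical_quantile(scores, 1 - alpha)
    return pred_new - q, pred_new + q


def relative_interval(y_cal, pred_cal, pred_new, alpha):
    """Interval from |y - pred| / pred: width proportional to pred_new."""
    # Assumes strictly positive predictions, as for house prices.
    scores = [abs(y - p) / p for y, p in zip(y_cal, pred_cal)]
    q = empirical_quantile(scores, 1 - alpha)
    return pred_new * (1 - q), pred_new * (1 + q)


# Toy calibration set with a 10% error on every point, for illustration only.
pred_cal = [100.0, 200.0, 400.0, 800.0]
y_cal = [110.0, 180.0, 440.0, 720.0]

for pred_new in (100.0, 1000.0):
    lo_abs, hi_abs = absolute_interval(y_cal, pred_cal, pred_new, alpha=0.25)
    lo_rel, hi_rel = relative_interval(y_cal, pred_cal, pred_new, alpha=0.25)
    print(pred_new, hi_abs - lo_abs, hi_rel - lo_rel)
```

With these toy numbers, the absolute-score width is the same at both predictions, while the relative-score width grows tenfold between them. This matches the behaviour shown in the bottom panels of the figure.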
@@ -24,7 +24,7 @@
 
 def f(x: NDArray) -> NDArray:
     """Polynomial function used to generate one-dimensional data."""
-    return np.array(5 * x + 5 * x ** 4 - 9 * x ** 2)
+    return np.array(5 * x + 5 * x**4 - 9 * x**2)
 
 
 # Generate data
 
@@ -4,7 +4,7 @@
 =======================================================
 This example uses :class:`mapie.regression.MapieRegressor` to estimate
 prediction intervals associated with time series forecast. We use the
-standard cross-validation approach to estimate residuals and associated
+standard cross-validation approach to estimate conformity scores and associated
 prediction intervals.
 
 We use here the Victoria electricity demand dataset used in the book
@@ -37,7 +37,7 @@
 
 from mapie.metrics import (
     regression_coverage_score,
-    regression_mean_width_score
+    regression_mean_width_score,
 )
 from mapie.regression import MapieRegressor
 
 
@@ -13,8 +13,8 @@
 A limitation of this method is that residuals used by MAPIE are computed on
 the validation dataset, which can be subject to overfitting as far as
 hyperparameter tuning is concerned.
-This fools MAPIE into being slightly too optimistic with confidence intervals.
 
+This fools MAPIE into being slightly too optimistic with confidence intervals.
 To solve this problem, an alternative option is to perform a nested
 cross-validation parameter search directly within the MAPIE estimator on each
 *out-of-fold* dataset.
@@ -39,7 +39,7 @@
 effective coverages.
 
 In the general case, the recommended approach is to use nested
-cross-validation, since it does not underestimate residuals and hence
+cross-validation, since it does not underestimate conformity scores and hence
 prediction intervals. However, in this particular example, effective
 coverages of both nested and non-nested methods are the same.
 """
 
@@ -161,7 +161,7 @@ def compute_PIs(
     method : str
         Method for estimating prediction intervals.
     cv : Any
-        Strategy for computing residuals.
+        Strategy for computing conformity scores.
     alpha : float
         1 - (target coverage level).
     agg_function: str