Getting Started
===============

This page provides a starter example that introduces the ``rehline`` package and showcases its primary features.

To proceed, ensure that you have already installed ``rehline``:

.. code:: bash

    pip install rehline

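If the installation succeeded, the package should import without errors. As a quick, optional check (the ``__version__`` attribute is assumed to exist here):

.. code:: python

    # Optional sanity check: an error-free import means the install worked
    import rehline
    print(rehline.__version__)   # assumes the package exposes a __version__ attribute
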
--------------------------------

``rehline`` is a versatile solver for machine learning problems, particularly effective for Empirical Risk Minimization (ERM) with *non-smooth* objectives. We will use ERM as our running example to demonstrate the following:

.. admonition:: Note
   :class: tip

   With ``rehline``, you can easily swap in different *loss functions* and add *constraints* to your ERM with no tears!

Let's begin by generating a toy dataset and splitting it into training and test sets using scikit-learn's ``make_regression``.

.. code:: python

    # Import necessary libraries
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    np.random.seed(1024)
    # Generate toy data
    n, d = 1000, 5
    scaler = StandardScaler()
    X, y = make_regression(n_samples=n, n_features=d, noise=1.0)
    # Standardize X and append an intercept column
    X = scaler.fit_transform(X)
    X = np.hstack((X, np.ones((n, 1))))

    # Split data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=50)

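The resulting design matrix has :math:`d + 1 = 6` columns: the five standardized features plus the appended intercept column.

.. code:: python

    # 950 training rows, 50 test rows, and 6 columns after adding the intercept
    print(X_train.shape, X_test.shape)   # (950, 6) (50, 6)
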
Quantile Regression
-------------------

Next, let's use ``rehline`` to fit a quantile regression (QR) at quantile level 0.95 (:math:`\kappa = 0.95`).

The ridge-regularized QR solves the following optimization problem:

.. math::

    \min_{\beta \in \mathbb{R}^{d}} \ C \sum_{i=1}^n \rho_\kappa ( y_i - x_i^\intercal \beta ) + \frac{1}{2} \| \beta \|^2,

where :math:`\rho_\kappa(u) = u \cdot (\kappa - \mathbf{1}(u < 0))` is the *check loss*, :math:`x_i \in \mathbb{R}^d` is a feature vector, and :math:`y_i \in \mathbb{R}` is the response variable.

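For intuition, the check loss can be written out directly from this definition. The small helper below is purely illustrative and not part of the ``rehline`` API:

.. code:: python

    # Check loss from the definition above: rho_kappa(u) = u * (kappa - 1(u < 0))
    def check_loss(u, kappa=0.95):
        return u * (kappa - (u < 0))

    print(check_loss(1.0))    # positive residual: 1.0 * 0.95 = 0.95
    print(check_loss(-1.0))   # negative residual: -1.0 * (0.95 - 1.0) = 0.05
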
Since the check loss is a piecewise linear-quadratic (PLQ) function, the problem can be solved with ``rehline.plqERM_Ridge``:

.. code:: python

    from rehline import plqERM_Ridge

    # Define a QR estimator at quantile level 0.95
    clf = plqERM_Ridge(loss={'name': 'QR', 'qt': 0.95}, C=1.0)
    clf.fit(X=X_train, y=y_train)
    # Make predictions
    q_predict = clf.decision_function(X_test)

    # Plot results
    import matplotlib.pyplot as plt
    plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
    plt.scatter(x=X_test[:, 0], y=q_predict, alpha=0.5, label='q_95')
    plt.legend(loc="upper left")
    plt.show()

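As a quick sanity check, the fraction of test responses falling below the predicted 0.95-quantile should be roughly 0.95:

.. code:: python

    # Empirical coverage of the fitted 0.95-quantile on the test set;
    # a value near 0.95 indicates a reasonable fit
    coverage = np.mean(y_test <= q_predict)
    print(f"Empirical coverage: {coverage:.2f}")
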
Huber Regression
----------------

If you prefer Huber regression, the Huber loss is also a PLQ function, so the same solver applies.

The ridge-regularized Huber minimization solves the following optimization problem:

.. math::

    \min_{\beta} \ C \sum_{i=1}^n H_\kappa( y_i - x_i^\intercal \beta ) + \frac{1}{2} \| \beta \|_2^2,

where :math:`H_\kappa(\cdot)` is the Huber loss, defined as:

.. math::

    H_\kappa(z) =
    \begin{cases}
    z^2/2, & |z| \leq \kappa, \\
    \kappa ( |z| - \kappa/2 ), & |z| > \kappa.
    \end{cases}

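For intuition, here is the Huber loss written directly from the piecewise definition above (an illustrative helper, not part of the ``rehline`` API):

.. code:: python

    # Huber loss from the piecewise definition above; illustrative only
    def huber_loss(z, kappa=0.5):
        z = np.abs(z)
        return np.where(z <= kappa, z**2 / 2, kappa * (z - kappa / 2))

    print(huber_loss(0.2))   # quadratic branch: 0.2**2 / 2 = 0.02
    print(huber_loss(2.0))   # linear branch: 0.5 * (2.0 - 0.25) = 0.875

Fitting the ridge-regularized Huber regression with ``rehline`` mirrors the QR example above; here the ``tau`` entry of the loss dictionary is taken to play the role of :math:`\kappa`:
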
.. code:: python

    from rehline import plqERM_Ridge

    # Define a Huber estimator
    clf = plqERM_Ridge(loss={'name': 'huber', 'tau': 0.5}, C=1.0)
    clf.fit(X=X_train, y=y_train)
    # Make predictions
    y_huber = clf.decision_function(X_test)

    # Plot results
    import matplotlib.pyplot as plt
    plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
    plt.scatter(x=X_test[:, 0], y=y_huber, alpha=0.5, label='y_huber')
    plt.legend(loc="upper left")
    plt.show()

Fairness Constraints
--------------------

Now suppose the fitted Huber regression must also satisfy a fairness constraint with respect to the first feature :math:`\mathbf{X}_{1}`: the correlation between the prediction :math:`\hat{Y}` and :math:`\mathbf{X}_{1}` must stay below ``tol_sen=0.1``. That is, we solve

.. math::

    \min_{\beta} \ C \sum_{i=1}^n H_\kappa( y_i - x_i^\intercal \beta ) + \frac{1}{2} \| \beta \|_2^2, \quad \text{s.t.} \quad \Big| \frac{1}{n} \sum_{i=1}^n z_i x_i^\intercal \beta \Big| \leq \rho,

where :math:`z_i` is the sensitive feature (here the first feature of :math:`x_i`) and :math:`\rho` is the tolerance.

With ``rehline``, you can easily add such a fairness constraint to your ERM:

.. code:: python

    from rehline import plqERM_Ridge
    from scipy.stats import pearsonr

    # Define a Huber estimator with a fairness constraint on the first feature
    clf = plqERM_Ridge(loss={'name': 'huber', 'tau': 0.5},
                       constraint=[{'name': 'fair', 'X_sen': X_train[:, 0], 'tol_sen': 0.1}],
                       C=1.0,
                       max_iter=10000)
    clf.fit(X=X_train, y=y_train)
    # Make predictions
    y_huber_fair = clf.decision_function(X_test)

    # Plot results
    import matplotlib.pyplot as plt
    plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
    plt.scatter(x=X_test[:, 0], y=y_huber, alpha=0.5, label='y_huber')
    plt.scatter(x=X_test[:, 0], y=y_huber_fair, alpha=0.5, label='y_huber_fair')
    plt.legend(loc="upper left")
    plt.show()

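As a rough check (only indicative, since the constraint is imposed on the training data), you can compare how strongly each fit correlates with the sensitive feature on the test set:

.. code:: python

    # Correlation between the sensitive feature and each set of predictions;
    # the constrained fit is expected to show a weaker correlation
    print(pearsonr(X_test[:, 0], y_huber))
    print(pearsonr(X_test[:, 0], y_huber_fair))
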
.. nblinkgallery::
    :caption: Related Examples
    :name: rst-link-gallery

    examples/QR.ipynb