
Commit 1e6a81f

StefanieSenger, virchan, and thomasjpfan authored
DOC fix link in HuberRegressor docstring (scikit-learn#30417)
Co-authored-by: Virgil Chan <[email protected]>
Co-authored-by: Thomas J. Fan <[email protected]>
1 parent f77ff4e commit 1e6a81f

6 files changed: 21 additions, 21 deletions

doc/modules/linear_model.rst

Lines changed: 10 additions & 10 deletions
```diff
@@ -1585,10 +1585,10 @@ better than an ordinary least squares in high dimension.
 Huber Regression
 ----------------
 
-The :class:`HuberRegressor` is different to :class:`Ridge` because it applies a
-linear loss to samples that are classified as outliers.
+The :class:`HuberRegressor` is different from :class:`Ridge` because it applies a
+linear loss to samples that are defined as outliers by the `epsilon` parameter.
 A sample is classified as an inlier if the absolute error of that sample is
-lesser than a certain threshold. It differs from :class:`TheilSenRegressor`
+lesser than the threshold `epsilon`. It differs from :class:`TheilSenRegressor`
 and :class:`RANSACRegressor` because it does not ignore the effect of the outliers
 but gives a lesser weight to them.
 
@@ -1603,13 +1603,13 @@ but gives a lesser weight to them.
 
 .. dropdown:: Mathematical details
 
-  The loss function that :class:`HuberRegressor` minimizes is given by
+  :class:`HuberRegressor` minimizes
 
   .. math::
 
    \min_{w, \sigma} {\sum_{i=1}^n\left(\sigma + H_{\epsilon}\left(\frac{X_{i}w - y_{i}}{\sigma}\right)\sigma\right) + \alpha {||w||_2}^2}
 
-  where
+  where the loss function is given by
 
   .. math::
 
@@ -1624,7 +1624,7 @@ but gives a lesser weight to them.
 .. rubric:: References
 
 * Peter J. Huber, Elvezio M. Ronchetti: Robust Statistics, Concomitant scale
-  estimates, pg 172
+  estimates, p. 172.
 
 The :class:`HuberRegressor` differs from using :class:`SGDRegressor` with loss set to `huber`
 in the following ways.
@@ -1638,10 +1638,10 @@ in the following ways.
 samples while :class:`SGDRegressor` needs a number of passes on the training data to
 produce the same robustness.
 
-Note that this estimator is different from the R implementation of Robust Regression
-(https://stats.oarc.ucla.edu/r/dae/robust-regression/) because the R implementation does a weighted least
-squares implementation with weights given to each sample on the basis of how much the residual is
-greater than a certain threshold.
+Note that this estimator is different from the `R implementation of Robust
+Regression <https://stats.oarc.ucla.edu/r/dae/robust-regression/>`_ because the R
+implementation does a weighted least squares implementation with weights given to each
+sample on the basis of how much the residual is greater than a certain threshold.
 
 .. _quantile_regression:
 
```
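To ground the rewording, here is a minimal sketch of the behaviour the passage now describes; it is not part of the commit, and the data and parameter values are invented for illustration:

```python
# Samples whose scaled absolute error exceeds `epsilon` get a linear rather
# than quadratic loss, so HuberRegressor is pulled less by outliers than Ridge.
import numpy as np
from sklearn.linear_model import HuberRegressor, Ridge

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = 2.0 * X.ravel() + rng.normal(scale=0.3, size=50)
y[:5] += 20.0  # a few gross outliers

huber = HuberRegressor(epsilon=1.35).fit(X, y)  # 1.35 is the default threshold
ridge = Ridge(alpha=1.0).fit(X, y)

# Huber's slope stays near 2, while Ridge is dragged upward by the outliers.
print(huber.coef_, ridge.coef_)
# The boolean mask `outliers_` marks the samples handled with the linear loss.
print(huber.outliers_.sum(), "samples treated as outliers")
```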

doc/modules/model_evaluation.rst

Lines changed: 1 addition & 1 deletion
```diff
@@ -2543,7 +2543,7 @@ Here is a small example of usage of the :func:`mean_absolute_error` function::
 Mean squared error
 -------------------
 
-The :func:`mean_squared_error` function computes `mean square
+The :func:`mean_squared_error` function computes `mean squared
 error <https://en.wikipedia.org/wiki/Mean_squared_error>`_, a risk
 metric corresponding to the expected value of the squared (quadratic) error or
 loss.
```
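For reference, the corrected term in use; the values are illustrative:

```python
# Mean squared error is the average of the squared residuals.
from sklearn.metrics import mean_squared_error

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
# (0.5**2 + 0.5**2 + 0.0**2 + 1.0**2) / 4 = 0.375
print(mean_squared_error(y_true, y_pred))  # 0.375
```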

examples/linear_model/plot_robust_fit.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -5,7 +5,7 @@
 Here a sine function is fit with a polynomial of order 3, for values
 close to zero.
 
-Robust fitting is demoed in different situations:
+Robust fitting is demonstrated in different situations:
 
 - No measurement errors, only modelling errors (fitting a sine with a
   polynomial)
```
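A condensed sketch of the setup the docstring describes; the data, noise level, and the choice of HuberRegressor are stand-ins, while the real example compares several robust estimators and plots the fits:

```python
# Fit a degree-3 polynomial to noisy sine data with a robust estimator.
import numpy as np
from sklearn.linear_model import HuberRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(42)
X = rng.normal(size=(40, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=40)
y[-4:] = -3.0  # corrupt a few targets to simulate measurement outliers

model = make_pipeline(PolynomialFeatures(degree=3), HuberRegressor())
model.fit(X, y)
print(model.score(X, y))  # R^2 on the (contaminated) training data
```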

examples/model_selection/plot_roc.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -159,7 +159,7 @@
 # %%
 # In a multi-class classification setup with highly imbalanced classes,
 # micro-averaging is preferable over macro-averaging. In such cases, one can
-# alternatively use a weighted macro-averaging, not demoed here.
+# alternatively use a weighted macro-averaging, not demonstrated here.
 
 display = RocCurveDisplay.from_predictions(
     y_onehot_test.ravel(),
```
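For context, a sketch of the three averaging modes the comment contrasts. The dataset and classifier are invented stand-ins; the micro-average is computed by pooling binarized (sample, class) pairs, the same trick the example itself uses:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Imbalanced three-class problem.
X, y = make_classification(
    n_samples=1000, n_classes=3, n_informative=6,
    weights=[0.8, 0.15, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_score = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)
y_onehot = label_binarize(y_test, classes=[0, 1, 2])

# Micro-average: pool all (sample, class) pairs; frequent classes dominate.
print("micro   ", roc_auc_score(y_onehot.ravel(), y_score.ravel()))
# Macro-average: unweighted mean of per-class AUCs.
print("macro   ", roc_auc_score(y_test, y_score, multi_class="ovr", average="macro"))
# Weighted macro-average: per-class AUCs weighted by class prevalence.
print("weighted", roc_auc_score(y_test, y_score, multi_class="ovr", average="weighted"))
```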

examples/preprocessing/plot_scaling_importance.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -12,13 +12,13 @@
 algorithms require features to be normalized, often for different reasons: to
 ease the convergence (such as a non-penalized logistic regression), to create a
 completely different model fit compared to the fit with unscaled data (such as
-KNeighbors models). The latter is demoed on the first part of the present
+KNeighbors models). The latter is demonstrated on the first part of the present
 example.
 
 On the second part of the example we show how Principal Component Analysis (PCA)
 is impacted by normalization of features. To illustrate this, we compare the
 principal components found using :class:`~sklearn.decomposition.PCA` on unscaled
-data with those obatined when using a
+data with those obtained when using a
 :class:`~sklearn.preprocessing.StandardScaler` to scale data first.
 
 In the last part of the example we show the effect of the normalization on the
```
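A minimal sketch of the comparison the text sets up, using the wine data as a stand-in; the assumption is that any dataset with wildly different feature scales shows the same effect:

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_wine(return_X_y=True)

# Without scaling, the leading component is dominated by whichever feature
# has the largest raw variance; with scaling, variance is spread out.
pca_raw = PCA(n_components=2).fit(X)
pca_scaled = make_pipeline(StandardScaler(), PCA(n_components=2)).fit(X)

print(pca_raw.explained_variance_ratio_)
print(pca_scaled[-1].explained_variance_ratio_)  # the PCA step of the pipeline
```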

sklearn/linear_model/_huber.py

Lines changed: 6 additions & 6 deletions
```diff
@@ -132,10 +132,10 @@ class HuberRegressor(LinearModel, RegressorMixin, BaseEstimator):
     ``|(y - Xw - c) / sigma| < epsilon`` and the absolute loss for the samples
     where ``|(y - Xw - c) / sigma| > epsilon``, where the model coefficients
     ``w``, the intercept ``c`` and the scale ``sigma`` are parameters
-    to be optimized. The parameter sigma makes sure that if y is scaled up
-    or down by a certain factor, one does not need to rescale epsilon to
+    to be optimized. The parameter `sigma` makes sure that if `y` is scaled up
+    or down by a certain factor, one does not need to rescale `epsilon` to
     achieve the same robustness. Note that this does not take into account
-    the fact that the different features of X may be of different scales.
+    the fact that the different features of `X` may be of different scales.
 
     The Huber loss function has the advantage of not being heavily influenced
     by the outliers while not completely ignoring their effect.
@@ -219,9 +219,9 @@ class HuberRegressor(LinearModel, RegressorMixin, BaseEstimator):
     References
     ----------
     .. [1] Peter J. Huber, Elvezio M. Ronchetti, Robust Statistics
-           Concomitant scale estimates, pg 172
-    .. [2] Art B. Owen (2006), A robust hybrid of lasso and ridge regression.
-           https://statweb.stanford.edu/~owen/reports/hhu.pdf
+           Concomitant scale estimates, p. 172
+    .. [2] Art B. Owen (2006), `A robust hybrid of lasso and ridge regression.
+           <https://artowen.su.domains/reports/hhu.pdf>`_
 
     Examples
     --------
```