
Conversation

@PascalCarrivain
Contributor

Context of the PR

This PR adds support for an intercept in the SqrtLasso estimator.
Closes #96

Checks before merging PR

  • added documentation for any new feature
  • added unittests
  • edited the what's new

@PascalCarrivain PascalCarrivain changed the title fix add support for intercept in SqrtLasso ENH - add support for intercept in SqrtLasso Dec 14, 2023
Collaborator

@Badr-MOUFAD Badr-MOUFAD left a comment


Thanks @PascalCarrivain for the PR!
Here are some remarks

Collaborator

@mathurinm mathurinm left a comment


@Badr-MOUFAD merge if happy!

Collaborator

@Badr-MOUFAD Badr-MOUFAD left a comment


Many thanks @PascalCarrivain for the hard work 💪

Just a small bit before merging

We didn't add tests for the intercept (we might be vulnerable to bugs later)

I'm not aware of any package to check the new feature against; any suggestions @mathurinm?

With that being said, we can leverage the equivalence between the problems with and without an intercept: since the datafit is an $\ell_2$ norm, setting its gradient w.r.t. the intercept to zero amounts to making the residuals sum to zero

$$ \min_{w, w_0} \| y - Xw - w_0 \mathbb{1} \| + \lambda \| w \|_1 \Longleftrightarrow \begin{cases} \min_w \ \| \bar{y} - \bar{X}w \| + \lambda \| w \|_1 \\ w_0 = \text{mean}(y) - \langle \text{mean\_col}(X), \hat{w} \rangle \end{cases} $$

where $\bar{y}$ and $\bar{X}$ denote the centered target and design matrix, and $\hat{w}$ is the solution of the centered problem.
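The centering identity above can be sanity-checked numerically. Here is a minimal sketch using scikit-learn's Lasso instead of SqrtLasso (an assumption for illustration only: the same centering argument applies to any datafit whose intercept gradient vanishes exactly when the residuals sum to zero, which holds for the squared loss as well):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 10))
y = rng.standard_normal(30) + 2.0  # shift y so the intercept matters

# Solve the problem with an intercept.
with_int = Lasso(alpha=0.1, fit_intercept=True, tol=1e-10).fit(X, y)

# Solve the centered problem without an intercept.
X_bar = X - X.mean(axis=0)
y_bar = y - y.mean()
no_int = Lasso(alpha=0.1, fit_intercept=False, tol=1e-10).fit(X_bar, y_bar)

# Same coefficients, and the intercept is given by the closed form above.
assert np.allclose(with_int.coef_, no_int.coef_, atol=1e-6)
w0 = y.mean() - X.mean(axis=0) @ no_int.coef_
assert np.isclose(with_int.intercept_, w0, atol=1e-6)
```

A unit test along these lines, with SqrtLasso in place of Lasso, would cover the intercept code path.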

Comment on lines +19 to +22
if sqrt_lasso.fit_intercept:
np.testing.assert_equal(sqrt_lasso.coef_[:-1], 0)
else:
np.testing.assert_equal(sqrt_lasso.coef_, 0)
Collaborator

WDYT about this refactoring?

Suggested change
if sqrt_lasso.fit_intercept:
np.testing.assert_equal(sqrt_lasso.coef_[:-1], 0)
else:
np.testing.assert_equal(sqrt_lasso.coef_, 0)
np.testing.assert_equal(sqrt_lasso.coef_[:n_features], 0)

PascalCarrivain added a commit to PascalCarrivain/skglm that referenced this pull request Mar 18, 2024
@mathurinm
Collaborator

@PascalCarrivain thanks for bringing this up again; there's a merge conflict that needs to be resolved

Also @Badr-MOUFAD, how could we test against a Lasso solver? Do you remember a formula linking the regularization strength of the Lasso to that of the sqrt Lasso?
Should we, on some particular data, find by bisection two values of lambda that give the same solution for Lasso and SqrtLasso, and then hardcode them in a test? That sounds good enough to me; there hasn't been a crushing pressure for this feature...

@Badr-MOUFAD
Collaborator

Badr-MOUFAD commented Apr 7, 2025

Yes indeed @mathurinm, there is an equivalence between SqrtLasso and Lasso,
one apparent when $y \notin \mathrm{range}(X)$.

If we solve SqrtLasso with $\lambda$ and get solution $\hat{\beta}$,
then the equivalent regularization for Lasso that yields the same solution should be $\lambda \lVert y - X \hat{\beta} \rVert$.

The latter should be divided by n_samples, as in Lasso we optimize the normalized objective whereas in SqrtLasso we don't.

@mathurinm
Collaborator

mathurinm commented Apr 16, 2025

Using this works:
$$\lambda_{sqrt} = \frac{n \lambda_{lasso} }{\Vert y - X\hat \beta_{lasso} \Vert}$$

import numpy as np
from numpy.linalg import norm
from skglm import Lasso
from skglm.experimental import SqrtLasso

np.random.seed(0)
X = np.random.randn(10, 20)
y = np.random.randn(10)

n = len(y)

alpha_max = norm(X.T @ y, ord=np.inf) / n

alpha = alpha_max / 10

lass = Lasso(alpha=alpha, fit_intercept=False, tol=1e-8).fit(X, y)
w_lass = lass.coef_
assert norm(w_lass) > 0

scal = n / norm(y - X @ w_lass)

sqrt = SqrtLasso(alpha=alpha * scal, tol=1e-8).fit(X, y)

print(norm(w_lass - sqrt.coef_))  # this is 1e-8

This is not the usual equivalence (one would expect a $\sqrt{n}$, not an $n$, so that $\lambda_{sqrt} = \lambda_{lasso} / \hat{\sigma}$), but that is because our SqrtLasso objective does not have the $1/\sqrt{n}$ normalization in front of the datafit

@floriankozikowski when an intercept is fitted the scaling factor should just be scal = n / norm(y - sqrt.predict(X))

@mathurinm
Collaborator

Ok this works on the PR:

import numpy as np
from numpy.linalg import norm
from skglm import Lasso
from skglm.experimental import SqrtLasso

np.random.seed(0)
X = np.random.randn(10, 20)
y = np.random.randn(10)
y += 1

n = len(y)

alpha_max = norm(X.T @ y, ord=np.inf) / n

alpha = alpha_max / 10

lass = Lasso(alpha=alpha, fit_intercept=True, tol=1e-8).fit(X, y)
w_lass = lass.coef_
assert norm(w_lass) > 0

scal = n / norm(y - lass.predict(X))

sqrt = SqrtLasso(alpha=alpha * scal, fit_intercept=True, tol=1e-8).fit(X, y)

but we have a problem: sqrt.coef_ is one coefficient too long, because we still store the intercept as the last value. This is easy to fix @PascalCarrivain

self.coef_ = self.path(X, y, alphas=[self.alpha])[1][0]
self.intercept_ = 0.  # TODO handle fit_intercept
if self.fit_intercept:
    self.intercept_ = self.coef_[-1]
Collaborator

@PascalCarrivain you also need to make self.coef_ shorter by one element in that case, otherwise it still contains the intercept as its last value
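A minimal sketch of the intended split, using a hypothetical `split_coef` helper (not part of skglm) to mimic the slicing `fit` would do on the extended solution vector:

```python
import numpy as np

def split_coef(w_ext, fit_intercept):
    # Hypothetical helper: when an intercept is fitted, the solver returns
    # an extended vector whose last entry is the intercept.
    if fit_intercept:
        return w_ext[:-1], w_ext[-1]
    return w_ext, 0.0

# Extended solution for 3 features plus an intercept of 3.0.
w_ext = np.array([1.0, -2.0, 0.5, 3.0])
coef, intercept = split_coef(w_ext, fit_intercept=True)
print(coef.shape, intercept)  # → (3,) 3.0
```

In `fit`, that amounts to setting `self.intercept_` from the last entry and then dropping it, so that `self.coef_` has exactly `n_features` entries.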

mathurinm and others added 27 commits April 22, 2025 19:22
…earn-contrib#137)

Co-authored-by: Badr-MOUFAD, Quentin Bertrand, mathurinm
…if correct, clean up, pot. add support for sparse X (not sure if that works), enhance docstring
Collaborator

@floriankozikowski remove these 3 files

@mathurinm
Collaborator

This has become hard to read due to git errors; closing in favor of the restarted PR #298

@mathurinm mathurinm closed this Apr 23, 2025
Development

Successfully merging this pull request may close these issues.

ENH - Add support for intercept in SqrtLasso