This repository was archived by the owner on Jul 24, 2024. It is now read-only.
Inaccurate porting of covariance vs naive method #82
Open
Description
python-glmnet/glmnet/linear.py
Lines 288 to 293 in 813c06f
    if X.shape[1] > X.shape[0]:
        # the glmnet docs suggest using a different algorithm for the case
        # of p >> n
        algo_flag = 2
    else:
        algo_flag = 1
glmnet actually does a slightly different check than a plain "n" vs "p" comparison like this: it invokes method 1 (the covariance method) if p <= 500. The covariance method keeps track of a matrix of covariances C(i,j) for every feature i and every active feature j. Under the hood, C is allocated as a p x p matrix (even though usually far less of it is actually used); this was done for simplicity, because it's very hard to write clever data structures in Fortran. So even when n >> p, if p is itself large, the covariance method is not a viable default on most machines.
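To put rough numbers on the memory concern, here is a back-of-the-envelope sketch. The helper name `covariance_buffer_bytes` is made up for illustration, and the 8 bytes per entry is an assumption (double precision); the real allocation inside the Fortran code may differ in detail.

```python
def covariance_buffer_bytes(n_features: int, bytes_per_entry: int = 8) -> int:
    """Size in bytes of a dense p x p buffer.

    bytes_per_entry=8 assumes double-precision entries; that is an
    assumption for this estimate, not something read from the Fortran source.
    """
    return n_features * n_features * bytes_per_entry


# Print the estimated buffer size for a few feature counts.
for p in (500, 5_000, 50_000):
    megabytes = covariance_buffer_bytes(p) / 1e6
    print(f"p = {p:>6}: about {megabytes:,.0f} MB")
```

Under that assumption the buffer is about 2 MB at p = 500 but about 20 GB at p = 50,000, regardless of how large n is.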
Anyways, I'd suggest changing to
    if X.shape[1] <= 500:
        algo_flag = 1
    else:
        algo_flag = 2
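For comparison, the two rules can be put side by side to see where they disagree. This is only a sketch: `current_flag` and `suggested_flag` are hypothetical helper names, not functions in python-glmnet, and the threshold of 500 is the one described in this issue.

```python
def current_flag(n_samples: int, n_features: int) -> int:
    """Rule currently in linear.py: naive method (2) only when p > n."""
    return 2 if n_features > n_samples else 1


def suggested_flag(n_features: int, max_cov_features: int = 500) -> int:
    """Rule suggested in this issue: covariance method (1) only when p <= 500."""
    return 1 if n_features <= max_cov_features else 2


# (n_samples, n_features) shapes chosen to show agreement and disagreement.
for n, p in [(1_000, 50), (100, 200), (100_000, 10_000)]:
    print(f"n={n:>7}, p={p:>6}: current={current_flag(n, p)}, "
          f"suggested={suggested_flag(p)}")
```

The rules agree on the small case, but for n = 100,000 and p = 10,000 the current check picks the covariance method, which is exactly the situation where the p x p allocation becomes a problem.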