Skip to content

Conversation

@floriankozikowski
Copy link
Contributor

Context of the PR

This PR is a first step toward making GeneralizedLinearEstimator compatible with scikit-learn's GridSearchCV. (closes Issue #255 )

Disclaimer

  • This PR is (for now) intentionally minimal and is meant to serve as a discussion point (not to be merged)
  • A complete solution will require a more extensive refactor to ensure that all components (and their compiled forms) are compatible with scikit-learn's parameter management and estimator API.

Contributions of the PR

  • Implements a minimal version of get_params and set_params in GeneralizedLinearEstimator to support nested parameter access (e.g., penalty__alpha, solver__max_iter).
  • Allows GridSearchCV to set and search over nested parameters for the estimator, penalty, datafit, and solver.
  • Demonstrates that, while this enables basic grid search, it also exposes a range of errors and incompatibilities elsewhere in the codebase (see below).

Requirements of GridSearchCV:

  • scikit-learn's GridSearchCV expects all tunable parameters to be accessible via the estimator's get_params/set_params methods, including nested parameters (using the component__param syntax).
  • When GridSearchCV clones and sets parameters, the estimator and all its components must be able to accept these parameters and update their state accordingly.

Challenges in skglm:

  • Many skglm components (penalties, datafits, solvers) are compiled or wrapped in ways that break the scikit-learn parameter interface.
  • When new component instances are created (e.g., via set_params), they may lose methods or attributes required for fitting, or may not be properly compiled.
  • The current working draft works for the specific case of grid search, but breaks other parts of the codebase, as shown by failing tests (e.g., missing initialize methods on compiled classes).
  • Achieving full compatibility will probably require a more fundamental architectural change

Checks before merging PR

  • added documentation for any new feature
  • added unit tests
  • edited the what's new (if applicable)

@floriankozikowski
Copy link
Contributor Author

too large, changes too many things (look at #311 for newer version)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant