@Dr4k3z Dr4k3z commented Oct 22, 2025

While following the instructions on how to define a custom estimator within the scikit-learn framework, I went through this repo.

Although I found it quite useful, I think it could be updated to be in line with the most recent scikit-learn versions. I believe the code is currently compatible with scikit-learn 1.4.2. For instance, I noticed it uses an old API for data validation, as here:

# `_validate_data` is defined in the `BaseEstimator` class.
# It allows to:
# - run different checks on the input data;
# - define some attributes associated to the input data: `n_features_in_` and
# `feature_names_in_`.
X, y = self._validate_data(X, y, accept_sparse=True)

According to this, since version 1.6.0 the validate_data API is no longer a method of the sklearn.base.BaseEstimator class; it is a function in the sklearn.utils.validation module instead.
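To make the difference concrete, here is a minimal sketch of the newer call style. `SketchEstimator` is a hypothetical class for illustration only, not the repo's `TemplateEstimator`, and the `try`/`except` fallback is my own assumption about how one might keep older scikit-learn releases working; it is not part of this PR.

```python
import numpy as np
from sklearn.base import BaseEstimator

try:
    # scikit-learn >= 1.6: validation is a public function.
    from sklearn.utils.validation import validate_data
except ImportError:
    # Older releases (assumption): delegate to the private BaseEstimator method.
    def validate_data(estimator, X="no_validation", y="no_validation", **kwargs):
        return estimator._validate_data(X, y, **kwargs)


class SketchEstimator(BaseEstimator):
    """Hypothetical estimator, used only to illustrate the call."""

    def fit(self, X, y):
        # The estimator is now passed explicitly as the first argument.
        X, y = validate_data(self, X, y, accept_sparse=True)
        self.is_fitted_ = True
        return self


X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
y = np.array([0, 1, 0])
est = SketchEstimator().fit(X, y)
print(est.n_features_in_)  # number of columns seen during fit
```

As with the old method, the call still sets `n_features_in_` on the estimator; the only visible change at the call site is that `self` becomes an explicit argument.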
In this PR I made this small change to the _template.py source code. I've briefly run the unit tests, and apart from the following failures

FAILED tests/test_common.py::test_estimators[TemplateEstimator()-check_estimator_sparse_tag] - AssertionError: Estimator TemplateEstimator didn't fail when fitted on sparse data but should have according to its tag self.input_tags.sparse=False. The t...
FAILED tests/test_common.py::test_estimators[TemplateTransformer()-check_estimator_tags_renamed] - TypeError: Estimator TemplateTransformer has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need t...
FAILED tests/test_common.py::test_estimators[TemplateTransformer()-check_estimator_sparse_tag] - AssertionError: Estimator TemplateTransformer didn't fail when fitted on sparse data but should have according to its tag self.input_tags.sparse=False. The...
FAILED tests/test_common.py::test_estimators[TemplateTransformer()-check_transformers_unfitted] - AssertionError: The unfitted transformer TemplateTransformer does not raise an error when transform is called. Perhaps use check_is_fitted in transform.

the other 150 test cases seem to pass.
