Skip to content

Conversation

@dennisobrien
Copy link
Contributor

Fixes #460

Proposed Changes

  • This fixes a bug where a TargetEncoder passed a non-null hierarchy object would throw an exception when used in a scikit-learn ColumnTransformer.
  • This moves some logic and validation from the TargetEncoder __init__ method to the fit method in order to pass validation tests in the scikit-learn clone method.
  • Added a test for this scenario: test_hierarchy_with_scikit_learn_column_transformer.

@PaulWestenthanner
Copy link
Collaborator

PaulWestenthanner commented Oct 31, 2025

This fix looks great, code is cleaner than before, test is also nice! Great job and sorry again for having you wait so long.
Can you please also rebase with the current master. I've updated some sphinx versions in order for the CI pipeline to work, it somehow broke down with the old versions.

… that clone the object and validate that the parameters are unchanged.

This moves the validation of hierarchy and the generation of an inverted hiearchy to the helper function _generate_inverted_hierarchy.
This function is called in __init__ only to keep compatibility with existing tests by throwing an exception from the constructure rather than waiting for the call to fit.
We create a variable inverted_hierarchy rather than overwriting self.hierarchy in order for the clone method to pass validation.
@dennisobrien dennisobrien force-pushed the issue-460-fix_targetencoder branch from ac4bc92 to 106af76 Compare November 1, 2025 18:29
Removed ToDo comment based on feedback.
Added type hint to test_hierarchy_with_scikit_learn_column_transformer.
@dennisobrien
Copy link
Contributor Author

This fix looks great, code is cleaner than before, test is also nice! Great job and sorry again for having you wait so long. Can you please also rebase with the current master. I've updated some sphinx versions in order for the CI pipeline to work, it somehow broke down with the old versions.

Thanks for the feedback. And no problem about the timeline. I appreciate your contributions in creating and maintaining this project.

I made the changes you mentioned and followed up in each of the comments.

Let me know if there is anything else you'd like me to change in this.

@PaulWestenthanner
Copy link
Collaborator

LGTM & pipeline passing. I'm merging

@PaulWestenthanner PaulWestenthanner merged commit 370e4b8 into scikit-learn-contrib:master Nov 2, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TargetEncoder cannot be cloned by scikit-learn when hierarchy is passed

2 participants