Commit 6b394db

Use new sklearn tagging concept. fixes #448

Author: PaulWestenthanner
1 parent 2a2e1a1 commit 6b394db

22 files changed: 225 additions, 190 deletions

.github/workflows/docs.yml

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ jobs:
       - name: Directly build docs
         run: |
           pip install -r docs/requirements.txt
-          sphinx-build -D docs/source ./docs/build/html/
+          sphinx-build docs/source ./docs/build/html/
       - name: Deploy Docs
         uses: peaceiris/actions-gh-pages@v3
         with:

CHANGELOG.md

Lines changed: 7 additions & 1 deletion
@@ -1,5 +1,11 @@
+v.2.8.0
+=======
+
+* Fix: Support new concept of sklearn tags, now requiring sklearn >= 1.6.0
+* Fix: Docs deployment
+
 v.2.7.0
-==========
+=======

 * Refactor: Use poetry as packaging tool
 * Refactor: Add more typing

CONTRIBUTING.md

Lines changed: 8 additions & 2 deletions
@@ -35,19 +35,25 @@ The preferred workflow to contribute to git-pandas is:
 Guidelines
 ==========

-This is still a very young project, but we do have a few guiding principles:

 1. Maintain semantics of the scikit-learn API
 2. Write detailed docstrings in numpy format
 3. Support pandas dataframes and numpy arrays as inputs
 4. Write tests

+Styleguide:
+
+We're using ruff for linting. Rules are implemented in the `pyproject.toml` file. To run the linter, use:
+
+$ poetry run ruff check category_encoders --fix
+
+
 Running Tests
 =============

 To run the tests, use:

-$ pytest
+$ poetry run pytest tests/

 Easy Issues / Getting Started
 =============================

category_encoders/base_contrast_encoder.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 __author__ = 'paulwestenthanner'


-class BaseContrastEncoder(util.BaseEncoder, util.UnsupervisedTransformerMixin):
+class BaseContrastEncoder(util.UnsupervisedTransformerMixin, util.BaseEncoder):
     """Base class for various contrast encoders.

     Parameters
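
Note on the base-class reordering above: putting the mixin first and `util.BaseEncoder` last matters whenever a method is resolved cooperatively through `super()`, as `super().__sklearn_tags__()` is elsewhere in this commit. The following is only a minimal sketch of that effect. The classes are hypothetical stand-ins, not the library's real ones, and it assumes the mixin delegates to `super()` while the base class ends the chain.

```python
from types import SimpleNamespace


class BaseEncoder:
    """Hypothetical stand-in: ends the super() chain and creates the tags."""

    def __sklearn_tags__(self):
        return SimpleNamespace(supervised=False)


class SupervisedTransformerMixin:
    """Hypothetical stand-in: refines tags created later in the MRO."""

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.supervised = True
        return tags


class MixinFirst(SupervisedTransformerMixin, BaseEncoder):
    """Ordering used after this commit: the mixin's override runs."""


class BaseFirst(BaseEncoder, SupervisedTransformerMixin):
    """Ordering used before this commit: the mixin's override is never reached."""


mixin_first_tags = MixinFirst().__sklearn_tags__()
base_first_tags = BaseFirst().__sklearn_tags__()
print(mixin_first_tags.supervised, base_first_tags.supervised)
```

With the base class listed first, its method is found first in the method resolution order and returns without calling `super()`, so the mixin's adjustment is silently skipped.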

category_encoders/basen.py

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ def _ceillogint(n, base):
     return ret


-class BaseNEncoder(util.BaseEncoder, util.UnsupervisedTransformerMixin):
+class BaseNEncoder( util.UnsupervisedTransformerMixin,util.BaseEncoder):
     """Base-N encoder encodes the categories into arrays of their base-N representation.

     A base of 1 is equivalent to one-hot encoding (not really base-1, but useful),

category_encoders/cat_boost.py

Lines changed: 4 additions & 4 deletions
@@ -9,7 +9,7 @@
 __author__ = 'Jan Motl'


-class CatBoostEncoder(util.BaseEncoder, util.SupervisedTransformerMixin):
+class CatBoostEncoder(util.SupervisedTransformerMixin, util.BaseEncoder):
     """CatBoost Encoding for categorical features.

     Supported targets: binomial and continuous.
@@ -202,10 +202,10 @@ def _transform(self, X, y=None):

         return X

-    def _more_tags(self) -> dict[str, bool]:
+    def __sklearn_tags__(self) -> util.EncoderTags:
         """Set scikit transformer tags."""
-        tags = super()._more_tags()
-        tags['predict_depends_on_y'] = True
+        tags = super().__sklearn_tags__()
+        tags.predict_depends_on_y = True
         return tags

     def _fit_column_map(self, series, y):
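
Note on the tag migration above: the hunk replaces a dict-returning `_more_tags` with a `__sklearn_tags__` method that mutates an attribute on a tags object. The pattern can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's code: `EncoderTags` here is a hypothetical dataclass standing in for `util.EncoderTags`, whose real definition is not shown in this diff, and `TargetAwareEncoder` is an invented name.

```python
from dataclasses import dataclass


@dataclass
class EncoderTags:
    """Hypothetical stand-in for util.EncoderTags (real fields not shown in the diff)."""

    predict_depends_on_y: bool = False


class BaseEncoder:
    """Hypothetical base class that supplies the default tags."""

    def __sklearn_tags__(self) -> EncoderTags:
        return EncoderTags()


class TargetAwareEncoder(BaseEncoder):
    """Invented example encoder that overrides a single tag."""

    def __sklearn_tags__(self) -> EncoderTags:
        # Start from the parent's tags, then set the attribute.
        # The old style was: tags['predict_depends_on_y'] = True
        tags = super().__sklearn_tags__()
        tags.predict_depends_on_y = True
        return tags


default_tags = BaseEncoder().__sklearn_tags__()
encoder_tags = TargetAwareEncoder().__sklearn_tags__()
print(default_tags.predict_depends_on_y, encoder_tags.predict_depends_on_y)
```

Compared with string keys in a dict, attribute access on a typed object lets a misspelled tag name be caught by a type checker or linter.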

category_encoders/count.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
__author__ = 'joshua t. dunn'
1212

1313

14-
class CountEncoder(util.BaseEncoder, util.UnsupervisedTransformerMixin):
14+
class CountEncoder( util.UnsupervisedTransformerMixin,util.BaseEncoder):
1515
"""Count encoding for categorical features.
1616
1717
For a given categorical feature, replace the names of the groups with the group counts.

category_encoders/glmm.py

Lines changed: 4 additions & 4 deletions
@@ -15,7 +15,7 @@
 __author__ = 'Jan Motl'


-class GLMMEncoder(util.BaseEncoder, util.SupervisedTransformerMixin):
+class GLMMEncoder( util.SupervisedTransformerMixin ,util.BaseEncoder):
     """Generalized linear mixed model.

     Supported targets: binomial and continuous.
@@ -164,10 +164,10 @@ def _transform(self, X, y=None):
         X = self._score(X, y)
         return X

-    def _more_tags(self) -> dict[str, bool]:
+    def __sklearn_tags__(self) -> util.EncoderTags:
         """Set scikit transformer tags."""
-        tags = super()._more_tags()
-        tags['predict_depends_on_y'] = True
+        tags = super().__sklearn_tags__()
+        tags.predict_depends_on_y = True
         return tags

     def _train(self, X, y):

category_encoders/hashing.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
__author__ = 'willmcginnis', 'LiuShulun'
1515

1616

17-
class HashingEncoder(util.BaseEncoder, util.UnsupervisedTransformerMixin):
17+
class HashingEncoder( util.UnsupervisedTransformerMixin,util.BaseEncoder):
1818
"""A multivariate hashing implementation with configurable dimensionality/precision.
1919
2020
The advantage of this encoder is that it does not maintain a dictionary of observed categories.

category_encoders/james_stein.py

Lines changed: 4 additions & 4 deletions
@@ -11,7 +11,7 @@
 __author__ = 'Jan Motl'


-class JamesSteinEncoder(util.BaseEncoder, util.SupervisedTransformerMixin):
+class JamesSteinEncoder( util.SupervisedTransformerMixin,util.BaseEncoder):
     """James-Stein estimator.

     Supported targets: binomial and continuous.
@@ -228,10 +228,10 @@ def _transform(self, X, y=None):
         X = self._score(X, y)
         return X

-    def _more_tags(self) -> dict[str, bool]:
+    def __sklearn_tags__(self) -> util.EncoderTags:
         """Set scikit transformer tags."""
-        tags = super()._more_tags()
-        tags['predict_depends_on_y'] = True
+        tags = super().__sklearn_tags__()
+        tags.predict_depends_on_y = True
         return tags

     def _train_pooled(self, X, y):
