Commit 8e56d09

fix mistakes and add acknowledgement
1 parent 53205e2 commit 8e56d09

1 file changed: +7 -5 lines changed

_posts/2022-11-08-pandas-dataframe-output-for-sklearn-transformer.md

Lines changed: 7 additions & 5 deletions
@@ -1,5 +1,5 @@
 ---
-title: "Pandas DataFrame output for Sklearn Transformers"
+title: "Pandas DataFrame Output for sklearn Transformers"
 date: November 8, 2022
 categories:
 - Technical
@@ -22,21 +22,23 @@ postauthors:
 <iframe width="560" height="315" src="https://www.youtube.com/embed/5bCg8VfX2x8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
 
 ## Upcoming feature in release 1.2
-Starting with the next release of [scikit-learn](https://github.com/scikit-learn/scikit-learn) (v1.2), pandas dataframe output will be available for all sklearn transformers! This will make running pipelines on dataframes much easier and provide better ways to track feature names. Previously, mapping a transformed output back into columns would be cumbersome as it might not be a one-to-one mapping in cases of complex preprocessing (e.g., polynomial features ).
+Starting with the next release of [scikit-learn](https://github.com/scikit-learn/scikit-learn) (v1.2), pandas dataframe output will be available for all sklearn transformers! This will make running pipelines on dataframes much easier and provide better ways to track feature names. Previously, mapping a transformed output back into columns would be cumbersome as it might not be a one-to-one mapping in cases of complex preprocessing (e.g., polynomial features).
 
 The pandas dataframe output feature for transformers solves this by tracking features generated from pipelines automatically. The transformer output format can be configured explictly for either **numpy** or **pandas** output formats as shown in [sklearn.set_config](https://scikit-learn.org/dev/modules/generated/sklearn.set_config.html#sklearn.set_config) and the sample code below.
 ```python
 from sklearn import set_config
 set_config(transform_output = "pandas")
 ```
 
-Please see the sample notebook and documentation for a more detailed example and usage.
+See the sample notebook, [pandas-dataframe-output-for-sklearn-transformer.ipynb](https://github.com/scikit-learn/blog/blob/main/assets/notebooks/sklearn-pandas-df-output.ipynb) and documentation for a more detailed example and usage.
 
 ## Links to documentation and example notebook:
 - [Pandas output for transformers documentation](https://scikit-learn.org/dev/auto_examples/miscellaneous/plot_set_output.html#sphx-glr-auto-examples-miscellaneous-plot-set-output-py)
-- [Sample notebook](https://github.com/scikit-learn/blog/blob/main/assets/notebooks/sklearn-pandas-df-output.ipynb)
+- [pandas-dataframe-output-for-sklearn-transformer.ipynb](https://github.com/scikit-learn/blog/blob/main/assets/notebooks/sklearn-pandas-df-output.ipynb)
 
 
 ## Reporting bugs:
 We'd love your feedback on this. In case of any suggestions or bugs, please report them at
-[scikit-learn issues](https://github.com/scikit-learn/scikit-learn/issues)
+[scikit-learn issues](https://github.com/scikit-learn/scikit-learn/issues)
+
+Thanks 🙏🏾 to maintainers: [**Thomas J. Fan**](https://github.com/thomasjpfan), [**Guillaume Lemaitre**](https://github.com/glemaitre) , [**Christian Lorentzen**](https://github.com/lorentzenchr) !!
