pandas already works reasonably well with scikit-learn and most other popular machine learning libraries: a pandas DataFrame is almost always accepted as an input data structure.
However, the output of scikit-learn transformers is a plain NumPy array, so the column names of the input data are lost. Preserving the column names through the ML pipeline would be extremely useful to data scientists for optimizing, understanding, and debugging data science pipelines.
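
A minimal sketch of the problem, assuming scikit-learn's default output configuration. The toy DataFrame and its column names (`age`, `income`) are hypothetical and only there for illustration. `StandardScaler` stands in for any transformer.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({"age": [25.0, 32.0, 47.0], "income": [40.0, 55.0, 90.0]})

scaler = StandardScaler()
# A DataFrame goes in, but with the default configuration a NumPy array comes out.
scaled = scaler.fit_transform(df)

print(type(scaled).__name__)       # class name of the returned object
print(hasattr(scaled, "columns"))  # whether column names survived

# Manual workaround: re-attach the names by hand. This is only safe for
# transformers that keep the number and order of columns unchanged.
restored = pd.DataFrame(scaled, columns=df.columns, index=df.index)
print(list(restored.columns))
```

The manual re-wrapping breaks down as soon as a transformer adds, drops, or reorders columns (for example, one-hot encoding or feature selection), which is why having the pipeline carry the names itself would be preferable.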