Open
Description
Expected Behavior
Even if category_encoders.one_hot.OneHotEncoder has no features to encode, we expect it to convert a pd.DataFrame into a numpy.ndarray when the parameter return_df=False is set.
Actual Behavior
When category_encoders.one_hot.OneHotEncoder receives a DataFrame containing only numerical features, so that the parameter cols is empty, and return_df=False is set, the fit_transform method returns a pd.DataFrame object instead of a numpy.ndarray.
Steps to Reproduce the Problem
import numpy as np
import pandas as pd
from category_encoders.one_hot import OneHotEncoder
rng = np.random.RandomState(42)
This works
n_rows = 100
col1 = rng.rand(n_rows) * 100
col2 = rng.randint(1, 100, n_rows)
col3 = rng.choice([True, False], n_rows)
modalities = ['A', 'B', 'C', 'D']
col4 = rng.choice(modalities, n_rows)
df = pd.DataFrame({
    'Numeric1': col1,
    'Numeric2': col2,
    'Boolean': col3,
    'Object': col4
})
encoder = OneHotEncoder(
    cols=df.select_dtypes(include=["object", "bool"]).columns,
    return_df=False,
    handle_missing='return_nan'
)
X = encoder.fit_transform(df)
type(X)
Out: numpy.ndarray
This is the unexpected behavior
data = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]], size=200)
df = pd.DataFrame(data=data, columns=["Column 1", "Column 2"])
encoder = OneHotEncoder(
    cols=df.select_dtypes(include=["object", "bool"]).columns,
    return_df=False,
    handle_missing='return_nan'
)
X = encoder.fit_transform(df)
type(X)
Out: pandas.core.frame.DataFrame
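Possible workaround
Until this is addressed, the output type can be normalized on the caller's side. The sketch below is only an illustration and is not part of category_encoders: ensure_ndarray is a hypothetical helper name, and the small DataFrame stands in for what fit_transform returns in the all-numeric case.

```python
import numpy as np
import pandas as pd


def ensure_ndarray(X):
    """Return X as a numpy.ndarray, whatever container the encoder handed back.

    Hypothetical helper (an assumption, not a category_encoders API).
    """
    if isinstance(X, (pd.DataFrame, pd.Series)):
        # pandas objects expose to_numpy() for conversion to an ndarray
        return X.to_numpy()
    # Already array-like (for example the ndarray returned when cols is non-empty)
    return np.asarray(X)


# Stand-in for the encoder output when cols is empty and return_df=False:
# a DataFrame is returned even though an ndarray was requested.
encoded = pd.DataFrame({"Column 1": [0.1, 0.2], "Column 2": [1.5, 2.5]})
X = ensure_ndarray(encoded)
print(type(X).__name__, X.shape)
```

In the reproduction above, this would be used as X = ensure_ndarray(encoder.fit_transform(df)), so that downstream code sees the same type in both cases.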
Specifications
- Version: 2.6.3
- Platform: macOS Sonoma 14.6.1
 