Hi,
I had an issue where the resampled data contained lots of NaN values, which stopped SMOGN from running.
For anyone who has seen it, this is the error: oops! synthetic data contains missing values
While debugging I found that the NaN values only occur in categorical variables (columns of dtype category).
Two fixes for anyone encountering the problem:
- Fix on data side
Change every column in your dataframe of type category to type object:
data[column] = data[column].astype("object")
- Fix on SMOGN side
In smogn.over_sampling, change
nom_dtypes = ["object", "bool", "datetime64"]
to
nom_dtypes = ["object", "bool", "datetime64", "category"]
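For the data-side fix, here is a minimal sketch of converting every category column in one go rather than one at a time. The helper name categories_to_object and the example dataframe are made up for illustration; only pandas calls are used, and nothing here touches SMOGN itself.

```python
import pandas as pd


def categories_to_object(data: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `data` with every category-typed column cast to object.

    Hypothetical helper, not part of SMOGN.
    """
    out = data.copy()
    # Pick out only the columns whose dtype is the pandas "category" dtype.
    cat_cols = out.select_dtypes(include="category").columns
    for column in cat_cols:
        out[column] = out[column].astype("object")
    return out


# Small made-up dataframe with one categorical and one numeric column.
df = pd.DataFrame({
    "color": pd.Categorical(["red", "blue", "red"]),
    "size": [1.0, 2.5, 3.0],
})
converted = categories_to_object(df)
print(converted.dtypes)
```

The converted dataframe can then be passed to SMOGN in place of the original. Working on a copy leaves the original dataframe and its category columns untouched, in case they are needed elsewhere.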
Took me a bit of time to figure it out. Hope it helps!