Closed
Labels
Enhancement, MultiIndex, Needs Info (Clarification about behavior needed to assess issue)
Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
I wish I could smoothly apply sklearn.model_selection.ShuffleSplit to a pandas.DataFrame that has a pandas.MultiIndex. I tried to apply it, but it throws an unexpected KeyError.
Feature Description
This pseudocode shows how I would expect it to work:
from sklearn.model_selection import ShuffleSplit
from sktime.datatypes import get_examples
df = get_examples(mtype="pd-multiindex", as_scitype="Panel")[0]
splitter = ShuffleSplit(n_splits=3, random_state=42)
split = splitter.split(df.index.levels[0])
train_indexes = []
test_indexes = []
for train_index, test_index in split:
    train_indexes.append(train_index)
    test_indexes.append(test_index)
x_train, x_test = (df.loc[train_indexes[0]], df.loc[test_indexes[0]])
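One thing that may matter here: `ShuffleSplit.split` yields integer positions into whatever was passed to it, not index labels, so feeding those positions to `df.loc` only works when the labels happen to be 0, 1, 2, and so on. Whether that explains the KeyError above is not confirmed. The following is a minimal sketch of a workaround under that assumption. It uses a small made-up frame (the level names `instance` and `time` and the column `value` are hypothetical) instead of the sktime example data, and it maps positions back to labels before selecting rows.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import ShuffleSplit

# Hypothetical panel data: 5 instances with 3 time points each.
index = pd.MultiIndex.from_product(
    [["a", "b", "c", "d", "e"], range(3)], names=["instance", "time"]
)
df = pd.DataFrame({"value": np.arange(len(index))}, index=index)

# Outer-level labels that are actually present in the frame.
row_instances = df.index.get_level_values("instance")
instances = row_instances.unique()

splitter = ShuffleSplit(n_splits=3, test_size=0.4, random_state=42)
folds = []
for train_pos, test_pos in splitter.split(instances):
    # split() returns integer positions, so translate them to labels first.
    train_labels = instances[train_pos]
    test_labels = instances[test_pos]
    # Select every row whose outer-level label belongs to the fold.
    x_train = df[row_instances.isin(train_labels)]
    x_test = df[row_instances.isin(test_labels)]
    folds.append((x_train, x_test))

x_train, x_test = folds[0]
```

Splitting the unique outer-level labels rather than the rows keeps all time points of one instance on the same side of the split.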
Alternative Solutions
Currently I only do a train_test_split as described here, without any cross-validation.
Additional Context
It would also be nice to extend this to the other split methods of sklearn.
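For other splitters, scikit-learn's group-aware classes might already cover part of this. A sketch, assuming the outer index level can serve as the group label: `GroupKFold` accepts a `groups` array and yields row positions, which can be used with `df.iloc`. The frame below is made up for illustration (level names `instance` and `time` are hypothetical).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupKFold

# Hypothetical panel data: 6 instances with 2 time points each.
index = pd.MultiIndex.from_product(
    [["a", "b", "c", "d", "e", "f"], range(2)], names=["instance", "time"]
)
df = pd.DataFrame({"value": np.arange(len(index))}, index=index)

# One group label per row, taken from the outer index level.
groups = df.index.get_level_values("instance").to_numpy()

cv = GroupKFold(n_splits=3)
group_folds = []
for train_pos, test_pos in cv.split(df, groups=groups):
    # Positions refer to rows here, so iloc is the matching accessor.
    group_folds.append((df.iloc[train_pos], df.iloc[test_pos]))
```

Because the splitter works on row positions and group labels, no instance ends up in both the training and the test part of a fold.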