Commit 9e5b0e6

resolves the ValueError: Unable to avoid copy while creating an array (#7831)
* Fix argument passing in stratified shuffle split

  NumPy 2.0 changed the behavior of the `copy=False` parameter to be stricter. When `train_test_split` converted Arrow arrays to NumPy format for stratification, it triggered this error for non-contiguous arrays. Using `np.asarray()` allows copying when necessary, which is the recommended migration path per NumPy 2.0 documentation.

* make style

---------

Co-authored-by: Quentin Lhoest <[email protected]>
1 parent 627ed2e commit 9e5b0e6

1 file changed, +1 -1 lines changed

src/datasets/arrow_dataset.py

Lines changed: 1 addition & 1 deletion
@@ -4868,7 +4868,7 @@ def train_test_split(
     try:
         train_indices, test_indices = next(
             stratified_shuffle_split_generate_indices(
-                self.with_format("numpy")[stratify_by_column], n_train, n_test, rng=generator
+                np.asarray(self.with_format("numpy")[stratify_by_column]), n_train, n_test, rng=generator
             )
         )
     except Exception as error:
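
The behavior change described in the commit message can be reproduced outside of `datasets`. The following is a minimal sketch, not the library's code: the `labels` list and the `strict_no_copy` helper are hypothetical names used only for illustration, and a plain Python list stands in for any input that cannot be turned into an array without copying. On NumPy 2.x, `np.array(..., copy=False)` raises `ValueError` for such input, whereas on NumPy 1.x it copies silently; `np.asarray()` copies only when it has to on both.

```python
import numpy as np

# Hypothetical stratification labels. A Python list cannot be turned into
# an ndarray without copying, so it stands in for the problematic input.
labels = [0, 1, 0, 1, 1, 0]


def strict_no_copy(values):
    """Return True if np.array(values, copy=False) succeeds, False if it raises ValueError."""
    try:
        np.array(values, copy=False)
        return True
    except ValueError:
        # NumPy 2.x refuses to copy when copy=False is requested.
        return False


# np.asarray copies only when needed, which is the approach the fix takes.
as_array = np.asarray(labels)

# An input that is already an ndarray is returned as-is, without a copy.
same_object = np.asarray(as_array) is as_array

print("NumPy version:", np.__version__)
print("copy=False accepted:", strict_no_copy(labels))
print("np.asarray result:", as_array)
print("no copy for existing ndarray:", same_object)
```

Wrapping the column in `np.asarray()` therefore avoids an unnecessary copy when the value is already an array, and falls back to copying instead of raising when it is not.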
