Skip to content
Discussion options

You must be logged in to vote

It mixes

def merge_datasets(datasets: list[Dataset], cfg: DictDefault) -> Dataset:
"""Merge multiple datasets into one with optional shuffling.
Args:
datasets: List of datasets to merge.
cfg: Configuration object containing shuffle settings.
Returns:
Merged dataset.
"""
if len(datasets) == 1:
ds = datasets[0]
# Do not shuffle if curriculum sampling is enabled or
# shuffle_merged_datasets is disabled
if cfg.curriculum_sampling or not cfg.shuffle_merged_datasets:
return ds
return ds.shuffle(seed=cfg.seed)
L…

Replies: 2 comments 1 reply

Comment options

You must be logged in to vote
0 replies
Comment options

You must be logged in to vote
1 reply
@NanoCode012
Comment options

Answer selected by medemi68
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet
2 participants