-
Notifications
You must be signed in to change notification settings - Fork 233
Closed
Labels
questionFurther information is requestedFurther information is requested
Description
The motivation behind these kind of "datasets" is very odd imo.
Why not just enforce a single dataset class - and call it a day! Anyone can write a simple script to download a dataset from HF and convert it to open-ai format, right? Also, there is almost zero usecase where you just take one dataset and train on it (at least to build high quality aligned models) its always a combination of a huge set of datasets, each with different formats etc. This is all data-munging work that a toolkit should not be entangled in.
Premature notions of "convenience" just end up being code debt.
Metadata
Metadata
Assignees
Labels
questionFurther information is requestedFurther information is requested