Open
Description
This is about the Hugging Face datasets.
Many of them are compressed CSV/JSON dumps, which cannot be viewed or queried in the Hugging Face UI. Have you considered using the Parquet or DuckDB file formats instead?
I have some scripts that process the llama3*.zip files into Parquet/DuckDB. They produce an entity -> event -> event graph. I'm not sure about the concepts graph.
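To illustrate the kind of conversion being suggested, here is a minimal sketch. It is not the scripts mentioned above. It assumes the archives contain plain CSV files, and the function names `extract_csv_members` and `csv_to_parquet` are hypothetical. The Parquet step relies on the `duckdb` Python package being installed.

```python
import zipfile
from pathlib import Path


def extract_csv_members(zip_path, out_dir):
    """Extract every .csv member of a zip archive and return the extracted paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Assumption: the dumps are CSV; JSON members would need a separate branch.
            if name.lower().endswith(".csv"):
                extracted.append(Path(archive.extract(name, out_dir)))
    return extracted


def csv_to_parquet(csv_path, parquet_path):
    """Convert one CSV file to Parquet with DuckDB (needs `pip install duckdb`)."""
    import duckdb  # imported lazily so extraction works without it

    # The paths are inlined into the SQL text, so escape single quotes.
    src = str(csv_path).replace("'", "''")
    dst = str(parquet_path).replace("'", "''")
    con = duckdb.connect()  # in-memory database
    try:
        con.execute(
            f"COPY (SELECT * FROM read_csv_auto('{src}')) TO '{dst}' (FORMAT PARQUET)"
        )
    finally:
        con.close()
```

A typical run would call `extract_csv_members` on one archive and then `csv_to_parquet` on each returned path, writing a `.parquet` file next to each CSV. Parquet files uploaded this way should be browsable in the Hugging Face dataset viewer, which is the main point of the request.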