I was running evaluation on the MS-MARCO-VN dataset (which is huge) on a container with 30 GB of disk space. After 5 hours the task failed because it ran out of disk.

I know I can always provision more disk space, but is it expected to use this much? Most of the usage comes from `datasets` caching. Should I disable caching, or is there a way to clean up the cache for each batch?
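For reference, a minimal sketch of the options I have in mind, assuming the HuggingFace `datasets` library ("ms-marco-vn" below is a placeholder, not the real repo id):

```python
from datasets import load_dataset, disable_caching

# Option 1: stream instead of downloading. An IterableDataset reads
# examples on the fly and never materializes Arrow files on disk.
stream = load_dataset("ms-marco-vn", split="train", streaming=True)
for example in stream:
    ...  # evaluate one example at a time

# Option 2: stop .map()/.filter() results from accumulating as named
# cache files (the initial download is still cached as usual).
disable_caching()
ds = load_dataset("ms-marco-vn", split="train")

# Option 3: drop this dataset's stale cache files between batches;
# keeps the file currently in use and returns the number deleted.
removed = ds.cleanup_cache_files()
```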
Replies: 2 comments 8 replies

Hmm, yeah, that seems like too much. @nqbao, can I ask you to check how much is actually used? The dataset itself is 3.58 GB on Hugging Face.
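(A minimal way to check, assuming the default cache location; adjust the path if you've set `HF_DATASETS_CACHE`:)

```python
from pathlib import Path

# Sum the sizes of all files under the default datasets cache directory.
cache_dir = Path.home() / ".cache" / "huggingface" / "datasets"
total = sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())
print(f"{total / 1e9:.2f} GB in {cache_dir}")
```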
2 replies
I think we can change
6 replies