How to shuffle & optimize Hugging Face datasets for LLM pre-training with StreamingDataset ? #1229

Triggered via issue on August 22, 2025 17:20
Status: Success
Total duration: 5s

greetings.yml

on: issues
greeting (3s)

Annotations

1 warning
greeting
Unexpected input(s) 'repo-token', 'issue-message', 'pr-message', valid inputs are ['issue_message', 'pr_message', 'repo_token']
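
The warning suggests the greeting step passes hyphenated input names where the action expects underscored ones. A minimal sketch of the corresponding `with:` block in greetings.yml follows, assuming the step only needs its keys renamed; the token expression and the message strings are placeholders that do not come from this run, and the action the step uses is not shown here, so it is left out.

```yaml
# Hypothetical excerpt of the greeting step in greetings.yml.
# Only the input names are taken from the warning; all values are assumptions.
with:
  repo_token: ${{ secrets.GITHUB_TOKEN }}     # was: repo-token
  issue_message: "placeholder issue message"  # was: issue-message
  pr_message: "placeholder PR message"        # was: pr-message
```

The run still reports Success because an unexpected input is only a warning. The hyphenated names are reported as unexpected, though, so the action may not be receiving them as the inputs it declares.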