- Name:
- Source:
- Size: <e.g., 300GB, 440k videos>
- License:
- Path:
- Access Method: <download/access command>
- Train/Val/Test Ratio: <e.g., 70/15/15>
- Counts:
- Train:
- Val:
- Test:
- Method: <how split was created, including seed/stratification and grouping rules>
- <step 1>
- <step 2>
- <step 3>
- Updated
<path/to/file> - Added
<path/to/file> - Modified
<path/to/file>
# Step 1: Download/locate dataset
<command>
# Step 2: Split dataset
<command>
# Step 3: Run preprocessing
<command>- Ran preprocessing end-to-end
- Verified split counts match expectations
- Verified processed data looks correct (sample check)
- No dataset files committed
- Documentation updated
<gotchas, assumptions, and important details for teammates>