Commit 2aaa87f
Phase 2: experiment tracking, data tools, model evaluation
- Add SQLite experiment tracker (~/.soup/experiments.db) with auto-logging
  of config, per-step metrics, hardware info, and eval results
- Add soup runs commands: list, show (with plotext loss curves), compare, delete
- Integrate tracker into soup train (auto start_run/finish_run/fail_run)
- Add soup data convert (alpaca/sharegpt/chatml bidirectional conversion)
- Add soup data merge (concatenate datasets with optional shuffle)
- Add soup data dedup (MinHash near-duplicate removal via datasketch)
- Add soup data stats (length percentiles, token counts, language detection)
- Add soup eval (lm-evaluation-harness wrapper with tracker integration)
- Add reverse format conversion: messages_to_format() in data/formats.py
- Add extended_stats() to data/validator.py
- Update monitoring callback to log metrics to tracker
- Add plotext to deps, datasketch as optional [data] dep
- Update README and CLAUDE.md with Phase 2 docs
- 70 tests passing, ruff clean
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent a2a0f2c commit 2aaa87f
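The tracker lifecycle named in the message (start_run/finish_run/fail_run around a training loop, with per-step metrics written to SQLite) can be sketched roughly as below. This is a minimal illustration, not the code in this commit: the class name `ExperimentTracker`, the table schema, the method signatures, `log_metrics`, `status`, and `train_with_tracking` are all assumptions, and an in-memory database stands in for ~/.soup/experiments.db.

```python
import json
import sqlite3
import time
import uuid


class ExperimentTracker:
    """Hypothetical minimal tracker; schema and signatures are assumptions."""

    def __init__(self, db_path=":memory:"):
        # The commit stores runs in ~/.soup/experiments.db; ":memory:" keeps
        # this sketch free of side effects.
        self.conn = sqlite3.connect(db_path)
        self.conn.executescript(
            """
            CREATE TABLE IF NOT EXISTS runs (
                run_id TEXT PRIMARY KEY,
                status TEXT NOT NULL,
                config TEXT NOT NULL,
                started_at REAL NOT NULL,
                finished_at REAL,
                error TEXT
            );
            CREATE TABLE IF NOT EXISTS metrics (
                run_id TEXT NOT NULL REFERENCES runs(run_id),
                step INTEGER NOT NULL,
                name TEXT NOT NULL,
                value REAL NOT NULL
            );
            """
        )

    def start_run(self, config):
        """Insert a 'running' row with the serialized config; return its id."""
        run_id = uuid.uuid4().hex[:12]
        self.conn.execute(
            "INSERT INTO runs (run_id, status, config, started_at) "
            "VALUES (?, 'running', ?, ?)",
            (run_id, json.dumps(config), time.time()),
        )
        self.conn.commit()
        return run_id

    def log_metrics(self, run_id, step, metrics):
        """Store one row per metric name for the given step."""
        rows = [(run_id, step, name, float(value)) for name, value in metrics.items()]
        self.conn.executemany(
            "INSERT INTO metrics (run_id, step, name, value) VALUES (?, ?, ?, ?)",
            rows,
        )
        self.conn.commit()

    def finish_run(self, run_id):
        self._close(run_id, "completed", None)

    def fail_run(self, run_id, error):
        self._close(run_id, "failed", error)

    def _close(self, run_id, status, error):
        self.conn.execute(
            "UPDATE runs SET status = ?, finished_at = ?, error = ? WHERE run_id = ?",
            (status, time.time(), error, run_id),
        )
        self.conn.commit()

    def status(self, run_id):
        row = self.conn.execute(
            "SELECT status FROM runs WHERE run_id = ?", (run_id,)
        ).fetchone()
        return row[0] if row else None


def train_with_tracking(tracker, config, losses):
    """Stand-in for a training loop: log each loss, mark the run done or failed."""
    run_id = tracker.start_run(config)
    try:
        for step, loss in enumerate(losses, start=1):
            tracker.log_metrics(run_id, step, {"loss": loss})
        tracker.finish_run(run_id)
    except Exception as exc:
        # Record the failure before re-raising so the run is never left 'running'.
        tracker.fail_run(run_id, str(exc))
        raise
    return run_id
```

Wrapping the loop in try/except means a crashed run is recorded as failed instead of staying in the running state, which is presumably what a list or compare view over runs would rely on.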
File tree
18 files changed, +2005 -47 lines changed
- soup_cli
- commands
- data
- experiment
- monitoring
- trainer
- tests