fix: correct CSV extraction in scaling_laws.sh #580
Open
geopti wants to merge 2 commits into karpathy:master
Conversation
Two bugs caused the parameter columns and tokens_trained to be silently
empty or wrong in the results CSV:
1. Parameter grep patterns did not account for the padded key format.
base_train.py prints parameters as `{key:24s}: {value:,}`, e.g.
`wte : 33,554,432`, so patterns like `grep "wte:"`
never matched. Fixed by using `grep -P "wte\s+:"` to handle the spaces.
2. tokens_trained was hardcoded as `NUM_ITERS * 524288`, but the batch
size is auto-computed by base_train.py and may differ from 524288
depending on the FLOPs budget and model size. Fixed by extracting the
actual value from the log line "Total number of training tokens: X".
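A minimal sketch of both extractions against a faked log excerpt (the log contents below are illustrative, mimicking the `{key:24s}: {value:,}` format; `grep -P` assumes GNU grep):

```shell
# Fake log excerpt in base_train.py's padded "{key:24s}: {value:,}" style
log=$(printf '%s\n' \
  'wte                     : 33,554,432' \
  'Total number of training tokens: 5,242,880')

# Bug 1 fix: keys are padded, so match "wte" + whitespace + colon,
# not the literal "wte:"
wte=$(printf '%s\n' "$log" | grep -P '^wte\s+:' \
  | awk -F: '{gsub(/[ ,]/, "", $2); print $2}')

# Bug 2 fix: read the actual token count from the log
# instead of computing NUM_ITERS * 524288
tokens=$(printf '%s\n' "$log" | grep 'Total number of training tokens:' \
  | awk -F: '{gsub(/[ ,]/, "", $2); print $2}')

echo "$wte $tokens"
```

The `awk` step strips the padding spaces and thousands separators so the CSV gets plain integers.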
…g batch size

Same bug as `scaling_laws.sh`: `TOKENS_TRAINED` was computed as `NUM_ITERS * 524288`, hardcoding the default total batch size. When `base_train` auto-computes a different batch size, the value is wrong. Fix by reading "Total number of training tokens:" directly from the training log.
Bug fixes

Two bugs in `runs/scaling_laws.sh` cause the results CSV to be silently wrong.

Bug 1 — parameter columns always empty

`base_train.py` prints parameter counts with 24-character key padding (`{key:24s}: {value:,}`, e.g. `wte                     : 33,554,432`). The grep patterns used `"wte:"`, `"value_embeds:"`, etc., which never match because there are spaces between the key and the colon, so some parameter columns in the CSV are silently empty.

Fix: use `grep -P "wte\s+:"` to handle the padding.

Bug 2 — tokens_trained always wrong

`tokens_trained` was computed as `NUM_ITERS * 524288`, hardcoding the default batch size. But `base_train.py` auto-computes the optimal batch size based on the FLOPs budget and model size; it may differ from 524288, especially for small models at small FLOPs budgets.

Fix: extract `tokens_trained` directly from the log line `"Total number of training tokens: X"`.
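As a sketch of the Bug 2 fix (variable and file names here are hypothetical, not necessarily the script's actual ones):

```shell
# Hypothetical log file standing in for the real base_train.py output.
LOG_FILE=$(mktemp)
printf 'Total number of training tokens: 4,194,304\n' > "$LOG_FILE"

# Before (wrong whenever base_train.py picks a non-default batch size):
#   TOKENS_TRAINED=$((NUM_ITERS * 524288))

# After: read the value the training run actually reports,
# stripping the thousands separators for the CSV.
TOKENS_TRAINED=$(grep 'Total number of training tokens:' "$LOG_FILE" \
  | awk -F: '{gsub(/[ ,]/, "", $2); print $2}')

echo "$TOKENS_TRAINED"
rm -f "$LOG_FILE"
```

This keeps the CSV consistent with whatever batch size the training run actually used.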