use 1-sqrt warmdown shape for LR schedule#513

Open
spjosyula wants to merge 1 commit into karpathy:master from spjosyula:sqrt-warmdown

Conversation

@spjosyula
Contributor

@spjosyula spjosyula commented Feb 8, 2026

Replace the linear warmdown with a 1-sqrt cooldown shape (LR multiplier = 1 - sqrt(x)) in base_train.py and chat_sft.py. chat_rl.py is left unchanged since its schedule is pure decay with no stable phase.
The warmdown shape has never been changed since it was inherited from
modded-nanogpt: the warmdown ratio was swept (0.2 → 0.4 → 0.5), but the linear
curve itself was kept as-is. 1-sqrt drops the LR faster early in warmdown
and then flattens, which should reduce gradient noise sooner while keeping
more steps at a low-but-nonzero LR for convergence.
Needs a d12 run to validate.
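For concreteness, the two shapes being compared can be sketched as below. This is an illustrative Python sketch, not the actual base_train.py code; the function names and the 0.5 warmdown ratio are assumptions for the example.

```python
import math

def lr_multiplier_sqrt(step, total_steps, warmdown_ratio=0.5):
    """Proposed schedule: stable phase, then 1-sqrt warmdown."""
    warmdown_steps = int(total_steps * warmdown_ratio)
    start = total_steps - warmdown_steps
    if step < start:
        return 1.0  # stable phase: full LR
    x = (step - start) / warmdown_steps  # progress through warmdown, 0 -> 1
    return 1.0 - math.sqrt(x)  # drops fast early, flattens near the end

def lr_multiplier_linear(step, total_steps, warmdown_ratio=0.5):
    """Current schedule: stable phase, then linear warmdown."""
    warmdown_steps = int(total_steps * warmdown_ratio)
    start = total_steps - warmdown_steps
    if step < start:
        return 1.0
    x = (step - start) / warmdown_steps
    return 1.0 - x
```

Both curves share the same endpoints (multiplier 1.0 at the start of warmdown, 0.0 at the final step); in between, 1-sqrt sits strictly below linear, e.g. halfway through warmdown the multiplier is 1 - sqrt(0.5) ≈ 0.29 versus 0.5 for linear.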

Collaborator

@svlandeg svlandeg left a comment


Have you done any experiments to validate the change?

@spjosyula
Contributor Author

> Have you done any experiments to validate the change?

No experiments yet. As noted in the description, this needs a d12 run.

Why I think it's worth one:

  • The LR multiplier applies to all param groups uniformly, so the shape change affects every parameter across
    both AdamW and Muon
  • Warmdown is 50% of training: that's a large surface for the curve shape to matter
  • 1-sqrt outperformed linear cooldowns in WSD schedules (Hägele et al., NeurIPS 2024, already cited in the code at L346)

The caveat is that those results were obtained with AdamW only, and if the new shape shifts the optimal
warmdown ratio, the ratio may need re-sweeping too.
