
early_stopping_best_model_checkpointing #41

Open
thanay-sisir wants to merge 2 commits into SakanaAI:main from thanay-sisir:best_model_checkpoint

Conversation

Contributor

@thanay-sisir thanay-sisir commented Feb 4, 2026

Summary

  • Add optional early stopping via --early_stopping_patience N flag (disabled by default with -1)
  • Add optional best model checkpointing via --save_best_checkpoint flag (disabled by default)
  • Applied to all 6 training scripts: image_classification, mazes, parity, qamnist, sort, and rl
  • Default behavior is completely unchanged; both features are opt-in only
  • Backward-compatible checkpoint loading using .get() with defaults
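The patience mechanism described above can be sketched roughly as follows. This is an illustrative reconstruction, not code from the PR diff; the function and variable names are hypothetical.

```python
# Hypothetical sketch of the --early_stopping_patience behavior described
# in the summary; names are illustrative, not taken from the actual PR.
def make_early_stopper(patience):
    """Return a callable that tracks the best metric seen so far and
    signals when training should stop.

    patience == -1 disables early stopping entirely (the default).
    """
    state = {"best": float("-inf"), "bad_evals": 0}

    def should_stop(metric):
        if patience == -1:
            return False
        if metric > state["best"]:
            # New best: reset the counter of non-improving validations.
            state["best"] = metric
            state["bad_evals"] = 0
        else:
            state["bad_evals"] += 1
        return state["bad_evals"] >= patience

    return should_stop
```

With `patience=5`, training stops after five consecutive validations without improvement, matching the flag semantics described above.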

Test plan

  • Verify all 6 training scripts parse correctly
  • Run without new flags to confirm default behavior is unchanged
  • Run with --early_stopping_patience 5 --save_best_checkpoint to verify early stopping triggers and best_checkpoint.pt is saved
  • Run golden tests: python -m pytest tests/tests.py

Add two new opt-in CLI flags to all 6 training scripts:
- --early_stopping_patience N: stop training if test metric does not
  improve for N consecutive validations (-1 to disable, default)
- --save_best_checkpoint: save best model as best_checkpoint.pt

Both features are disabled by default so existing behavior is
completely unchanged. Checkpoint loading is backward compatible
using .get() with defaults.
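The `.get()`-based backward compatibility mentioned above can be sketched like this. The key names here are hypothetical placeholders, not necessarily the ones used in the PR.

```python
# Hypothetical sketch of backward-compatible checkpoint loading: older
# checkpoints saved before this PR lack the new keys, so .get() supplies
# safe defaults. Key names are illustrative, not taken from the actual PR.
def load_training_state(checkpoint):
    """Restore early-stopping state from a checkpoint dict, tolerating
    checkpoints that predate the new fields."""
    best_metric = checkpoint.get("best_metric", float("-inf"))
    evals_without_improvement = checkpoint.get("evals_without_improvement", 0)
    return best_metric, evals_without_improvement
```

An old checkpoint containing only model weights loads with the defaults, so resuming from it behaves exactly as before the change.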
