Add initial_weights support for fine-tuning #7

Merged
sirmarcel merged 1 commit into main from feature/initial-weights
Mar 20, 2026
@sirmarcel
Contributor

Summary

  • Adds initial_weights setting to settings.yaml — loads model weights from a previous run's checkpoint (.msgpack) while starting optimizer, step counter, and data iterator fresh
  • Recursive merge_params supports partial architecture matches (new layers keep random init)
  • Records initial_weights path in saved config for reproducibility
  • Fine-tuning example (my_experiment_finetune/) and tox smoke test
  • Unit tests for merge_params
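
For readers who have not opened the diff, the partial-match behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: it assumes parameters are nested dicts, that the function takes the freshly initialised tree first and the loaded checkpoint tree second, and that checkpoint keys absent from the new model are simply ignored. The argument names and the example keys (`encoder`, `new_head`, `old_head`) are hypothetical.

```python
from collections.abc import Mapping


def merge_params(fresh, loaded):
    """Return a new tree shaped like `fresh`, using values from `loaded` where paths match.

    Assumption: both arguments are nested dicts whose leaves are arrays
    (plain lists in this sketch). This is an illustration only.
    """
    merged = {}
    for key, fresh_value in fresh.items():
        if key not in loaded:
            # Layer exists only in the new model: keep its random init.
            merged[key] = fresh_value
            continue
        loaded_value = loaded[key]
        fresh_is_tree = isinstance(fresh_value, Mapping)
        loaded_is_tree = isinstance(loaded_value, Mapping)
        if fresh_is_tree and loaded_is_tree:
            # Both sides are subtrees: recurse.
            merged[key] = merge_params(fresh_value, loaded_value)
        elif fresh_is_tree or loaded_is_tree:
            # Structure mismatch (subtree vs. leaf): keep the fresh value.
            merged[key] = fresh_value
        else:
            # Matching leaf: take the checkpoint value.
            merged[key] = loaded_value
    return merged


# Hypothetical example: the checkpoint has an encoder and an old head,
# the new model has the same encoder plus a new head.
fresh = {"encoder": {"w": [0.0, 0.0], "b": [0.0]}, "new_head": {"w": [0.5]}}
loaded = {"encoder": {"w": [1.0, 2.0], "b": [3.0]}, "old_head": {"w": [9.0]}}
merged = merge_params(fresh, loaded)
```

In this sketch `merged["encoder"]` comes from the checkpoint, `merged["new_head"]` keeps its fresh initialisation, and `old_head` is dropped. A real implementation would probably also compare array shapes before copying a leaf; that check is left out here.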

Test plan

  • uvx tox -e tests — all 22 tests pass
  • uvx ruff format . && uvx ruff check --fix . — clean
  • Fine-tuning example runs end-to-end (DATASETS=.. lorem-train from my_experiment_finetune/)

🤖 Generated with Claude Code

…ghts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@PicoCentauri left a comment

LGTM!

@sirmarcel sirmarcel merged commit 35e1889 into main Mar 20, 2026
5 checks passed
2 participants