Upgrade data mixer, deps, and scripts #221

lewtun · 2025-07-24T01:19:14Z

This PR is a major refactor to bring the project in line with recent versions of TRL and to enable support for the SmolLM3 recipe. Some things may differ slightly from before (e.g. the train/test splits from the data mixture), but overall functionality should be the same.

TODO

Test SFT and DPO work with Zephyr
Fix the SmolLM3 recipe to use public models and test each step runs
Refactor ORPO and test it works
Update all remaining configs with the new dataset mixer
Update unit tests

HuggingFaceDocBuilderDev · 2025-07-24T02:50:27Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

lewtun · 2025-07-24T19:14:58Z

cc @BramVanroy I have removed your CPT recipe for now, as the mid-training dataset isn't compatible with this version of datasets (loading scripts are deprecated) and your SFT model seems to have been deleted. Happy to add it back once the artifacts are back online!

lewtun added 8 commits August 19, 2024 06:55

Fix unit test

675649e

Merge branch 'main' into fix-unit-test

fbe98e3

Fix chat template tests

338fbb1

Remove deprecated test

2fe3652

up

ca4fe27

Clean up

bc9f358

Refactor to the max

ccee178

Merge branch 'main' into upgrade

689cd18

lewtun added 14 commits July 24, 2025 02:50

foo

e5c9520

Fix

a741fd6

Make orpo work

cf54dc8

Fix CAI

1682e90

Fix README

aa48ecf

Fix slurm launcher

ee8a06a

Fix configs

3f738da

Fix smollm

2fd50c6

Fix smollm1 and smollm2

d8ebaec

Fix tests

ab38048

Fix stachat

92d880e

Fix gemma

868d306

Fix mixtral

9662842

Fix

c8cc45a

lewtun changed the title ~~[WIP] Upgrade data mixer, deps, and scripts~~ Upgrade data mixer, deps, and scripts Jul 24, 2025

lewtun and others added 4 commits July 24, 2025 18:28

Fix tests

7aec349

Update recipes with published checkpoints

e3dbdd2

Merge branch 'upgrade' of github.com:huggingface/alignment-handbook i…

dbaf637

…nto upgrade

Update news date in README

99bfe35

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

lewtun merged commit 19f0345 into main Jul 24, 2025
2 checks passed

lewtun deleted the upgrade branch July 24, 2025 19:18
