Commit 925acdf

Add instructions for the post-training steps (#222)
* Add instructions for the post-training steps
* Minor grammar and spaces corrections
* Specify GAS parameter to have the correct EBS on 1 node
Parent: 9fb76af

File tree

2 files changed: +16, -6 lines

recipes/smollm3/README.md

Lines changed: 15 additions & 5 deletions
````diff
@@ -1,10 +1,20 @@
-
 # Instructions to train SmolLM3-3B
 
-We are open-sourcing all the artifacts to train [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B). You can find the configuration files for the three post-training stages in the [`sft`](https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm3/sft) and [`dpo`](https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm3/dpo) directories.
-
-We are currently working on the code release, so this README will contain the instructions to run training after we release the code on the week of July 14, 2025.
+We are open-sourcing all the artifacts to train [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B). You can find the configuration files for the three post-training stages (mid-training, SFT, and DPO) in the [`sft`](https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm3/sft) and [`dpo`](https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm3/dpo) directories.
 
 ## Setup
 
-[WIP]
+Make sure you followed the installation instructions in the [README.md](README.md) file. We tested the training setup with 8 GPUs (80GB of VRAM) to train the full model.
+
+## Full training examples
+
+```shell
+# Step 1 - Mid-Training
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml scripts/sft.py --config recipes/smollm3/sft/mid.yaml --gradient_accumulation_steps 16
+
+# Step 2 - SFT
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml scripts/sft.py --config recipes/smollm3/sft/sft.yaml --gradient_accumulation_steps 16
+
+# Step 3 - DPO
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml scripts/dpo.py --config recipes/smollm3/dpo/apo.yaml --gradient_accumulation_steps 4
+```
````

recipes/smollm3/sft/sft.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,4 +1,4 @@
-# Config for 8 nodes
+# Config for 8 nodes with GBS 128
 # Model arguments
 model_name_or_path: HuggingFaceTB/SmolLM3-3B-checkpoints
 model_revision: it-mid-training
```
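The commit message says the GAS (gradient accumulation steps) parameter was chosen so that a single node reproduces the effective batch size (EBS) of the 8-node, GBS-128 configuration. A minimal sketch of that arithmetic, assuming a per-device micro-batch size of 1 (an assumption for illustration; the actual value lives in the recipe YAML files):

```python
# Sketch of effective/global batch size arithmetic.
# per_device_batch_size=1 is an assumption, not taken from the recipes.
def effective_batch_size(num_gpus: int, grad_accum_steps: int,
                         per_device_batch_size: int = 1) -> int:
    """EBS = number of GPUs x gradient accumulation steps x per-device micro-batch."""
    return num_gpus * grad_accum_steps * per_device_batch_size

# 1 node with 8 GPUs and --gradient_accumulation_steps 16:
print(effective_batch_size(num_gpus=8, grad_accum_steps=16))  # prints 128
```

Under this assumption, 8 GPUs x 16 accumulation steps gives 128, which lines up with the "GBS 128" comment added to `sft.yaml` above.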
