Commit ebfa9e2

fix: nightlies using v1 can't use model_save_format=safetensors (#1226)
Signed-off-by: Terry Kong <terryk@nvidia.com>
1 parent 0dca729 commit ebfa9e2

3 files changed: 3 additions, 0 deletions

examples/configs/recipes/llm/grpo-deepscaler-1.5b-8K.yaml

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ loss_fn:
   reference_policy_kl_penalty: 0.0
 checkpointing:
   keep_top_k: 10
+  model_save_format: null
 policy:
   model_name: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
   train_global_batch_size: 64

examples/configs/recipes/llm/grpo-gemma3-27b-it-8n8g-fsdp2tp8-actckpt-long.yaml

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ grpo:
   max_num_steps: 20
 checkpointing:
   checkpoint_dir: results/grpo-gemma3-27b-it-8n8g-fsdp2tp8sp-actckpt-long
+  model_save_format: null
 policy:
   model_name: google/gemma-3-27b-it
   tokenizer:

examples/configs/recipes/llm/grpo-gspo-deepscaler-1.5b-8K.yaml

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ loss_fn:
   token_level_loss: false
 checkpointing:
   keep_top_k: 10
+  model_save_format: null
 policy:
   model_name: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
   train_global_batch_size: 64
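
The three hunks above each add an explicit `model_save_format: null` under `checkpointing`. One plausible reading, given the commit title, is that these recipes inherit a `safetensors` default from a base config and the explicit null overrides it. The sketch below illustrates that override behavior only. The `merge_config` helper and the `base` dictionary are hypothetical and are not taken from the repository; the actual config loader may merge differently.

```python
from copy import deepcopy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    Override values win, including an explicit None (YAML null).
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical base defaults (assumption: the inherited default is "safetensors").
base = {"checkpointing": {"enabled": True, "model_save_format": "safetensors"}}

# Recipe override mirroring the line added in this commit (YAML null -> Python None).
recipe = {"checkpointing": {"keep_top_k": 10, "model_save_format": None}}

effective = merge_config(base, recipe)
print(effective["checkpointing"])
```

Under this assumed merge, the explicit null replaces the inherited value rather than being ignored, which is why the key has to be written out in each recipe instead of simply being left unset.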
