Commit 1c371a9

perf: perf script change for qwen30b-a3b (#1526)
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>
1 parent 08534fe commit 1c371a9

2 files changed: +6 -6 lines changed

examples/configs/recipes/llm/performance/grpo-qwen3-30ba3b-4n8g-async-1off.yaml

Lines changed: 4 additions & 4 deletions
@@ -10,10 +10,10 @@ checkpointing:
  checkpoint_dir: results/grpo-qwen3-30ba3b-4n8g-async-1off
  policy:
  megatron_cfg:
- tensor_model_parallel_size: 2
- pipeline_model_parallel_size: 1
+ tensor_model_parallel_size: 1
+ pipeline_model_parallel_size: 2
  expert_model_parallel_size: 8
- sequence_parallel: true
+ sequence_parallel: false
  generation:
  colocated:
  enabled: false
@@ -22,7 +22,7 @@ policy:
  gpus_per_node: 8
  vllm_cfg:
  async_engine: true
- tensor_parallel_size: 4
+ tensor_parallel_size: 2
  gpu_memory_utilization: 0.8
  logger:
  log_dir: logs/grpo-qwen3-30ba3b-4n8g-2T2G-async-1off
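
The `tensor_parallel_size` change in `vllm_cfg` can be read as a change in how many generation engines fit on one node. The sketch below is only arithmetic under stated assumptions: that each vLLM engine occupies exactly `tensor_parallel_size` GPUs, that engines do not span nodes, and that `gpus_per_node: 8` describes the generation nodes. The helper name `vllm_replicas_per_node` is hypothetical and is not part of vLLM or of this repository.

```python
def vllm_replicas_per_node(gpus_per_node: int, tensor_parallel_size: int) -> int:
    """Number of engines that fit on one node (hypothetical helper).

    Assumes one engine uses exactly `tensor_parallel_size` GPUs and that
    an engine never spans more than one node.
    """
    if gpus_per_node % tensor_parallel_size != 0:
        raise ValueError("gpus_per_node must be divisible by tensor_parallel_size")
    return gpus_per_node // tensor_parallel_size


# Values taken from the diff: gpus_per_node is 8, tensor_parallel_size goes 4 -> 2.
before = vllm_replicas_per_node(gpus_per_node=8, tensor_parallel_size=4)
after = vllm_replicas_per_node(gpus_per_node=8, tensor_parallel_size=2)
print(f"engines per node: {before} -> {after}")
```

Under these assumptions, halving the tensor-parallel size doubles the number of independent engines per node. Whether that improves throughput depends on the workload and is not something the diff itself states.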

examples/configs/recipes/llm/performance/grpo-qwen3-30ba3b-4n8g.yaml

Lines changed: 2 additions & 2 deletions
@@ -17,10 +17,10 @@ policy:
  megatron_cfg:
  enabled: true
  empty_unused_memory_level: 1
- tensor_model_parallel_size: 2
+ tensor_model_parallel_size: 1
  pipeline_model_parallel_size: 1
  expert_model_parallel_size: 8
- sequence_parallel: true
+ sequence_parallel: false
  optimizer:
  lr: 3.0e-07
  min_lr: 3.0e-08
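
The two changed keys in this file are related: sequence parallelism shards activations across the tensor-parallel group, so it has nothing to shard once `tensor_model_parallel_size` is 1. The sketch below works through the resulting data-parallel size using the usual relation world size = TP × PP × CP × DP. It assumes that "4n8g" in the file name means 4 nodes × 8 GPUs and that all 32 GPUs are used for training, which the diff does not confirm. It also ignores expert parallelism, and the helper names are hypothetical.

```python
def data_parallel_size(world_size: int, tp: int, pp: int, cp: int = 1) -> int:
    """Ranks left for data parallelism once TP, PP and CP are factored out."""
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be divisible by tp * pp * cp")
    return world_size // model_parallel


def sequence_parallel_applies(tp: int, sequence_parallel: bool) -> bool:
    """Sequence parallelism only has an effect when the TP group has more than one rank."""
    return sequence_parallel and tp > 1


# Assumption: "4n8g" means 4 nodes x 8 GPUs, all used for training.
WORLD_SIZE = 4 * 8

before = data_parallel_size(WORLD_SIZE, tp=2, pp=1)
after = data_parallel_size(WORLD_SIZE, tp=1, pp=1)
print(f"data-parallel size: {before} -> {after}")
print("sequence_parallel applies before:", sequence_parallel_applies(2, True))
print("sequence_parallel applies after: ", sequence_parallel_applies(1, False))
```

With these assumed numbers, the data-parallel size goes from 16 to 32 while `expert_model_parallel_size` stays at 8.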
