docs/sphinx_doc/source/tutorial/faq.md (+1, -1)

@@ -96,7 +96,7 @@ ray start --head
 **A:** The following parameters may be helpful:

-- For trainer, adjust `actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu` when `actor_rollout_ref.actor.use_dynamic_bsz=false`; adjust `actor_rollout_ref.actor.ppo_max_token_len_per_gpu` and `actor_rollout_ref.actor.ulysses_sequence_parallel_size` when `actor_rollout_ref.actor.use_dynamic_bsz=true`.
+- For trainer, adjust `actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu` when `actor_rollout_ref.actor.use_dynamic_bsz=false`; adjust `actor_rollout_ref.actor.ppo_max_token_len_per_gpu` and `actor_rollout_ref.actor.ulysses_sequence_parallel_size` when `actor_rollout_ref.actor.use_dynamic_bsz=true`. Setting `actor_rollout_ref.actor.entropy_from_logits_with_chunking=true` may also help.
 - For explorer, adjust `explorer.rollout_model.tensor_parallel_size`,
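Taken together, the FAQ's suggestions correspond to config entries along these lines. This is only an illustrative sketch: the values are placeholders, and the nesting of `explorer.rollout_model` is inferred from the dotted parameter names rather than shown in this diff.

# Illustrative sketch of the knobs named in the FAQ answer; values are placeholders, not recommendations.
actor_rollout_ref:
  actor:
    use_dynamic_bsz: true                      # pack variable-length samples by token count
    ppo_max_token_len_per_gpu: 16384           # per-GPU token budget per forward pass (used when use_dynamic_bsz=true)
    ulysses_sequence_parallel_size: 2          # shard long sequences across GPUs (sequence parallelism)
    # ppo_micro_batch_size_per_gpu: 4          # used instead when use_dynamic_bsz=false
    entropy_from_logits_with_chunking: true    # chunked entropy computation, as suggested in the updated answer
explorer:
  rollout_model:
    tensor_parallel_size: 2                    # tensor-parallel degree of the rollout model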
docs/sphinx_doc/source/tutorial/trinity_configs.md (+13, -1)

@@ -443,8 +443,11 @@ actor_rollout_ref:
     ppo_epochs: 1
     shuffle: False
     ulysses_sequence_parallel_size: 1 # sp size
+    entropy_from_logits_with_chunking: false
+    entropy_checkpointing: false
     checkpoint:
-      contents: ['model', 'hf_model', 'optimizer', 'extra'] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space
+      load_contents: ['model', 'optimizer', 'extra']
+      save_contents: ['model', 'optimizer', 'extra']
     optim:
       lr: 1e-6
       lr_warmup_steps_ratio: 0. # the total steps will be injected during runtime

@@ -458,17 +461,22 @@ actor_rollout_ref:
       param_offload: False
       optimizer_offload: False
       fsdp_size: -1
+      forward_prefetch: False
   ref:
     fsdp_config:
       param_offload: False
       wrap_policy:
         # transformer_layer_cls_to_wrap: None
         min_num_params: 0
+      fsdp_size: -1
+      forward_prefetch: False
     # log_prob_micro_batch_size: 4 # will be deprecated, use log_prob_micro_batch_size_per_gpu
Further down in the same file, the parameter descriptions read:

- `actor_rollout_ref.actor.use_dynamic_bsz`: Whether to reorganize the batch by packing shorter samples together, reducing the effective batch size during training.
- `actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu`: Batch size for one GPU in one forward pass.
- `actor_rollout_ref.actor.checkpoint`: Contents to be loaded and saved. With 'hf_model' you can save the whole model in HF format; currently only the sharded model checkpoint is used, to save space (see the sketch after this list).
- `actor_rollout_ref.actor.optim.lr`: Learning rate for the actor model.
- `actor_rollout_ref.actor.optim.lr_warmup_steps_ratio`: Ratio of warmup steps for the learning rate.
- `actor_rollout_ref.actor.optim.warmup_style`: Warmup style for the learning rate.
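To make the checkpoint options concrete, a minimal sketch of the load/save lists follows. The values are illustrative, and placing 'hf_model' in `save_contents` is an assumption carried over from the old `contents` field rather than something this diff shows directly.

actor_rollout_ref:
  actor:
    checkpoint:
      load_contents: ['model', 'optimizer', 'extra']   # what to restore when resuming
      save_contents: ['model', 'optimizer', 'extra']   # default: sharded checkpoint only, to save space
      # To also export the whole model in HF format (assumption: 'hf_model' is accepted here
      # as it was in the old `contents` list):
      # save_contents: ['model', 'hf_model', 'optimizer', 'extra']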
examples/ppo_countdown/train_countdown.yaml (+2, -1)

@@ -15,7 +15,8 @@ actor_rollout_ref:
     shuffle: False
     ulysses_sequence_parallel_size: 1 # sp size
     checkpoint:
-      contents: ['model', 'hf_model', 'optimizer', 'extra'] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space
+      load_contents: ['model', 'hf_model', 'optimizer', 'extra'] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space
tests/template/verl_config.yaml (+12, -2)

@@ -15,8 +15,11 @@ actor_rollout_ref:
     ppo_epochs: 1
     shuffle: False
     ulysses_sequence_parallel_size: 1 # sp size
+    entropy_from_logits_with_chunking: false
+    entropy_checkpointing: false
     checkpoint:
-      contents: ["model", "optimizer", "extra"] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space
+      load_contents: ['model', 'optimizer', 'extra'] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space
+      save_contents: ['model', 'optimizer', 'extra']
     optim:
       lr: 1e-6
       lr_warmup_steps_ratio: 0. # the total steps will be injected during runtime

...
-      contents: ["model", "optimizer", "extra"] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space
+      load_contents: ['model', 'optimizer', 'extra'] # with 'hf_model' you can save whole model as hf format, now only use sharded model checkpoint to save space