Commit c49571e

fix: update the instructions for multi-node setup; change the title f… (#78)
Signed-off-by: Parth Chadha <[email protected]>
1 parent 6a324e8 commit c49571e

1 file changed: +2 -4 lines changed

README.md

Lines changed: 2 additions & 4 deletions
@@ -1,4 +1,4 @@
-# Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from 1 GPU to 1000s, and from Tiny to >100B Parameters
+# Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to >100B Parameters, scaling from 1 GPU to 100s
 
 <!-- markdown all in one -->
 - [Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from 1 GPU to 1000s, and from Tiny to \>100B Parameters](#nemo-reinforcer-a-scalable-and-efficient-post-training-library-for-models-ranging-from-1-gpu-to-1000s-and-from-tiny-to-100b-parameters)
@@ -143,16 +143,14 @@ uv run python examples/run_grpo_math.py \
 
 #### Multi-node
 
-For the general multi-node setup, refer to the [SFT multi-node](#multi-node) documentation. The only thing that differs from SFT is the `COMMAND`:
-
 ```sh
 # Run from the root of NeMo-Reinforcer repo
 NUM_ACTOR_NODES=2
 # Add a timestamp to make each job name unique
 TIMESTAMP=$(date +%Y%m%d_%H%M%S)
 
 # grpo_math_8b uses Llama-3.1-8B-Instruct model
-COMMAND="uv pip install -e .; uv run ./examples/run_grpo_math.py --config examples/configs/grpo_math_8B.yaml cluster.num_nodes=2 checkpointing.checkpoint_dir='results/llama8b_2nodes' policy.train_global_batch_size=64 logger.wandb_enabled=True logger.wandb.name='grpo-llama8b_math'" \
+COMMAND="uv pip install -e .; uv run ./examples/run_grpo_math.py --config examples/configs/grpo_math_8B.yaml cluster.num_nodes=2 checkpointing.checkpoint_dir='results/llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='grpo-llama8b_math'" \
 RAY_DEDUP_LOGS=0 \
 UV_CACHE_DIR=YOUR_UV_CACHE_DIR \
 CONTAINER=YOUR_CONTAINER \
