.github/PULL_REQUEST_TEMPLATE.md
3 additions & 3 deletions
@@ -15,10 +15,10 @@ List issues that this PR closes ([syntax](https://docs.github.com/en/issues/trac
 
 # Before your PR is "Ready for review"
 **Pre checks**:
-- [ ] Make sure you read and followed [Contributor guidelines](/NVIDIA/reinforcer/blob/main/CONTRIBUTING.md)
+- [ ] Make sure you read and followed [Contributor guidelines](/NVIDIA/nemo-rl/blob/main/CONTRIBUTING.md)
 - [ ] Did you write any new necessary tests?
-- [ ] Did you run the unit tests and functional tests locally? Visit our [Testing Guide](/NVIDIA/reinforcer/blob/main/docs/testing.md) for how to run tests
-- [ ] Did you add or update any necessary documentation? Visit our [Document Development Guide](/NVIDIA/reinforcer/blob/main/docs/documentation.md) for how to write, build and test the docs.
+- [ ] Did you run the unit tests and functional tests locally? Visit our [Testing Guide](/NVIDIA/nemo-rl/blob/main/docs/testing.md) for how to run tests
+- [ ] Did you add or update any necessary documentation? Visit our [Document Development Guide](/NVIDIA/nemo-rl/blob/main/docs/documentation.md) for how to write, build and test the docs.
-# Run the container with your local nemo-reinforcer directory mounted
-docker run -it --gpus all -v /path/to/nemo-reinforcer:/workspace/nemo-reinforcer nemo-reinforcer
+docker buildx build -t nemo-rl -f Dockerfile .
+# Run the container with your local nemo-rl directory mounted
+docker run -it --gpus all -v /path/to/nemo-rl:/workspace/nemo-rl nemo-rl
 ```
 
 ## Making Changes
@@ -19,7 +19,7 @@ docker run -it --gpus all -v /path/to/nemo-reinforcer:/workspace/nemo-reinforcer
 
 #### Before You Start: Install pre-commit
 
-From the [`nemo-reinforcer` root directory](.), run:
+From the [`nemo-rl` root directory](.), run:
 ```bash
 python3 -m pip install pre-commit
 pre-commit install
@@ -31,8 +31,8 @@ We follow a direct clone and branch workflow for now:
 
 1. Clone the repository directly:
 ```bash
-git clone https://github.com/NVIDIA/reinforcer
-cd reinforcer
+git clone https://github.com/NVIDIA/nemo-rl
+cd nemo-rl
 ```
 
 2. Create a new branch for your changes:
@@ -69,7 +69,7 @@ This ensures that all significant changes are well-thought-out and properly docu
 1. **User Adoption**: Helps users understand how to effectively use the library's features in their projects
 2. **Developer Extensibility**: Enables developers to understand the internal architecture and implementation details, making it easier to modify, extend, or adapt the code for their specific use cases
 
-Quality documentation is essential for both the usability of Nemo-Reinforcer and its ability to be customized by the community.
+Quality documentation is essential for both the usability of Nemo-RL and its ability to be customized by the community.
README.md
9 additions & 9 deletions
@@ -1,7 +1,7 @@
-# Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to >100B Parameters, scaling from 1 GPU to 100s
+# Nemo-RL: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to >100B Parameters, scaling from 1 GPU to 100s
 
 <!-- markdown all in one -->
-- [Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to \>100B Parameters, scaling from 1 GPU to 100s](#nemo-reinforcer-a-scalable-and-efficient-post-training-library-for-models-ranging-from-tiny-to-100b-parameters-scaling-from-1-gpu-to-100s)
+- [Nemo-RL: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to \>100B Parameters, scaling from 1 GPU to 100s](#nemo-rl-a-scalable-and-efficient-post-training-library-for-models-ranging-from-tiny-to-100b-parameters-scaling-from-1-gpu-to-100s)
 - [Features](#features)
 - [Prerequisuites](#prerequisuites)
 - [Quick start](#quick-start)
@@ -17,7 +17,7 @@
 - [Multi-node](#multi-node-2)
 - [Cluster Start](#cluster-start)
 
-**Nemo-Reinforcer** is a scalable and efficient post-training library designed for models ranging from 1 GPU to thousands, and from tiny to over 100 billion parameters.
+**Nemo-RL** is a scalable and efficient post-training library designed for models ranging from 1 GPU to thousands, and from tiny to over 100 billion parameters.
 
 What you can expect:
 
@@ -52,8 +52,8 @@ What you can expect:
 
 Clone **NeMo RL**
 ```sh
-git clone git@github.com:NVIDIA/reinforcer.git
-cd reinforcer
+git clone git@github.com:NVIDIA/nemo-rl.git
+cd nemo-rl
 ```
 
 Install `uv`
@@ -111,7 +111,7 @@ uv run python examples/run_grpo_math.py \
 #### Multi-node
 
 ```sh
-# Run from the root of NeMo-Reinforcer repo
+# Run from the root of NeMo-RL repo
 NUM_ACTOR_NODES=2
 
 # grpo_math_8b uses Llama-3.1-8B-Instruct model
@@ -131,7 +131,7 @@ sbatch \
 ##### GRPO Qwen2.5-32B
 
 ```sh
-# Run from the root of NeMo-Reinforcer repo
+# Run from the root of NeMo-RL repo
 NUM_ACTOR_NODES=16
 
 # Download Qwen before the job starts to avoid spending time downloading during the training loop
@@ -187,7 +187,7 @@ Refer to `examples/configs/sft.yaml` for a full list of parameters that can be o
docs/adding-new-models.md
2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Adding New Models
 
-This guide outlines how to integrate and validate a new model within **NeMo-Reinforcer**. Each new model must pass a standard set of compatibility tests before being considered ready to be used in RL pipelines.
+This guide outlines how to integrate and validate a new model within **NeMo-RL**. Each new model must pass a standard set of compatibility tests before being considered ready to be used in RL pipelines.
 
 ## Importance of Log Probability Consistency in Training and Inference
 
@@ -120,4 +120,4 @@ When validating your model, you should analyze the results across different conf
 
 ---
 
-By following these validation steps and ensuring your model's outputs remain consistent across backends, you can confirm that your new model meets **NeMo-Reinforcer**'s requirements.
+By following these validation steps and ensuring your model's outputs remain consistent across backends, you can confirm that your new model meets **NeMo-RL**'s requirements.
-A key advantage of running interactively on the head node is the ability to execute multiple multi-node jobs without needing to requeue in the SLURM job queue. This means during debugging sessions, you can avoid submitting a new `sbatch` command each time and instead debug and re-submit your Reinforcer job directly from the interactive session.
+A key advantage of running interactively on the head node is the ability to execute multiple multi-node jobs without needing to requeue in the SLURM job queue. This means during debugging sessions, you can avoid submitting a new `sbatch` command each time and instead debug and re-submit your NeMo-RL job directly from the interactive session.
 :::
 
 To run interactively, launch the same command as the [Batched Job Submission](#batched-job-submission) except omit the `COMMAND` line:
 ```sh
-# Run from the root of NeMo-Reinforcer repo
+# Run from the root of NeMo-RL repo
 NUM_ACTOR_NODES=1 # Total nodes requested (head is colocated on ray-worker-0)