Commit 0fae6bc

feat: Updated Name to NeMo RL (#265)
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
1 parent 34cae3a commit 0fae6bc

137 files changed, with 522 additions and 521 deletions.

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 3 additions & 3 deletions
@@ -15,10 +15,10 @@ List issues that this PR closes ([syntax](https://docs.github.com/en/issues/trac

  # Before your PR is "Ready for review"
  **Pre checks**:
- - [ ] Make sure you read and followed [Contributor guidelines](/NVIDIA/reinforcer/blob/main/CONTRIBUTING.md)
+ - [ ] Make sure you read and followed [Contributor guidelines](/NVIDIA/nemo-rl/blob/main/CONTRIBUTING.md)
  - [ ] Did you write any new necessary tests?
- - [ ] Did you run the unit tests and functional tests locally? Visit our [Testing Guide](/NVIDIA/reinforcer/blob/main/docs/testing.md) for how to run tests
- - [ ] Did you add or update any necessary documentation? Visit our [Document Development Guide](/NVIDIA/reinforcer/blob/main/docs/documentation.md) for how to write, build and test the docs.
+ - [ ] Did you run the unit tests and functional tests locally? Visit our [Testing Guide](/NVIDIA/nemo-rl/blob/main/docs/testing.md) for how to run tests
+ - [ ] Did you add or update any necessary documentation? Visit our [Document Development Guide](/NVIDIA/nemo-rl/blob/main/docs/documentation.md) for how to write, build and test the docs.

  # Additional Information
  * ...

.github/workflows/_run_test.yml

Lines changed: 12 additions & 12 deletions
@@ -68,7 +68,7 @@ jobs:

  - name: Docker pull image
  run: |
- docker pull nemoci.azurecr.io/nemo_reinforcer_container:${{ github.run_id }}
+ docker pull nemoci.azurecr.io/nemo_rl_container:${{ github.run_id }}

  - name: Checkout repository
  uses: actions/checkout@v4
@@ -80,22 +80,22 @@ jobs:
  docker run --rm -u root -d --name nemo_container_${{ github.run_id }} --runtime=nvidia --gpus all --shm-size=64g \
  --env TRANSFORMERS_OFFLINE=0 \
  --env HYDRA_FULL_ERROR=1 \
- --env HF_HOME=/home/TestData/reinforcer/hf_home \
- --env HF_DATASETS_CACHE=/home/TestData/reinforcer/hf_datasets_cache \
- --env REINFORCER_REPO_DIR=/opt/reinforcer \
+ --env HF_HOME=/home/TestData/nemo-rl/hf_home \
+ --env HF_DATASETS_CACHE=/home/TestData/nemo-rl/hf_datasets_cache \
+ --env NEMO_RL_REPO_DIR=/opt/nemo-rl \
  --env HF_TOKEN \
- --volume $GITHUB_WORKSPACE:/opt/reinforcer \
+ --volume $GITHUB_WORKSPACE:/opt/nemo-rl \
  --volume $GITHUB_ACTION_DIR:$GITHUB_ACTION_DIR \
- --volume /mnt/datadrive/TestData/reinforcer/datasets:/opt/reinforcer/datasets:ro \
- --volume /mnt/datadrive/TestData/reinforcer/checkpoints:/home/TestData/reinforcer/checkpoints:ro \
- --volume /mnt/datadrive/TestData/reinforcer/hf_home/hub:/home/TestData/reinforcer/hf_home/hub \
- --volume /mnt/datadrive/TestData/reinforcer/hf_datasets_cache:/home/TestData/reinforcer/hf_datasets_cache \
- nemoci.azurecr.io/nemo_reinforcer_container:${{ github.run_id }} \
+ --volume /mnt/datadrive/TestData/nemo-rl/datasets:/opt/nemo-rl/datasets:ro \
+ --volume /mnt/datadrive/TestData/nemo-rl/checkpoints:/home/TestData/nemo-rl/checkpoints:ro \
+ --volume /mnt/datadrive/TestData/nemo-rl/hf_home/hub:/home/TestData/nemo-rl/hf_home/hub \
+ --volume /mnt/datadrive/TestData/nemo-rl/hf_datasets_cache:/home/TestData/nemo-rl/hf_datasets_cache \
+ nemoci.azurecr.io/nemo_rl_container:${{ github.run_id }} \
  bash -c "sleep $(( ${{ inputs.TIMEOUT }} * 60 + 60 ))"

  - name: Run unit tests
  run: |
- docker exec nemo_container_${{ github.run_id }} git config --global --add safe.directory /opt/reinforcer
+ docker exec nemo_container_${{ github.run_id }} git config --global --add safe.directory /opt/nemo-rl
  docker exec nemo_container_${{ github.run_id }} bash -eux -o pipefail -c "
  # This is needed since we create virtualenvs in the workspace, so this allows it to be cleaned up if necessary
  umask 000
@@ -141,6 +141,6 @@ jobs:
  if: always()
  run: |
  # Ensure any added files in the mounted directory are owned by the runner user to allow it to clean up
- docker exec nemo_container_${{ github.run_id }} bash -c "find /opt/reinforcer -path '/opt/reinforcer/datasets' -prune -o -exec chown $(id -u):$(id -g) {} +"
+ docker exec nemo_container_${{ github.run_id }} bash -c "find /opt/nemo-rl -path '/opt/nemo-rl/datasets' -prune -o -exec chown $(id -u):$(id -g) {} +"
  docker container stop nemo_container_${{ github.run_id }} || true
  docker container rm nemo_container_${{ github.run_id }} || true
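
Reviewer note: the cleanup step in the last hunk uses `find`'s `-prune` action to skip the read-only `datasets` mount while changing ownership of everything else. Below is a minimal sketch of that pattern on a throwaway directory. The directory layout and the `visited` variable are illustrative assumptions, and `-print` stands in for the workflow's `-exec chown ... {} +` so the example runs without special permissions.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway tree standing in for /opt/nemo-rl (hypothetical layout).
root="$(mktemp -d)"
mkdir -p "$root/datasets" "$root/src"
touch "$root/datasets/a.bin" "$root/src/b.py"

# "-path X -prune" matches the datasets directory and stops descent into it.
# Because of "-o", the right-hand action runs only for paths that did not match,
# so neither the datasets directory nor anything under it is listed.
visited="$(find "$root" -path "$root/datasets" -prune -o -print)"

echo "$visited"
rm -rf "$root"
```

Replacing `-print` with `-exec chown "$(id -u):$(id -g)" {} +` gives the shape of the command in the workflow.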

.github/workflows/cicd-main.yml

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
1212
# See the License for the specific language governing permissions and
1313
# limitations under the License.
14-
name: "CICD Reinforcer"
14+
name: "CICD NeMo RL"
1515

1616
on:
1717
pull_request:
@@ -136,12 +136,12 @@ jobs:
136136
uses: NVIDIA/NeMo-FW-CI-templates/.github/workflows/_build_container.yml@v0.22.7
137137
with:
138138
build-ref: ${{ github.sha }}
139-
image-name: nemo_reinforcer_container
139+
image-name: nemo_rl_container
140140
dockerfile: docker/Dockerfile
141-
image-label: nemo-reinforcer
141+
image-label: nemo-rl
142142
build-args: |
143143
MAX_JOBS=32
144-
REINFORCER_COMMIT=${{ github.sha }}
144+
NEMO_RL_COMMIT=${{ github.sha }}
145145
146146
tests:
147147
name: Tests
@@ -152,21 +152,21 @@ jobs:
152152
RUNNER: self-hosted-azure
153153
TIMEOUT: 60
154154
UNIT_TEST_SCRIPT: |
155-
cd /opt/reinforcer
155+
cd /opt/nemo-rl
156156
if [[ "${{ needs.pre-flight.outputs.test_level }}" =~ ^(L0|L1|L2)$ ]]; then
157157
uv run --no-sync bash -x ./tests/run_unit.sh
158158
else
159159
echo Skipping unit tests for docs-only level
160160
fi
161161
DOC_TEST_SCRIPT: |
162-
cd /opt/reinforcer/docs
162+
cd /opt/nemo-rl/docs
163163
if [[ "${{ needs.pre-flight.outputs.test_level }}" =~ ^(docs|L0|L1|L2)$ ]]; then
164164
uv run --no-sync sphinx-build -b doctest . _build/doctest
165165
else
166166
echo Skipping doc tests for level ${{ needs.pre-flight.outputs.test_level }}
167167
fi
168168
FUNCTIONAL_TEST_SCRIPT: |
169-
cd /opt/reinforcer
169+
cd /opt/nemo-rl
170170
if [[ "${{ needs.pre-flight.outputs.test_level }}" =~ ^(L1|L2)$ ]]; then
171171
uv run --no-sync bash ./tests/functional/sft.sh
172172
uv run --no-sync bash ./tests/functional/grpo.sh
@@ -177,7 +177,7 @@ jobs:
177177
fi
178178
# TODO: enable once we have convergence tests in CI
179179
#CONVERGENCE_TEST_SCRIPT: |
180-
# cd /opt/reinforcer
180+
# cd /opt/nemo-rl
181181
# if [[ "${{ needs.pre-flight.outputs.test_level }}" =~ ^(L2)$ ]]; then
182182
# echo "Running convergence tests"
183183
# # Add your convergence test commands here
@@ -186,7 +186,7 @@ jobs:
186186
# echo "Skipping convergence tests for level ${{ needs.pre-flight.outputs.test_level }}"
187187
# fi
188188
AFTER_SCRIPT: |
189-
cd /opt/reinforcer
189+
cd /opt/nemo-rl
190190
cat <<EOF | tee -a $GITHUB_STEP_SUMMARY
191191
# Test Summary for level: ${{ needs.pre-flight.outputs.test_level }}
192192
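
Reviewer note: the three active test scripts in this workflow gate on `test_level` with anchored regexes that the rename leaves unchanged. Below is a rough sketch of which suites each level selects. The regexes are copied from the hunks above, while the `suites_for_level` helper and its output format are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the suites that a given test level would run,
# using the same anchored patterns as the workflow's *_TEST_SCRIPT blocks.
suites_for_level() {
  local level="$1"
  local suites=()
  # DOC_TEST_SCRIPT gate
  [[ "$level" =~ ^(docs|L0|L1|L2)$ ]] && suites+=("doc")
  # UNIT_TEST_SCRIPT gate
  [[ "$level" =~ ^(L0|L1|L2)$ ]] && suites+=("unit")
  # FUNCTIONAL_TEST_SCRIPT gate
  [[ "$level" =~ ^(L1|L2)$ ]] && suites+=("functional")
  # Join the selected suites with single spaces (empty line if none matched).
  echo "${suites[*]}"
}

suites_for_level docs   # prints: doc
suites_for_level L1     # prints: doc unit functional
```

Because the patterns are anchored with `^` and `$`, a value such as `L10` or an empty string selects no suite at all.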

.github/workflows/release-freeze.yml

Lines changed: 2 additions & 2 deletions
@@ -36,8 +36,8 @@ jobs:
  code-freeze:
  uses: NVIDIA/NeMo-FW-CI-templates/.github/workflows/_code_freeze.yml@v0.22.5
  with:
- library-name: NeMo-reinforcer
- python-package: nemo_reinforcer
+ library-name: NeMo-RL
+ python-package: nemo_rl
  release-type: ${{ inputs.release-type }}
  freeze-commit: ${{ inputs.freeze-commit }}
  dry-run: ${{ inputs.dry-run }}

.github/workflows/release.yaml

Lines changed: 3 additions & 3 deletions
@@ -11,7 +11,7 @@
  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  # See the License for the specific language governing permissions and
  # limitations under the License.
- name: "Release Reinforcer"
+ name: "Release NeMo-RL"

  on:
  workflow_dispatch:
@@ -35,9 +35,9 @@ jobs:
  uses: NVIDIA/NeMo-FW-CI-templates/.github/workflows/_release_library.yml@v0.22.6
  with:
  release-ref: ${{ inputs.release-ref }}
- python-package: nemo_reinforcer
+ python-package: nemo_rl
  python-version: "3.11"
- library-name: NeMo-Reinforcer
+ library-name: NeMo-RL
  dry-run: ${{ inputs.dry-run }}
  version-bump-branch: ${{ inputs.version-bump-branch }}
  secrets:

CONTRIBUTING.md

Lines changed: 9 additions & 9 deletions
@@ -1,16 +1,16 @@
- # Contributing To Nemo-Reinforcer
+ # Contributing To Nemo-RL

- Thanks for your interest in contributing to Nemo-Reinforcer!
+ Thanks for your interest in contributing to Nemo-RL!

  ## Setting Up

  ### Development Environment

  1. **Build and run the Docker container**:
  ```bash
- docker buildx build -t nemo-reinforcer -f Dockerfile .
- # Run the container with your local nemo-reinforcer directory mounted
- docker run -it --gpus all -v /path/to/nemo-reinforcer:/workspace/nemo-reinforcer nemo-reinforcer
+ docker buildx build -t nemo-rl -f Dockerfile .
+ # Run the container with your local nemo-rl directory mounted
+ docker run -it --gpus all -v /path/to/nemo-rl:/workspace/nemo-rl nemo-rl
  ```

  ## Making Changes
@@ -19,7 +19,7 @@ docker run -it --gpus all -v /path/to/nemo-reinforcer:/workspace/nemo-reinforcer

  #### Before You Start: Install pre-commit

- From the [`nemo-reinforcer` root directory](.), run:
+ From the [`nemo-rl` root directory](.), run:
  ```bash
  python3 -m pip install pre-commit
  pre-commit install
@@ -31,8 +31,8 @@ We follow a direct clone and branch workflow for now:

  1. Clone the repository directly:
  ```bash
- git clone https://github.com/NVIDIA/reinforcer
- cd reinforcer
+ git clone https://github.com/NVIDIA/nemo-rl
+ cd nemo-rl
  ```

  2. Create a new branch for your changes:
@@ -69,7 +69,7 @@ This ensures that all significant changes are well-thought-out and properly docu
  1. **User Adoption**: Helps users understand how to effectively use the library's features in their projects
  2. **Developer Extensibility**: Enables developers to understand the internal architecture and implementation details, making it easier to modify, extend, or adapt the code for their specific use cases

- Quality documentation is essential for both the usability of Nemo-Reinforcer and its ability to be customized by the community.
+ Quality documentation is essential for both the usability of Nemo-RL and its ability to be customized by the community.

  ## Code Quality


README.md

Lines changed: 9 additions & 9 deletions
@@ -1,7 +1,7 @@
- # Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to >100B Parameters, scaling from 1 GPU to 100s
+ # Nemo-RL: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to >100B Parameters, scaling from 1 GPU to 100s

  <!-- markdown all in one -->
- - [Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to \>100B Parameters, scaling from 1 GPU to 100s](#nemo-reinforcer-a-scalable-and-efficient-post-training-library-for-models-ranging-from-tiny-to-100b-parameters-scaling-from-1-gpu-to-100s)
+ - [Nemo-RL: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to \>100B Parameters, scaling from 1 GPU to 100s](#nemo-rl-a-scalable-and-efficient-post-training-library-for-models-ranging-from-tiny-to-100b-parameters-scaling-from-1-gpu-to-100s)
  - [Features](#features)
  - [Prerequisuites](#prerequisuites)
  - [Quick start](#quick-start)
@@ -17,7 +17,7 @@
  - [Multi-node](#multi-node-2)
  - [Cluster Start](#cluster-start)

- **Nemo-Reinforcer** is a scalable and efficient post-training library designed for models ranging from 1 GPU to thousands, and from tiny to over 100 billion parameters.
+ **Nemo-RL** is a scalable and efficient post-training library designed for models ranging from 1 GPU to thousands, and from tiny to over 100 billion parameters.

  What you can expect:

@@ -52,8 +52,8 @@ What you can expect:

  Clone **NeMo RL**
  ```sh
- git clone git@github.com:NVIDIA/reinforcer.git
- cd reinforcer
+ git clone git@github.com:NVIDIA/nemo-rl.git
+ cd nemo-rl
  ```

  Install `uv`
@@ -111,7 +111,7 @@ uv run python examples/run_grpo_math.py \
  #### Multi-node

  ```sh
- # Run from the root of NeMo-Reinforcer repo
+ # Run from the root of NeMo-RL repo
  NUM_ACTOR_NODES=2

  # grpo_math_8b uses Llama-3.1-8B-Instruct model
@@ -131,7 +131,7 @@ sbatch \
  ##### GRPO Qwen2.5-32B

  ```sh
- # Run from the root of NeMo-Reinforcer repo
+ # Run from the root of NeMo-RL repo
  NUM_ACTOR_NODES=16

  # Download Qwen before the job starts to avoid spending time downloading during the training loop
@@ -187,7 +187,7 @@ Refer to `examples/configs/sft.yaml` for a full list of parameters that can be o
  #### Multi-node

  ```sh
- # Run from the root of NeMo-Reinforcer repo
+ # Run from the root of NeMo-RL repo
  NUM_ACTOR_NODES=2

  COMMAND="uv run ./examples/run_sft.py --config examples/configs/sft.yaml cluster.num_nodes=2 cluster.gpus_per_node=8 checkpointing.checkpoint_dir='results/sft_llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='sft-llama8b'" \
@@ -244,7 +244,7 @@ Refer to [dpo.yaml](examples/configs/dpo.yaml) for a full list of parameters tha
  For distributed DPO training across multiple nodes, modify the following script for your use case:

  ```sh
- # Run from the root of NeMo-Reinforcer repo
+ # Run from the root of NeMo-RL repo
  ## number of nodes to use for your job
  NUM_ACTOR_NODES=2


docker/Dockerfile

Lines changed: 4 additions & 4 deletions
@@ -22,12 +22,12 @@ WORKDIR /opt/reinforcer
  # First copy only the dependency files
  COPY --chown=ray --chmod=755 pyproject.toml uv.lock ./

- ENV UV_PROJECT_ENVIRONMENT=/opt/reinforcer_venv
- ENV VIRTUAL_ENV=/opt/reinforcer_venv
+ ENV UV_PROJECT_ENVIRONMENT=/opt/nemo_rl_venv
+ ENV VIRTUAL_ENV=/opt/nemo_rl_venv

  # Create and activate virtual environment
  RUN <<"EOF"
- uv venv /opt/reinforcer_venv
+ uv venv /opt/nemo_rl_venv
  # uv sync has a more reliable resolver than simple uv pip install which can fail

  # Sync each training + inference backend one at a time (since they may conflict)
@@ -38,7 +38,7 @@ uv sync --locked --extra vllm --no-install-project
  uv sync --locked --all-groups --no-install-project
  EOF

- ENV PATH="/opt/reinforcer_venv/bin:$PATH"
+ ENV PATH="/opt/nemo_rl_venv/bin:$PATH"

  # The ray images automatically activate the anaconda venv. We will
  # comment this out of the .bashrc to give the same UX between docker

docs/adding-new-models.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
  # Adding New Models

- This guide outlines how to integrate and validate a new model within **NeMo-Reinforcer**. Each new model must pass a standard set of compatibility tests before being considered ready to be used in RL pipelines.
+ This guide outlines how to integrate and validate a new model within **NeMo-RL**. Each new model must pass a standard set of compatibility tests before being considered ready to be used in RL pipelines.

  ## Importance of Log Probability Consistency in Training and Inference

@@ -120,4 +120,4 @@ When validating your model, you should analyze the results across different conf

  ---

- By following these validation steps and ensuring your model's outputs remain consistent across backends, you can confirm that your new model meets **NeMo-Reinforcer**'s requirements.
+ By following these validation steps and ensuring your model's outputs remain consistent across backends, you can confirm that your new model meets **NeMo-RL**'s requirements.

docs/cluster.md

Lines changed: 3 additions & 3 deletions
@@ -12,7 +12,7 @@
  ### Batched Job Submission

  ```sh
- # Run from the root of NeMo-Reinforcer repo
+ # Run from the root of NeMo-RL repo
  NUM_ACTOR_NODES=1 # Total nodes requested (head is colocated on ray-worker-0)

  COMMAND="uv run ./examples/run_grpo_math.py" \
@@ -43,12 +43,12 @@ tail -f 1980204-logs/ray-driver.log
  ### Interactive Launching

  :::{tip}
- A key advantage of running interactively on the head node is the ability to execute multiple multi-node jobs without needing to requeue in the SLURM job queue. This means during debugging sessions, you can avoid submitting a new `sbatch` command each time and instead debug and re-submit your Reinforcer job directly from the interactive session.
+ A key advantage of running interactively on the head node is the ability to execute multiple multi-node jobs without needing to requeue in the SLURM job queue. This means during debugging sessions, you can avoid submitting a new `sbatch` command each time and instead debug and re-submit your NeMo-RL job directly from the interactive session.
  :::

  To run interactively, launch the same command as the [Batched Job Submission](#batched-job-submission) except omit the `COMMAND` line:
  ```sh
- # Run from the root of NeMo-Reinforcer repo
+ # Run from the root of NeMo-RL repo
  NUM_ACTOR_NODES=1 # Total nodes requested (head is colocated on ray-worker-0)

  CONTAINER=YOUR_CONTAINER \
