Commit 98b7a90

docs: update docs everywhere to remove uv pip install which isn't reliable (#217)
Signed-off-by: Terry Kong <terryk@nvidia.com>
1 parent: da191b4

File tree: 8 files changed, +38 -55 lines

CONTRIBUTING.md

Lines changed: 0 additions & 6 deletions
@@ -13,12 +13,6 @@ docker buildx build -t nemo-reinforcer -f Dockerfile .
 docker run -it --gpus all -v /path/to/nemo-reinforcer:/workspace/nemo-reinforcer nemo-reinforcer
 ```

-2. **Install the package in development mode**:
-```bash
-cd /workspace/nemo-reinforcer
-pip install -e .
-```
-
 ## Making Changes

 ### Workflow: Clone and Branch (No Fork Required)

README.md

Lines changed: 12 additions & 22 deletions
@@ -3,7 +3,7 @@
 <!-- markdown all in one -->
 - [Nemo-Reinforcer: A Scalable and Efficient Post-Training Library for Models Ranging from tiny to \>100B Parameters, scaling from 1 GPU to 100s](#nemo-reinforcer-a-scalable-and-efficient-post-training-library-for-models-ranging-from-tiny-to-100b-parameters-scaling-from-1-gpu-to-100s)
 - [Features](#features)
-- [Installation](#installation)
+- [Prerequisites](#prerequisites)
 - [Quick start](#quick-start)
 - [SFT](#sft)
 - [Single Node](#single-node)
@@ -38,28 +38,26 @@ What you can expect:
 - 🔜 **Environment Isolation** - Dependency isolation between components
 - 🔜 **DPO Algorithm** - Direct Preference Optimization for alignment

-## Installation
+## Prerequisites

 ```sh
-# For faster setup we use `uv`
+# For faster setup and environment isolation, we use `uv`
 pip install uv

-# Specify a virtual env that uses Python 3.12
-uv venv -p python3.12.9 .venv
-# Install NeMo-Reinforcer with vllm
-uv pip install -e .[vllm]
-# Install NeMo-Reinforcer with dev/test dependencies
-uv pip install -e '.[dev,test]'
+# If you cannot install at the system level, you can install for your user with
+# pip install --user uv

-# Use uv run to launch any runs.
-# Note that it is recommended to not activate the venv and instead use `uv run` since
+# Use `uv run` to launch all commands. It handles pip installing implicitly and
+# ensures your environment is up to date with our lock file.
+
+# Note that it is recommended not to activate the venv; use `uv run` instead, since
 # it ensures consistent environment usage across different shells and sessions.
 # Example: uv run python examples/run_grpo_math.py
 ```

 ## Quick start

-**Reminder**: Don't forget to set your HF_HOME and WANDB_API_KEY (if needed). You'll need to do a `huggingface-cli login` as well for Llama models.
+**Reminder**: Don't forget to set your `HF_HOME`, `WANDB_API_KEY`, and `HF_DATASETS_CACHE` (if needed). You'll need to do a `huggingface-cli login` as well for Llama models.

 ### SFT


@@ -91,21 +89,14 @@ Refer to `examples/configs/sft.yaml` for a full list of parameters that can be o

 For distributed training across multiple nodes:

-Set `UV_CACHE_DIR` to a directory that can be read from all workers before running any uv run command.
-
-```sh
-export UV_CACHE_DIR=/path/that/all/workers/can/access/uv_cache
-```
-
 ```sh
 # Run from the root of NeMo-Reinforcer repo
 NUM_ACTOR_NODES=2
 # Add a timestamp to make each job name unique
 TIMESTAMP=$(date +%Y%m%d_%H%M%S)

 # SFT experiment uses Llama-3.1-8B model
-COMMAND="uv pip install -e .; uv run ./examples/run_sft.py --config examples/configs/sft.yaml cluster.num_nodes=2 cluster.gpus_per_node=8 checkpointing.checkpoint_dir='results/sft_llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='sft-llama8b'" \
-UV_CACHE_DIR=YOUR_UV_CACHE_DIR \
+COMMAND="uv run ./examples/run_sft.py --config examples/configs/sft.yaml cluster.num_nodes=2 cluster.gpus_per_node=8 checkpointing.checkpoint_dir='results/sft_llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='sft-llama8b'" \
 CONTAINER=YOUR_CONTAINER \
 MOUNTS="$PWD:$PWD" \
 sbatch \
@@ -159,8 +150,7 @@ NUM_ACTOR_NODES=2
159150
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
160151

161152
# grpo_math_8b uses Llama-3.1-8B-Instruct model
162-
COMMAND="uv pip install -e .; uv run ./examples/run_grpo_math.py --config examples/configs/grpo_math_8B.yaml cluster.num_nodes=2 checkpointing.checkpoint_dir='results/llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='grpo-llama8b_math'" \
163-
UV_CACHE_DIR=YOUR_UV_CACHE_DIR \
153+
COMMAND="uv run ./examples/run_grpo_math.py --config examples/configs/grpo_math_8B.yaml cluster.num_nodes=2 checkpointing.checkpoint_dir='results/llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='grpo-llama8b_math'" \
164154
CONTAINER=YOUR_CONTAINER \
165155
MOUNTS="$PWD:$PWD" \
166156
sbatch \
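The reminder above about `HF_HOME`, `WANDB_API_KEY`, and `HF_DATASETS_CACHE` amounts to a short shell preamble before any `uv run` command. A sketch with placeholder paths (nothing here is prescribed by the repo):

```shell
# Placeholder paths; point these at storage all your jobs can reach.
export HF_HOME="$PWD/.cache/huggingface"            # model/tokenizer cache
export HF_DATASETS_CACHE="$PWD/.cache/hf_datasets"  # datasets cache
export WANDB_API_KEY="replace-with-your-key"        # only if wandb logging is on

echo "HF_HOME=$HF_HOME"
```

For gated Llama checkpoints, `huggingface-cli login` is still required once per machine on top of these variables.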

docs/cluster.md

Lines changed: 20 additions & 18 deletions
@@ -4,6 +4,7 @@
 - [Slurm](#slurm)
 - [Batched Job Submission](#batched-job-submission)
 - [Interactive Launching](#interactive-launching)
+- [Slurm UV\_CACHE\_DIR](#slurm-uv_cache_dir)
 - [Kubernetes](#kubernetes)

 ## Slurm
@@ -14,7 +15,7 @@
 # Run from the root of NeMo-Reinforcer repo
 NUM_ACTOR_NODES=1 # Total nodes requested (head is colocated on ray-worker-0)

-COMMAND="uv pip install -e .; uv run ./examples/run_grpo_math.py" \
+COMMAND="uv run ./examples/run_grpo_math.py" \
 CONTAINER=YOUR_CONTAINER \
 MOUNTS="$PWD:$PWD" \
 sbatch \
@@ -39,21 +40,6 @@ Make note of the job submission number. Once the job begins you can track it
 tail -f 1980204-logs/ray-driver.log
 ```

-:::{note}
-`UV_CACHE_DIR` defaults to `$SLURM_SUBMIT_DIR/uv_cache` and is mounted to head and worker nodes
-to ensure fast `venv` creation.
-
-If you would like to override it to somewhere else all head/worker nodes can access, you may set it
-via:
-
-```sh
-...
-UV_CACHE_DIR=/path/that/all/workers/and/head/can/access \
-sbatch ... \
-ray.sub
-```
-:::
-
 ### Interactive Launching

 :::{tip}
@@ -87,11 +73,27 @@ bash 1980204-attach.sh
 ```
 Now that you are on the head node, you can launch the command like so:
 ```sh
-uv venv .venv
-uv pip install -e .
 uv run ./examples/run_grpo_math.py
 ```

+### Slurm UV_CACHE_DIR
+
+There are several choices for `UV_CACHE_DIR` when using `ray.sub`:
+
+1. (default) `UV_CACHE_DIR` defaults to `$SLURM_SUBMIT_DIR/uv_cache` when not specified in the shell environment, and is mounted to head and worker nodes to serve as a persistent cache between runs.
+2. Use the warm uv cache from our docker images:
+```sh
+...
+UV_CACHE_DIR=/home/ray/.cache/uv \
+sbatch ... \
+ray.sub
+```
+
+(1) is more efficient in general since the cache is not ephemeral and is persisted run to run; but for users that
+don't want to persist the cache, you can use (2), which is just as performant as (1) if the `uv.lock` is
+covered by the warmed cache.
+
+
 ## Kubernetes

 TBD
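The defaulting described in option (1) of the `Slurm UV_CACHE_DIR` section boils down to ordinary shell parameter expansion. A sketch of the pattern, not the actual `ray.sub` source (`$PWD` stands in for the submit directory when run outside Slurm):

```shell
# Keep the caller's UV_CACHE_DIR if exported; otherwise fall back to a
# persistent cache directory under the Slurm submit directory.
SLURM_SUBMIT_DIR="${SLURM_SUBMIT_DIR:-$PWD}"
UV_CACHE_DIR="${UV_CACHE_DIR:-$SLURM_SUBMIT_DIR/uv_cache}"
echo "UV_CACHE_DIR=$UV_CACHE_DIR"
```

Because `${var:-default}` only fills in unset or empty variables, an explicit `UV_CACHE_DIR=... sbatch ... ray.sub` (as in option 2) always wins over the default.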

docs/guides/grpo.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ uv run examples/run_grpo_math.py --config <PATH TO YAML CONFIG> {overrides}
1212

1313
If not specified, `config` will default to [examples/configs/grpo.yaml](../../examples/configs/grpo_math_1B.yaml)
1414

15-
**Reminder**: Don't forget to set your HF_HOME and WANDB_API_KEY (if needed). You'll need to do a `huggingface-cli login` as well for Llama models.
15+
**Reminder**: Don't forget to set your HF_HOME, WANDB_API_KEY, and HF_DATASETS_CACHE (if needed). You'll need to do a `huggingface-cli login` as well for Llama models.
1616

1717
## Now, for the details:
1818

docs/guides/sft.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ uv run examples/run_sft.py \
 cluster.gpus_per_node=1 \
 logger.wandb.name="sft-dev-1-gpu"
 ```
-**Reminder**: Don't forget to set your HF_HOME and WANDB_API_KEY (if needed). You'll need to do a `huggingface-cli login` as well for Llama models.
+**Reminder**: Don't forget to set your `HF_HOME`, `WANDB_API_KEY`, and `HF_DATASETS_CACHE` (if needed). You'll need to do a `huggingface-cli login` as well for Llama models.

 ## Datasets


docs/testing.md

Lines changed: 0 additions & 2 deletions
@@ -103,8 +103,6 @@ Functional tests may require multiple GPUs to run. See each script to understand
 Functional tests are located under `tests/functional/`.

 ```sh
-# Install the project and the test dependencies
-uv pip install -e '.[test]'
 # Run the functional test for sft
 uv run bash tests/functional/sft.sh
 ```
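Since `uv run` now provisions the environment implicitly, the functional scripts only need the usual fail-fast shell-test shape. A hypothetical miniature of that shape (not one of the repo's actual scripts):

```shell
# Fail on any error or unset variable, run the workload, then check its output.
set -eu
summary="loss=0.12"   # stand-in for a metrics line a real training run would emit
case "$summary" in
  loss=*) echo "functional check passed" ;;
  *)      echo "functional check failed" >&2; exit 1 ;;
esac
```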

nemo_reinforcer/models/generation/vllm.py

Lines changed: 2 additions & 2 deletions
@@ -153,8 +153,8 @@ def __init__(
             self.SamplingParams = vllm.SamplingParams
         except ImportError:
             raise ImportError(
-                "vLLM is not installed. Please install it with `pip install nemo-reinforcer[vllm]` "
-                "or `pip install vllm --no-build-isolation` separately."
+                f"vLLM is not installed. Please check that VllmGenerationWorker.DEFAULT_PY_EXECUTABLE covers the vllm dependency. "
+                "If you are working interactively, you can install by running `uv sync --extra vllm` anywhere in the repo."
             )
         vllm_kwargs = self.cfg.get("vllm_kwargs", {}).copy()

nemo_reinforcer/models/generation/vllm_backend.py

Lines changed: 2 additions & 3 deletions
@@ -17,9 +17,8 @@
     import vllm
 except ImportError:
     raise ImportError(
-        "vLLM is not installed. Please install it with `pip install nemo-reinforcer[vllm]` "
-        "or `pip install vllm` separately. This issue may also occur if worker is using incorrect "
-        "py_executable."
+        f"vLLM is not installed. Please check that VllmGenerationWorker.DEFAULT_PY_EXECUTABLE covers the vllm dependency. "
+        "If you are working interactively, you can install by running `uv sync --extra vllm` anywhere in the repo."
     )

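The guarded import in `vllm_backend.py` turns a missing optional extra into an actionable error at worker start. The same check can be made up front from a shell before launching a long job; a sketch (assumes `python3` is on PATH, and `vllm` here stands for whichever extra the worker needs):

```shell
# Probe whether the current Python environment can import vllm.
if python3 -c "import vllm" 2>/dev/null; then
  echo "vllm available"
else
  echo "vllm missing; run: uv sync --extra vllm"
fi
```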
