> Please do not use `-p/--python` and instead allow `uv venv` to read it from `.python-version`.
> This ensures that the version of python used is always what we prescribe.
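
A minimal sketch of the recommended flow, assuming a checkout whose `.python-version` pins the interpreter (the `3.12` value here is illustrative, not prescriptive):

```sh
# Pin the interpreter once at the repo root (illustrative version number):
echo "3.12" > .python-version

# Then create the venv WITHOUT -p/--python; uv reads .python-version itself:
#   uv venv
```

Because the pin lives in the repository, every contributor's `uv venv` resolves the same interpreter without per-invocation flags.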

If working outside a container, it can help to build [flash-attn](https://github.com/Dao-AILab/flash-attention) and warm the uv cache before your first run.

```sh
bash tools/build-flash-attn-in-uv-cache.sh
```

> [!NOTE]
> On the first install, `flash-attn` can take a while to build (~45 min with 48 CPU hyperthreads). After it is built once, it is cached in uv's cache directory, making subsequent installs much quicker.

> [!TIP]
> The NeMo RL Dockerfile will warm the uv cache with flash-attn.
> See https://docs.nvidia.com/nemo/rl/latest/docker.html for instructions if you are looking for the NeMo RL container.

If successful, you should see `✅ flash-attn successfully added to uv cache`.
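
In scripts or CI you may want to assert on that marker rather than eyeball it. A hedged sketch that simulates the script's output (in practice you would capture `bash tools/build-flash-attn-in-uv-cache.sh` instead of hard-coding it):

```sh
# Simulated output; in CI replace with:
#   out="$(bash tools/build-flash-attn-in-uv-cache.sh)"
out="✅ flash-attn successfully added to uv cache"

# Fail loudly if the success marker is missing:
case "$out" in
  *"successfully added to uv cache"*) echo "flash-attn cached" ;;
  *) echo "flash-attn build failed" >&2; exit 1 ;;
esac
```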

Use `uv run` to launch all commands. It performs any needed pip installs implicitly and ensures your environment is up to date with our lock file.

> [!NOTE]
> - It is not recommended to activate the `venv`; instead, use `uv run <command>` to execute scripts within the managed environment.
>   This ensures consistent environment usage across different shells and sessions. Example: `uv run python examples/run_grpo_math.py`
> - Ensure you have the necessary CUDA drivers and a PyTorch build compatible with your hardware.
> - If you update your environment in `pyproject.toml`, force a rebuild of the virtual environments by setting `NRL_FORCE_REBUILD_VENVS=true` the next time you launch a run.
> - **Reminder**: Don't forget to set your `HF_HOME`, `WANDB_API_KEY`, and `HF_DATASETS_CACHE` (if needed). You'll also need to run `huggingface-cli login` for Llama models.
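
The environment variables above can be set once in your shell profile. A sketch with placeholder values (the paths and key are illustrative and must be adjusted to your system and account):

```sh
# Illustrative values only -- adjust to your system and account:
export HF_HOME="$HOME/.cache/huggingface"
export HF_DATASETS_CACHE="$HF_HOME/datasets"
export WANDB_API_KEY="<your-wandb-key>"

# Set this before launching a run after changing pyproject.toml:
export NRL_FORCE_REBUILD_VENVS=true

# Llama models additionally require a one-time:
#   huggingface-cli login
```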