
Commit 691b11f

lukasgd and RMeli authored
Update docs/guides/mlp_tutorials/llm-nanotron-training.md
Co-authored-by: Rocco Meli <[email protected]>
1 parent fb1629f · commit 691b11f

1 file changed with 1 addition and 1 deletion

docs/guides/mlp_tutorials/llm-nanotron-training.md

Lines changed: 1 addition & 1 deletion
@@ -341,7 +341,7 @@ srun -ul --environment=./ngc-nanotron-24.04.toml bash -c "
 - If instead of downloading a dataset from HuggingFace you want to re-use one managed by a colleague, please refer to the [storage guide][ref-guides-storage-sharing] for instructions on dataset sharing.
 - If you have a [wandb API key](https://docs.wandb.ai/guides/track/environment-variables/) and want to synchronize the training run, be sure to set the `WANDB_API_KEY` variable. Alternatively, `wandb` can write log data to the distributed filesystem with `WANDB_MODE=offline` so that it can be uploaded with `wandb sync` (cf. [Weights & Biases docs](https://docs.wandb.ai/support/run_wandb_offline/)) after the training run has finished.
 
-!!! warning "torchrun with virtual environments"
+!!! warning "`torchrun` with virtual environments"
     When using a virtual environment on top of a base image with PyTorch, always replace `torchrun` with `python -m torch.distributed.run` to pick up the correct Python environment. Otherwise, the system Python environment will be used and virtual environment packages not available. If not using virtual environments such as with a self-contained PyTorch container, `torchrun` is equivalent to `python -m torch.distributed.run`.
 
 !!! note "Using srun instead of torchrun"
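
As context for the `WANDB_MODE=offline` mention in the hunk above, here is a minimal sketch of the offline-logging workflow. It assumes `wandb` is installed in the training environment; the `run_train.py --config-file config.yaml` entry point and the default `./wandb/offline-run-*` directory are illustrative assumptions, not part of this diff.

```bash
# During training: write wandb log data to the (distributed) filesystem
# instead of streaming it to the wandb servers.
export WANDB_MODE=offline
python run_train.py --config-file config.yaml  # hypothetical training invocation

# After the run has finished: upload the locally stored run data.
wandb sync wandb/offline-run-*
```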
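
Likewise, for the `torchrun` warning retitled by this commit, a hedged illustration of the recommended substitution, assuming a virtual environment at `./venv` layered on a PyTorch base image (path, flags, and script name are illustrative):

```bash
# Activate the virtual environment inside the container.
source ./venv/bin/activate

# Not this: torchrun may resolve to the system Python environment,
# so packages installed in the venv would not be found.
# torchrun --nproc_per_node=4 run_train.py --config-file config.yaml

# This instead: the module form runs under the activated interpreter.
python -m torch.distributed.run --nproc_per_node=4 run_train.py --config-file config.yaml
```

In a self-contained PyTorch container without a virtual environment, the two forms are equivalent, as the warning notes.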
