docs/software/ml/pytorch.md
5. Set the Triton home to a local path (e.g. `/dev/shm`) to avoid writing to the (distributed) file system.
This is important for performance: writing to the Lustre file system can be slow due to the large number of small files and the potentially many processes accessing them.
6. Disable GPU support in MPICH, as it [can lead to deadlocks](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/mpi.html#inter-gpu-communication-with-cuda-aware-mpi) when used together with NCCL.
7. Avoid writing JITed binaries to the (distributed) file system, which could lead to performance issues.
8. These variables should always be set for correctness and optimal performance when using NCCL, see [the detailed explanation][ref-communication-nccl].
9. `RANK` and `LOCAL_RANK` are set per-process by the SLURM job launcher.
10. Activate the virtual environment created on top of the uenv (if any).
Please follow the guidelines for [python virtual environments with uenv][ref-guides-storage-venv] to enhance scalability and reduce load times.