Commit bf0c26c

Update MPS links in GH200 slurm docs

1 parent 4f43fdc

1 file changed: docs/tools/slurm.md (4 additions, 2 deletions)
````diff
@@ -23,7 +23,7 @@ The following sections will provide detailed guidance on how to use SLURM to req
 The [GH200 nodes on Alps][gh200-node] have four GPUs per node, and SLURM job submissions must be configured appropriately to best make use of the resources.
 Applications that can saturate the GPUs with a single process per GPU should generally prefer this mode.
 [Configuring SLURM jobs to use a single GPU per rank][gh200-slurm-single-rank-per-gpu] is also the most straightforward setup.
-Some applications perform badly with a single rank per GPU, and require use of [NVIDIA's Multi-Process-Service (MPS)](https://docs.nvidia.com/deploy/mps/index.html) to oversubscribe GPUs with multiple ranks per GPU.
+Some applications perform badly with a single rank per GPU, and require use of [NVIDIA's Multi-Process Service (MPS)] to oversubscribe GPUs with multiple ranks per GPU.
 
 The best SLURM configuration is application- and workload-specific, so it is worth testing which works best in your particular case.
 See [Scientific Applications][sciapps] for information about recommended application-specific SLURM configurations.
@@ -62,7 +62,7 @@ Omitting the `--gpus-per-task` flag will lead to all ranks on the node using the
 
 Using multiple ranks per GPU can improve performance e.g. of applications that don't generate enough work for a GPU using a single rank, or ones that scale badly to all 72 cores of the Grace CPU.
 In these cases SLURM jobs must be configured to assign multiple ranks to a single GPU.
-This is best done using [MPS](https://docs.nvidia.com/deploy/mps/index.html).
+This is best done using [NVIDIA's Multi-Process Service (MPS)].
 To use MPS, launch your application using the following wrapper script, which will start MPS on one rank per node and assign GPUs to ranks according to the CPU mask of a rank, ensuring the closest GPU is used:
 
 ```bash
@@ -123,6 +123,8 @@ Note that in the example job above:
 
 The configuration that is optimal for your application may be different.
 
+[NVIDIA's Multi-Process Service (MPS)]: https://docs.nvidia.com/deploy/mps/index.html
+
 [](){#amdcpu-slurm}
 ## AMD CPU
 
````
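The MPS wrapper script itself is cut off in the diff (only the opening bash code fence is visible). As a rough illustration of the mechanism the surrounding prose describes (start MPS on one rank per node, then pick the GPU matching each rank's CPU mask), here is a hypothetical sketch; the actual script in slurm.md will differ, and the NUMA-node-to-GPU pairing is an assumption about GH200 topology:

```bash
#!/bin/bash
# Hypothetical mps-wrapper.sh sketch -- NOT the script from slurm.md.

# Directories through which MPS clients find the control daemon.
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-mps-log

# Start the MPS control daemon on exactly one rank per node.
if [[ "$SLURM_LOCALID" -eq 0 ]]; then
    nvidia-cuda-mps-control -d
fi
sleep 2   # crude barrier so clients don't race the daemon startup

# Pick the GPU closest to this rank's CPU mask. Assumption: on GH200,
# NUMA node N of the four Grace CPUs is paired with GPU N.
export CUDA_VISIBLE_DEVICES=$(numactl --show | awk '/^nodebind:/ {print $2}')

exec "$@"   # run the actual application under MPS
```

It would be launched along the lines of `srun --ntasks-per-node=16 ./mps-wrapper.sh ./my_app` (rank count and application name are placeholders).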