Commit 2bcea80

document --cpu-bind=verbose
1 parent edbebe5 commit 2bcea80

1 file changed: +16 -2 lines changed


docs/running/slurm.md

Lines changed: 16 additions & 2 deletions
@@ -145,8 +145,6 @@ The build generates the following executables:
 * all threads on each rank have affinity with the same 72 cores;
 * each rank gets 72 cores, e.g. rank 1 gets cores `72:143` on node `nid006363`.
 
-
-
 ??? example "Testing GPU affinity"
 Use `affinity.cuda` or `affinity.rocm` to test on GPU-enabled systems.

@@ -197,6 +195,22 @@ The build generates the following executables:
 
 2. Test GPU affinity: note how the `--gpus-per-task=1` parameter assigns a unique GPU to each rank.
 
+!!! info "Quick affinity checks"
+
+The Slurm flag [`--cpu-bind=verbose`](https://slurm.schedmd.com/srun.html#OPT_cpu-bind) prints information about MPI ranks and their thread affinity.
+
+The mask it prints is not very readable, but it can be used with the `true` command to quickly test Slurm parameters without building the Affinity tool.
+
+```console title="hello"
+$ srun --cpu-bind=verbose -c32 -n4 -N1 --hint=nomultithread -- true
+cpu-bind=MASK - nid002156, task 0 0 [147694]: mask 0xffffffff set
+cpu-bind=MASK - nid002156, task 1 1 [147695]: mask 0xffffffff0000000000000000 set
+cpu-bind=MASK - nid002156, task 2 2 [147696]: mask 0xffffffff00000000 set
+cpu-bind=MASK - nid002156, task 3 3 [147697]: mask 0xffffffff000000000000000000000000 set
+```
+
+You can also check GPU affinity by inspecting the value of the `CUDA_VISIBLE_DEVICES` environment variable.
+
 [](){#ref-slurm-gh200}
 ## NVIDIA GH200 GPU Nodes

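The added text notes that the mask printed by `--cpu-bind=verbose` is hard to read. As a rough illustration of how it could be decoded, the sketch below turns a hexadecimal mask into a list of core ranges. It is not part of this commit or of the Affinity tool: the helper names `mask_to_cpus` and `cpus_to_ranges` are hypothetical, and it assumes that bit `i` of the mask corresponds to CPU `i` as numbered by the operating system.

```python
def mask_to_cpus(mask: str) -> list[int]:
    """Return the sorted CPU indices whose bits are set in a hex mask.

    Assumption: bit i of the mask stands for CPU i.
    """
    value = int(mask, 16)  # base 16 accepts an optional "0x" prefix
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def cpus_to_ranges(cpus: list[int]) -> str:
    """Collapse sorted CPU indices into a compact string such as '32-63'."""
    ranges = []
    start = prev = None
    for cpu in cpus:
        if start is None:
            start = prev = cpu          # open the first range
        elif cpu == prev + 1:
            prev = cpu                  # extend the current range
        else:
            ranges.append((start, prev))
            start = prev = cpu          # gap found: open a new range
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    # Masks copied from the example output in the diff above.
    masks = {
        0: "0xffffffff",
        1: "0xffffffff0000000000000000",
        2: "0xffffffff00000000",
        3: "0xffffffff000000000000000000000000",
    }
    for task, mask in masks.items():
        print(f"task {task}: cores {cpus_to_ranges(mask_to_cpus(mask))}")
```

Under that assumption, each of the four masks decodes to a distinct block of 32 consecutive cores, which matches the `-c32 -n4` request in the example.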