
Commit f4ccf1f

Split EDF and execution console result

1 parent 9ef8ae4 commit f4ccf1f

File tree

2 files changed (+13, -10 lines changed)

docs/software/container-engine/known-issue.md

Lines changed: 2 additions & 2 deletions

@@ -2,7 +2,7 @@
 
 Alpine Linux is incompatible with some hooks, causing errors when used with Slurm. For example,
 
-```toml title="EDF: `alpine.toml` at `${EDF_PATH}`"
+```toml title="EDF: alpine.toml"
 image = "alpine:3.19"
 ```
 
@@ -19,7 +19,7 @@ $ srun -lN1 --environment=alpine echo "abc"
 
 This is because some hooks (e.g., Slurm and CXI hooks) leverage `ldconfig` (from Glibc) when they bind-mount host libraries inside containers; since Alpine Linux provides an alternative `ldconfig` (from Musl Libc), it does not work as intended by hooks. As a workaround, users may disable problematic hooks. For example,
 
-```toml title="EDF: `alpine_workaround.toml` at `${EDF_PATH}`"
+```toml title="EDF: alpine_workaround.toml"
 image = "alpine:3.19"
 ```
 
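The workaround hunk above ends before the annotations section of `alpine_workaround.toml`, so the diff does not show which hooks get disabled. As a hypothetical sketch only, disabling a hook follows the `com.hooks.<name>.enabled` annotation pattern that appears elsewhere in this commit (the CXI hook is used here purely as an illustration; the actual hooks to disable depend on the cluster):

```toml
# Hypothetical sketch of a complete alpine_workaround.toml.
# Assumption: the CXI hook is one of the hooks being disabled;
# the real file may target different hooks.
image = "alpine:3.19"

[annotations]
com.hooks.cxi.enabled = "false"  # assumed example, not from the diff
```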

docs/software/container-engine/resource-hook.md

Lines changed: 11 additions & 8 deletions

@@ -66,6 +66,9 @@ Container hooks let you customize container behavior to fit system-specific need
 However, specific Alps vClusters may support only a subset or use custom configurations.
 For details about available features in individual vClusters, consult platform documentation or contact CSCS support.
 
+!!! note
+    In the examples below, EDF files are assumed to be at `${EDF_PATH}`.
+
 [](){#ref-ce-cxi-hook}
 ### HPE Slingshot interconnect
 
@@ -92,14 +95,14 @@ The hook is activated by setting the `com.hooks.cxi.enabled` annotation, which
 ??? example "Comparison between with and without the CXI hook"
 * Without the CXI hook
 
-```toml title="EDF: ${EDF_PATH}/osu-mb-wo-cxi.toml"
+```toml title="EDF: osu-mb-wo-cxi.toml"
 image = "quay.io#madeeks/osu-mb:6.2-mpich4.1-ubuntu22.04-arm64"
 
 [annotations]
 com.hooks.cxi.enabled = "false"
 ```
 
-```console
+```console title="Command-line"
 $ srun -N2 --mpi=pmi2 --environment=osu-mb-wo-cxi ./osu_bw
 # OSU MPI Bandwidth Test v6.2
 # Size Bandwidth (MB/s)
@@ -130,14 +133,14 @@ The hook is activated by setting the `com.hooks.cxi.enabled` annotation, which
 
 * With the CXI hook enabling access to the Slingshot high-speed network
 
-```toml title="EDF: ${EDF_PATH}/osu-mb-cxi.toml"
+```toml title="EDF: osu-mb-cxi.toml"
 image = "quay.io#madeeks/osu-mb:6.2-mpich4.1-ubuntu22.04"
 
 [annotations]
 com.hooks.cxi.enabled = "true"
 ```
 
-```console
+```console title="Command-line"
 $ srun -N2 --mpi=pmi2 --environment=osu-mb-cxi ./osu_bw
 # OSU MPI Bandwidth Test v6.2
 # Size Bandwidth (MB/s)
@@ -230,7 +233,7 @@ By default, the server started by the SSH hook listens to port 15263, but this s
 
 !!! example "Logging into a sleeping container via SSH"
 * On the cluster
-```toml title="EDF: ${EDF_PATH}/ubuntu-ssh.toml"
+```toml title="EDF: ubuntu-ssh.toml"
 image = "ubuntu:latest"
 
 [annotations]
@@ -267,7 +270,7 @@ The hook can be activated by setting the `com.hooks.nvidia_cuda_mps.enabled` to
 The container must be **writable** (default) to use the CUDA MPS hook.
 
 !!! example "Using the CUDA MPS hook"
-```toml title="EDF: `vectoradd-cuda-mps.toml` at `${EDF_PATH}`"
+```toml title="EDF: vectoradd-cuda-mps.toml"
 image = "nvcr.io#nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
 
 [annotations]
@@ -280,7 +283,7 @@ The hook can be activated by setting the `com.hooks.nvidia_cuda_mps.enabled` to
 ```
 
 ??? example "Available GPUs and oversubscription error"
-```toml title="EDF: `vectoradd-cuda.toml` at `${EDF_PATH}`"
+```toml title="EDF: vectoradd-cuda.toml"
 image = "nvcr.io#nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" # (1)
 ```
 
@@ -315,7 +318,7 @@ GPU device files are always mounted in containers, and the NVIDIA driver user sp
 Such images are frequently used to containerize CUDA applications, either directly or as a base for custom images, thus in many cases no action is required to access GPUs.
 
 !!! example "Cluster with 4 GH200 devices per node"
-```toml title="EDF: `cuda12.5.1.toml` at `${EDF_PATH}`"
+```toml title="EDF: cuda12.5.1.toml"
 image = "nvidia/cuda:12.5.1-devel-ubuntu24.04"
 ```
 
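The CUDA MPS hunks above are truncated right after the `[annotations]` header, so the activation line itself is not visible in the diff. Since the surrounding hunk context states the hook "can be activated by setting the `com.hooks.nvidia_cuda_mps.enabled`" annotation, a plausible sketch of the complete EDF (the `"true"` value is an assumption inferred from that description) would be:

```toml
# Sketch of a complete vectoradd-cuda-mps.toml.
# Assumption: activation value "true" for the key named in the hunk context.
image = "nvcr.io#nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"

[annotations]
com.hooks.nvidia_cuda_mps.enabled = "true"  # assumed, not shown in the diff
```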
