Commit abb640c

Split EDF and execution console result
1 parent 18771ba commit abb640c

2 files changed: +9 -7 lines changed

docs/software/container-engine/known-issue.md

Lines changed: 7 additions & 5 deletions
    @@ -2,11 +2,12 @@
     
     Alpine Linux is incompatible with some hooks, causing errors when used with Slurm. For example,
     
    -```console
    -$ cat alpine.toml
    +```toml title="EDF: ${EDF_PATH}/alpine.toml"
     image = "alpine:3.19"
    +```
     
    -$ srun -lN1 --environment=alpine.toml echo "abc"
    +```console
    +$ srun -lN1 --environment=alpine echo "abc"
     0: slurmstepd: error: pyxis: container start failed with error code: 1
     0: slurmstepd: error: pyxis: printing enroot log file:
     0: slurmstepd: error: pyxis: [ERROR] Failed to refresh the dynamic linker cache
    @@ -18,10 +19,11 @@ $ srun -lN1 --environment=alpine.toml echo "abc"
     
     This is because some hooks (e.g., Slurm and CXI hooks) leverage `ldconfig` (from Glibc) when they bind-mount host libraries inside containers; since Alpine Linux provides an alternative `ldconfig` (from Musl Libc), it does not work as intended by hooks. As a workaround, users may disable problematic hooks. For example,
     
    -```console
    -$ cat alpine_workaround.toml
    +```toml title="EDF: ${EDF_PATH}/alpine_workaround.toml"
     image = "alpine:3.19"
    +```
     
    +```console
     [annotations]
     com.hooks.cxi.enabled = "false"
     

docs/software/container-engine/resource-hook.md

Lines changed: 2 additions & 2 deletions

    @@ -281,7 +281,7 @@ The hook can be activated by setting the `com.hooks.nvidia_cuda_mps.enabled` to
     ```
     
     ??? example "Available GPUs and oversubscription error"
    -```toml title="EDF: ${EDF_PATH}/vectoradd-cuda.toml
    +```toml title="EDF: ${EDF_PATH}/vectoradd-cuda.toml"
     image = "nvcr.io#nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" # (1)
     ```
     
    @@ -316,7 +316,7 @@ GPU device files are always mounted in containers, and the NVIDIA driver user sp
     Such images are frequently used to containerize CUDA applications, either directly or as a base for custom images, thus in many cases no action is required to access GPUs.
     
     !!! example "Cluster with 4 GH200 devices per node"
    -```toml title="EDF: ${EDF_PATH}/cuda12.5.1.toml
    +```toml title="EDF: ${EDF_PATH}/cuda12.5.1.toml"
     image = "nvidia/cuda:12.5.1-devel-ubuntu24.04"
     ```
     