1. Replace `<public-key>` with your SSH public key.
242
+
1. Replace `<public-key>` with the path to your SSH public key file.
243
243
244
244
The SSH hook runs a lightweight, statically-linked SSH server (a build of [Dropbear](https://matt.ucc.asn.au/dropbear/dropbear.html)) inside the container.
245
245
While the container is running, it's possible to connect to it from a remote host using a private key matching the public one authorized in the EDF annotation.
246
246
It can be useful to add SSH connectivity to containers (for example, enabling remote debugging) without bundling an SSH server into the container image or creating ad-hoc image variants for such purposes.
247
247
248
248
The `com.hooks.ssh.authorize_ssh_key` annotation allows the authorization of a custom public SSH key for remote connections.
249
-
The annotation value must be the absolute path to a text file containing the public key (just the public key without any extra signature/certificate).
249
+
The annotation value must be the absolute path to a *text file* containing the public key (just the public key without any extra signature/certificate).
250
+
The annotation value should not be the public SSH key itself.
250
251
After the container starts, it is possible to get a remote shell inside the container by connecting with SSH to the listening port.
251
252
252
253
By default, the server started by the SSH hook listens to port 15263, but this setting can be controlled through the `com.hooks.ssh.port` annotation in the EDF.
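The requirements above can be sketched as a small shell check, run before launching the container. This is a minimal illustration, not part of the Container Engine: the helper name `is_valid_key_path` and the key file locations in the comments are hypothetical, and only the default port 15263 and the "absolute path to a text file" rule come from the text above.

```shell
# Hypothetical helper: succeeds only if the argument looks like a valid value
# for the com.hooks.ssh.authorize_ssh_key annotation, i.e. an absolute path
# to an existing file (not the public key string itself).
is_valid_key_path() {
    case "$1" in
        /*) ;;              # absolute path: keep checking
        *)  return 1 ;;     # relative path or a raw key string: reject
    esac
    [ -f "$1" ]             # the path must point to an existing regular file
}

# Default listening port per the text; use the value of com.hooks.ssh.port
# instead if that annotation is set in the EDF.
SSH_HOOK_PORT=15263

# Connecting from a remote host (placeholders, shown as a comment only):
#   ssh -p "$SSH_HOOK_PORT" -i ~/.ssh/id_ed25519 <user>@<compute-node>
```

For example, `is_valid_key_path "$HOME/.ssh/id_ed25519.pub"` succeeds when that file exists, while passing the key text itself fails the check.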
@@ -312,7 +313,7 @@ The hook can be activated by setting the `com.hooks.nvidia_cuda_mps.enabled` to
312
313
8
313
314
```
314
315
315
-
??? example "Available GPUs and oversubscription error"
316
+
??? example "Available GPUs and oversubscription error *without* the CUDA MPS hook"
docs/software/container-engine/run.md
+20 -5 (20 additions & 5 deletions)
@@ -84,14 +84,14 @@ If an EDF is located in the search path, its name can be used in the `--environm
84
84
...
85
85
```
86
86
87
-
## Using container images
87
+
## Managing container images
88
88
89
89
By default, images defined in the EDF as remote registry references (e.g. a Docker reference) are automatically pulled and locally cached.
90
90
In later runs, a cached image is preferred over pulling the image again.
91
91
92
92
An image cache is automatically created at `.edf_imagestore` in the user's scratch folder (i.e., `${SCRATCH}/.edf_imagestore`). Cached images are stored with the corresponding CPU architecture suffix (e.g., `x86` and `aarch64`). Remove the cached image to force re-pull.
93
93
94
-
An alternative image store path can be specified by defining the environment variable `EDF_IMAGESTORE`. `EDF_IMAGESTORE` must be an absolute path to an existing folder. Image caching may also be disabled by setting `EDF_IMAGESTORE` to `void` (currently only available on Daint and Santis).
94
+
An alternative image store path can be specified by defining the environment variable `EDF_IMAGESTORE`. `EDF_IMAGESTORE` must be an absolute path to an existing folder. Image caching may also be disabled by setting `EDF_IMAGESTORE` to `void`.
95
95
96
96
!!! note
97
97
* If the CE cannot create a directory for the image cache, it operates in cache-free mode, meaning that it pulls an ephemeral image before every container launch and discards it upon termination.
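A minimal shell sketch of the `EDF_IMAGESTORE` settings described above. The folder name `my_imagestore` is a hypothetical choice, and the fallback to `/tmp` is only there so the snippet runs where `SCRATCH` is undefined.

```shell
# Hypothetical alternative location for the image cache.
store="${SCRATCH:-/tmp}/my_imagestore"

# EDF_IMAGESTORE must be an absolute path to an existing folder,
# so create the folder before exporting the variable.
mkdir -p "$store"
export EDF_IMAGESTORE="$store"

# To disable image caching instead, the text above gives the value "void":
#   export EDF_IMAGESTORE=void
```

The variable has to be set in the environment from which the job is launched, so that it is visible when the container image is resolved.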
@@ -227,9 +227,6 @@ See [the EDF reference][ref-ce-edf-reference] for the full specification of the
227
227
[](){#ref-ce-run-mounting-squashfs}
228
228
### Mounting a SquashFS image
229
229
230
-
!!! warning
231
-
This feature is only available on some vClusters (Daint and Santis, as of 17.06.2025).
232
-
233
230
A SquashFS image, essentially being a compressed data archive, can also be mounted _as a directory_ so that the image contents are readable inside the container. For this, `:sqsh` should be appended after the destination.
234
231
235
232
!!! example "Mounting a SquashFS image `${SCRATCH}/data.sqsh` to `/data`"
@@ -238,3 +235,21 @@ A SquashFS image, essentially being a compressed data archive, can also be mount
238
235
```
239
236
240
237
This is particularly useful if a job should read _multiple_ data files _frequently_, which may cause severe file access overheads. Instead, it is recommended to pack data files into one data SquashFS image and mount it inside a container. See the *"magic phrase"* in [this documentation](https://tldp.org/HOWTO/SquashFS-HOWTO/creatingandusing.html) for creating a SquashFS image.
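As a rough sketch of the packing step, the following shell snippet builds a SquashFS image with `mksquashfs` (from squashfs-tools) and composes the mount string with the `:sqsh` suffix. The `dataset` directory name and the `/tmp` fallback are assumptions for illustration, and the `source:destination:sqsh` form is inferred from the description above.

```shell
# Hypothetical input directory and output image path.
data_dir="${SCRATCH:-/tmp}/dataset"
image="${SCRATCH:-/tmp}/data.sqsh"
mkdir -p "$data_dir"

# Pack the directory into a single SquashFS image.
# -noappend overwrites an existing image instead of appending to it.
if command -v mksquashfs >/dev/null 2>&1; then
    mksquashfs "$data_dir" "$image" -noappend
else
    echo "mksquashfs not found; it is provided by squashfs-tools" >&2
fi

# Mount string for the EDF: destination /data, with :sqsh appended after it.
mount_spec="${image}:/data:sqsh"
echo "$mount_spec"
```

The resulting string is what would go into the EDF mount list, so that the image contents appear as a readable directory at `/data` inside the container.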
238
+
239
+
240
+
## Differences from upstream Pyxis
241
+
242
+
The Container Engine currently uses a customized version of [NVIDIA Pyxis](https://github.com/NVIDIA/pyxis) to integrate containers with Slurm.
243
+
244
+
Compared to the original, upstream Pyxis code, the following user-facing differences should be noted:
245
+
246
+
!!! note
247
+
As of September 10th, 2025, these items apply only to the Clariden and Santis vClusters.
248
+
249
+
* **Disabled remapping of PyTorch-related variables:** upstream Pyxis automatically remaps the `RANK` and `LOCAL_RANK` environment variables used by PyTorch to match the `SLURM_PROCID` and `SLURM_LOCALID` variables, respectively, if the `PYTORCH_VERSION` variable is detected in the container's environment.
250
+
This behavior has been **disabled** by default.
251
+
The remapping can be reactivated by setting the [annotation][ref-ce-annotations] `com.pyxis.pytorch_remap_vars="true"` in the EDF.
252
+
253
+
* **Logging container entrypoint output through EDF annotation:** by default, Pyxis hides the output of the container's entrypoint, if the latter is used.
254
+
To print the entrypoint output on the stdout stream of the Slurm job, upstream Pyxis provides the `--container-entrypoint-log` CLI option for `srun`.
255
+
In the Pyxis version used by the Container Engine, entrypoint output printing can also be enabled by setting the [annotation][ref-ce-annotations] `com.pyxis.entrypoint_log="true"` in the EDF.
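For jobs that prefer not to re-enable the remapping through the annotation, the same mapping can be done by hand in the job script. The snippet below is only a sketch of that idea: the function name `remap_rank_vars` is hypothetical, and it is not a description of how Pyxis implements the remapping internally.

```shell
# Hypothetical helper: copy the Slurm task identifiers into the variables
# PyTorch reads, mirroring the RANK/LOCAL_RANK mapping described above.
remap_rank_vars() {
    if [ -n "${SLURM_PROCID:-}" ]; then
        export RANK="$SLURM_PROCID"          # global task rank
    fi
    if [ -n "${SLURM_LOCALID:-}" ]; then
        export LOCAL_RANK="$SLURM_LOCALID"   # rank within the node
    fi
}

# Call once per task, before starting the PyTorch program.
remap_rank_vars
```

Outside a Slurm job the function leaves `RANK` and `LOCAL_RANK` untouched, since the `SLURM_*` variables are not defined there.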