docs/recipes.md
To provide a single Spack stack that meets the workflow's needs, we would create:

```yaml
# A GCC-based programming environment
prgenv-gnu:
  compiler:   # ... compiler toolchain
  network:    # ... network configuration
  deprecated: # ... whether to allow usage of deprecated packages or not
  unify:      # ... configure Spack concretizer
  specs:      # ... list of packages to install
```
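The placeholder comments above can be filled in with concrete values. The following is a hypothetical sketch only: the field layout for `compiler`, and the package names and versions, are illustrative assumptions, not taken from a real cluster configuration.

```yaml
prgenv-gnu:
  compiler: [gcc]    # assumed form: request the GCC toolchain
  network:
    cray-mpich:
      gpu: null      # CPU-only MPI
  deprecated: false  # disallow deprecated Spack packages
  unify: true        # require a single unified concretization
  specs:
  - cmake
  - hdf5 +mpi
```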
For example, in the recipe below, only `netcdf-fortran` will be built with the `nvhpc` toolchain.

!!! note
    This approach is typically used to build Fortran applications and packages with one toolchain (e.g. `nvhpc`), and all of the C/C++ dependencies with a different toolchain (e.g. `gcc`).

[](){#ref-recipes-network}
### MPI and networking
Stackinator can configure MPI (cray-mpich or OpenMPI) and its dependencies (libfabric, cxi, etc.) on a per-environment basis through the `network` field.

!!! note ""
    The `network` field replaces the `mpi` field in Stackinator 6.
    See the [porting guide][ref-porting-network] for guidance on updating uenv recipes for Spack 1.0.

If the `network` field is not set, or is set to `null`, MPI will not be configured in an environment:
```yaml title="environments.yaml: no network/MPI stack"
serial-env:
  network: null
```
The `network` field has separate fields for defining cray-mpich, OpenMPI, and additional custom package definitions:
```yaml title="environments.yaml: overview of options"
<env-name>:
  network:
    cray-mpich: # describe cray-mpich (cannot be used with openmpi)
    openmpi:    # describe openmpi (cannot be used with cray-mpich)
    specs:      # additional custom specs for dependencies (libfabric etc.)
```
#### Configuring MPI

!!! alps
    The recommended MPI distribution on Alps is `cray-mpich`, as it is the most widely tested MPI for the libfabric/Slingshot network.

    OpenMPI's support for the Slingshot network is improving; however, it may not be optimal for many applications, or may require more effort to fine-tune.
    As such, it is recommended as an option for applications that have performance issues or bugs with cray-mpich.

It is only possible to have one MPI implementation in an environment: choose one of `cray-mpich` or `openmpi`.

Most of the time, you will want to use the defaults that are configured in [alps-cluster-config](https://github.com/eth-cscs/alps-cluster-config).
=== "cray-mpich"

    ```yaml
    network:
      cray-mpich:
        gpu: <one of cuda, rocm or null>  # default is system specific
        version: <version string or null>
    ```

=== "openmpi"

    ```yaml
    network:
      openmpi:
        gpu: <one of cuda, rocm or null>  # default is system specific
        version: <version string or null>
    ```
??? question "What are the defaults?"
    The defaults are cluster-specific: for example, on a system with NVIDIA GPUs, cray-mpich and openmpi will probably be configured with `gpu: cuda` by default.

    See the `network.yaml` file in the cluster configuration for the default flags, and for the definitions of the `cray-mpich`, `openmpi`, `libfabric`, and `libcxi` Spack packages.

You may want to change which version of MPI to use, or whether to enable GPU support, as shown in the following examples:
!!! example "configure with the defaults"
    === "cray-mpich"

        Choose the default version of cray-mpich, with CUDA support enabled:

        ```yaml
        network:
          cray-mpich:
            gpu: cuda
        ```

    === "openmpi"

        Choose openmpi version 5.0.6, with the default GPU support for the target system:

        ```yaml
        network:
          openmpi:
            version: 5.0.6
        ```
It is possible to fully customise how MPI is built by providing the full spec instead of setting individual sub-options.
This is an advanced option that the majority of uenv authors will not need, because Stackinator aims to simplify MPI deployment on Alps.
Note that when customising the full spec, you will probably also need to fine-tune the network stack dependencies using `specs`.

#### Customising network dependencies with specs

You can provide additional custom specs for the network stack dependencies (for example `libfabric` or `libcxi`) in the `specs` field of `network`.
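Custom dependency specs sit alongside the MPI choice in the `network` field. The following is a hypothetical sketch: the package names follow the `network.yaml` definitions mentioned above, but the versions shown are illustrative assumptions, not recommendations.

```yaml
<env-name>:
  network:
    cray-mpich:
      gpu: cuda
    specs:    # override/extend the default network dependency specs
    - libfabric@1.15   # illustrative version
    - libcxi@1.1       # illustrative version
```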
### Specs
Use `unify:true` when possible, then `unify:when_possible`, and finally `unify:false`.
!!! warning
    Don't provide a spec for MPI or compilers; these are configured in the [`network:`][ref-recipes-network] and [`compilers`](recipes.md#compilers) fields respectively.
!!! warning
    Stackinator does not support "spec matrices", and likely won't, because they use multiple compiler toolchains in a manner that is contrary to the Stackinator "keep it simple" principle.