[](){#ref-cicd-pipeline-triggers}
## Understanding when CI is triggered
[](){#ref-cicd-pipeline-triggers-push}
### Push events
- Every pipeline can define its own list of CI-enabled branches
1. If the repository uses git submodules, `GIT_SUBMODULE_STRATEGY: recursive` has to be specified (see [GitLab documentation](https://docs.gitlab.com/ee/ci/git_submodules.html#use-git-submodules-in-cicd-jobs))
1. The `container-builder` takes a Dockerfile as input (specified in the variable `DOCKERFILE`) and executes something similar to `docker build -f $DOCKERFILE .`, where the build context is the whole (recursively cloned) repository
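For illustration, a build job wiring these pieces together might look like the following sketch (the block name `.container-builder-cscs-zen2` is used in the examples further down this page; the paths and the image name are placeholders):

```yaml
build my image:
  extends: .container-builder-cscs-zen2
  variables:
    DOCKERFILE: ci/docker/Dockerfile        # input Dockerfile for the container-builder
    GIT_SUBMODULE_STRATEGY: recursive       # only needed when the repository uses submodules
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/my_image:$CI_COMMIT_SHORT_SHA
```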
## CI variables
Many variables exist during a pipeline run; they are documented at [GitLab's predefined variables](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html). In addition to the CI variables available through GitLab, there are a few CSCS-specific pipeline variables:

| Name | Example value | Description |
| --- | --- | --- |
| `CSCS_REGISTRY` | jfrog.svc.cscs.ch | CSCS internal registry, preferred registry to store your container images |
| `CSCS_REGISTRY_PATH` | jfrog.svc.cscs.ch/docker-ci-ext/<repositorypid> | The prefix path in the CSCS internal container image registry to which your pipeline has write access. Within this prefix, you can choose any directory structure. Images pushed to a path matching **/public/** can be pulled by anybody within the CSCS network |
| `CSCS_CI_MW_URL` | https://cicd-ext-mw.cscs.ch/ci | The URL of the middleware, the orchestrator software. |
| `CSCS_CI_DEFAULT_SLURM_ACCOUNT` | d123 | The project to which compute usage is accounted. It is set on the CI setup page in the Admin section and can be overridden via `SLURM_ACCOUNT` for individual jobs. |
| `CSCS_CI_ORIG_CLONE_URL` | https://github.com/my-org/my-project (public), git@github.com:my-org/my-project (private) | The git clone URL, needed for some implementation details of the gitlab-runner custom executor. This is the clone URL of the registered project, i.e. not the clone URL of the mirror project. |
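A sketch of how these variables are typically used in a job; the job name and paths are illustrative, and whether `SLURM_ACCOUNT` applies to a given job type should be checked against the runner documentation:

```yaml
build public image:
  extends: .container-builder-cscs-zen2
  variables:
    # a path containing /public/ makes the image pullable by anybody within the CSCS network
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/public/my_image:$CI_COMMIT_SHORT_SHA
    SLURM_ACCOUNT: d123   # overrides CSCS_CI_DEFAULT_SLURM_ACCOUNT for this job
```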
## Containerized CI - best practices
### Multi-architecture images
With the introduction of Grace-Hopper nodes, we now have `aarch64` and `x86_64` machines. This implies that container images should be built for the correct architecture, which can be achieved as in the following example: we first create two container images with different names, then combine these two names into a single name that covers both architectures. Finally, in the run step we use the multi-architecture image, and the container runtime pulls the correct architecture.
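A sketch of such a pipeline follows. The `aarch64` builder block name (`.container-builder-cscs-gh200`) and the `.make-multiarch-image` helper with its `PERSIST_IMAGE_NAME_*` variables are assumptions that should be verified against the current CSCS CI templates:

```yaml
build aarch64:
  extends: .container-builder-cscs-gh200
  variables:
    DOCKERFILE: ci/docker/Dockerfile
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/my_image_aarch64:$CI_COMMIT_SHORT_SHA

build x86_64:
  extends: .container-builder-cscs-zen2
  variables:
    DOCKERFILE: ci/docker/Dockerfile
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/my_image_x86_64:$CI_COMMIT_SHORT_SHA

make multiarch:
  extends: .make-multiarch-image
  variables:
    PERSIST_IMAGE_NAME_AARCH64: $CSCS_REGISTRY_PATH/my_image_aarch64:$CI_COMMIT_SHORT_SHA
    PERSIST_IMAGE_NAME_X86_64: $CSCS_REGISTRY_PATH/my_image_x86_64:$CI_COMMIT_SHORT_SHA
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/my_image:$CI_COMMIT_SHORT_SHA
```

Jobs that run on either architecture can then reference `$CSCS_REGISTRY_PATH/my_image:$CI_COMMIT_SHORT_SHA` in their `image` field.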
It is *not* mandatory to combine the container images into a multi-architecture image, i.e. a CI setup that consistently uses the correct architecture-specific paths also works. A multi-architecture image is convenient when you plan to distribute it to other users.
### Dependency management
#### Problem
A common observation is that your software has many dependencies that are more or less static, i.e. they can change, but do so very rarely. A common pattern to avoid rebuilding base images unnecessarily is a multi-stage CI setup:
1. Build (rarely, and manually) a base container with all static dependencies and push it to a public container registry
1. Use the base container and build the software container
1. Test the newly created software container
1. Deploy the software container
This works fine but has the drawback that a manual step is needed whenever the dependencies change, e.g. when one wants to upgrade to new versions of the dependencies. Another drawback is that it allows keeping the recipe of the base container outside of the repository, which makes results harder to reproduce, especially when colleagues want to reproduce a build.
#### Solution
A common solution to this problem is a multi-stage setup. Your repository should have (at least) two Dockerfiles, let us call them `Dockerfile.base` and `Dockerfile`.
- `Dockerfile.base`: This Dockerfile contains the recipe to build your base container; it normally derives `FROM` a very basic container image, e.g. `docker.io/ubuntu:24.04` or the CSCS Spack base containers. Let us call the container image built from this recipe `BASE_IMG`.
    !!! todo
        link to spack base containers
- `Dockerfile`: This Dockerfile contains the recipe to build your software container. It must start with `ARG BASE_IMG`, followed by `FROM $BASE_IMG`.
The `.container-builder-cscs-*` blocks can be used to solve this problem. The runner supports the variable `CSCS_REBUILD_POLICY`, which by default is set to `if-not-exists`.
This means that the runner checks in the remote registry whether the container image specified in `PERSIST_IMAGE_NAME` exists; a new container image is built only if it does not exist yet. Note: if you have a single build job, `PERSIST_IMAGE_NAME` can be specified in the `variables:` field of that build job or as a global variable, as in the Hello World example. If you have multiple build jobs and specify `PERSIST_IMAGE_NAME` per build job, you need to specify the exact name of the image in the `image` field of the test job.
A CI YAML file would look in the simplest case like this:
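A minimal sketch of such a file follows; job names, stage names, and image paths are illustrative, and the `include` URL is an assumption derived from the template file linked elsewhere on this page:

```yaml
include:
  - remote: 'https://gitlab.com/cscs-ci/recipes/-/raw/master/templates/v2/.ci-ext.yml'

stages:
  - build_base
  - build_software

build base:
  extends: .container-builder-cscs-zen2
  stage: build_base
  variables:
    DOCKERFILE: ci/docker/Dockerfile.base
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/base/my_base_container:1.0
    DOCKER_BUILD_ARGS: '["NUM_PROCS=8"]'

build software:
  extends: .container-builder-cscs-zen2
  stage: build_software
  variables:
    DOCKERFILE: ci/docker/Dockerfile
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/software/my_software:$CI_COMMIT_SHORT_SHA
    DOCKER_BUILD_ARGS: '["BASE_IMG=$CSCS_REGISTRY_PATH/base/my_base_container:1.0", "NUM_PROCS=8"]'
```

`DOCKER_BUILD_ARGS` passes values for the `ARG` instructions in the Dockerfiles, which would look like this: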
`ci/docker/Dockerfile.base`
```Dockerfile
FROM docker.io/finkandreas/spack:0.19.2-cuda11.7.1-ubuntu22.04
ARG NUM_PROCS
RUN spack-install-helper daint-gpu \
    petsc \
    trilinos
```
`ci/docker/Dockerfile`
```Dockerfile
ARG BASE_IMG
FROM $BASE_IMG
ARG NUM_PROCS
RUN mkdir /build && cd /build && cmake /sourcecode && make -j$NUM_PROCS
```
A setup like this would, on the very first run, build the container image `$CSCS_REGISTRY_PATH/base/my_base_container:1.0`, followed by the job that builds the container image `$CSCS_REGISTRY_PATH/software/my_software:1.0`. The next time CI is triggered, `.container-builder-cscs-zen2` checks in the remote repository whether the target tag (`PERSIST_IMAGE_NAME`) exists, and builds a new container image only if it does not. Since the tag for the job `build base` is static, i.e. the same for every CI run, it is built on the first run but not on subsequent runs. In contrast, the tag of the job `build software` changes with every CI run, since the variable `CI_COMMIT_SHORT_SHA` is different for every run.
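If you need to force a rebuild even though the target tag already exists (e.g. while debugging a Dockerfile), the rebuild policy can be changed per job. A sketch, assuming `always` is an accepted value of `CSCS_REBUILD_POLICY` (check the runner documentation before relying on this):

```yaml
build base:
  extends: .container-builder-cscs-zen2
  stage: build_base
  variables:
    DOCKERFILE: ci/docker/Dockerfile.base
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/base/my_base_container:1.0
    CSCS_REBUILD_POLICY: always   # rebuild even if the tag already exists
```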
##### Manual dependency update
At some point you realise that you have to update some of the dependencies. You can use a manual update process for your base container, where you make sure to update all necessary image tags. In our example, this means updating in `ci/cscs.yml` all occurrences of `$CSCS_REGISTRY_PATH/base/my_base_container:1.0` to `$CSCS_REGISTRY_PATH/base/my_base_container:2.0` (or any other versioning scheme; all that matters is that the full name changes). Of course something in `Dockerfile.base` should change too, otherwise you are building the same artifact under a different name.
##### Dynamic dependency update
While manually updating image tags works, it is error-prone. Take for example the situation where you update the tag in `build base` but forget to change it in `build software`. Your pipeline would still run fine, because the dependency of `build software` exists. Since the inconsistency does not produce an explicit error, it is hard to track down.
Therefore, there is also the possibility of naming your container images dynamically. The idea is the same, i.e. we first build a base container and use this base container to build our software container.
The `build base` and `build software` jobs would look similar to this:
```yaml
build base:
  extends: .container-builder-cscs-zen2
  stage: build_base
  before_script:
    - DOCKER_TAG=`cat ci/docker/Dockerfile.base | sha256sum - | head -c 16`
    - export PERSIST_IMAGE_NAME=$CSCS_REGISTRY_PATH/base/my_base_container:$DOCKER_TAG
    - echo "BASE_IMAGE=$PERSIST_IMAGE_NAME" > build.env
  artifacts:
    reports:
      dotenv: build.env
  variables:
    DOCKERFILE: ci/docker/Dockerfile.base

build software:
  extends: .container-builder-cscs-zen2
  stage: build_software
  variables:
    DOCKERFILE: ci/docker/Dockerfile
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/software/my_software:$CI_COMMIT_SHORT_SHA
    DOCKER_BUILD_ARGS: '["BASE_IMG=$BASE_IMAGE"]'
```

Let us walk through the changes in the `build base` job:
- `DOCKER_TAG` is computed at runtime from the sha256sum of `Dockerfile.base`, i.e. it changes whenever the content of `Dockerfile.base` changes (we keep only the first 16 characters, which is random enough to guarantee a unique name).
- We export `PERSIST_IMAGE_NAME` using the dynamic name that includes `DOCKER_TAG`.
- We write the dynamic name to the file `build.env`
- We tell the CI system to keep the `build.env` as an artifact (see [here](https://docs.gitlab.com/ee/ci/yaml/artifacts_reports.html#artifactsreportsdotenv) the documentation of this)
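The tag computation in the `before_script` can be reproduced locally; the following sketch creates a stand-in `Dockerfile.base` (path and content are illustrative) and derives the same content-addressed tag:

```shell
# Create a stand-in Dockerfile.base for illustration.
mkdir -p ci/docker
printf 'FROM docker.io/ubuntu:24.04\n' > ci/docker/Dockerfile.base

# First 16 hex characters of the sha256 of the file; stable until the file changes.
DOCKER_TAG=`cat ci/docker/Dockerfile.base | sha256sum - | head -c 16`
echo "$DOCKER_TAG"
```

Any edit to `Dockerfile.base` yields a different tag, so a rebuild is triggered exactly when the recipe changes.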
Note: The dotenv artifacts of a specific job for public projects are available at `https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/<project_id>/<pipeline_id>/-/jobs/<job_id>/artifacts/download?file_type=dotenv`.
Now let us look at the changes in the `build software` job:
- `DOCKER_BUILD_ARGS` now uses `$BASE_IMAGE`. This variable exists because the information was transferred from `build base` to this job via a `dotenv` artifact.
In this example the names `BASE_IMG` and `BASE_IMAGE` are deliberately different, to make clear where each variable is set and used. Feel free to use the same name for consistency. The default behaviour is to import all artifacts from all previous jobs; if you want only specific artifacts in your job, have a look at [dependencies](https://docs.gitlab.com/ee/ci/yaml/#dependencies).
There is also a building block in the templates, named `.dynamic-image-name`, which you can use to get rid of most of the boilerplate. It is important to note that this building block exports the dynamic name under the hardcoded name `BASE_IMAGE` in the `dotenv` file. The jobs would look something like this:
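A sketch of the shortened setup; the `WATCH_FILECHANGES` variable is an assumption and should be verified against the definition of `.dynamic-image-name` in the template file:

```yaml
build base:
  extends: [.container-builder-cscs-zen2, .dynamic-image-name]
  stage: build_base
  variables:
    DOCKERFILE: ci/docker/Dockerfile.base
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/base/my_base_container
    WATCH_FILECHANGES: ci/docker/Dockerfile.base

build software:
  extends: .container-builder-cscs-zen2
  stage: build_software
  variables:
    DOCKERFILE: ci/docker/Dockerfile
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/software/my_software:$CI_COMMIT_SHORT_SHA
    DOCKER_BUILD_ARGS: '["BASE_IMG=$BASE_IMAGE"]'
```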
`build base` additionally uses the building block `.dynamic-image-name`, while `build software` is unchanged. Have a look at the definition of the block `.dynamic-image-name` in the file [.ci-ext.yml](https://gitlab.com/cscs-ci/recipes/-/blob/master/templates/v2/.ci-ext.yml) for further notes.
#### Examples
See these two YAML files for working examples (and check the respective Dockerfiles referenced in the build jobs).