Commit 6f2dd34

Merge branch 'main' into kubernetes-docs
2 parents 6c0c893 + d54335f commit 6f2dd34

12 files changed

+300
-30
lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ docs/services/firecrest @jpdorsch @ekouts
44
docs/services/kubernetes @eliaoggian
55
docs/software/communication @Madeeks @msimberg
66
docs/software/devtools/linaro @jgphpc
7+
docs/software/devtools/vihps @jgphpc
78
docs/software/prgenv/linalg.md @finkandreas @msimberg
89
docs/software/sciapps/cp2k.md @abussy @RMeli
910
docs/software/sciapps/lammps.md @nickjbrowning

.github/actions/spelling/allow.txt

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -347,3 +347,8 @@ xname
347347
xpmem
348348
youtube
349349
zstd
350+
HPS
351+
jobscript
352+
Scalasca
353+
tracefile
354+
Vampir

docs/guides/terminal.md

Lines changed: 52 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -8,26 +8,24 @@ This documentation is a collection of guides, hints, and tips for setting up you
88

99
Every user has a shell that will be used when they log in, with [bash](https://www.gnu.org/software/bash/) as the default shell for new users at CSCS.
1010

11-
At CSCS the vast majority of users stick with the default `bash`: at the time of writing, of over 1000 users on Daint, over 99% were using bash.
12-
1311
!!! example "Which shell am I using?"
1412

1513
Run the following command after logging in:
1614

1715
```console
18-
$ getent passwd | grep $USER
19-
bcumming:*:22008:1000:Benjamin Cumming, CSCS:/users/bcumming:/usr/local/bin/bash
16+
$ echo $SHELL
17+
/usr/local/bin/bash
2018
```
2119

22-
The last entry in the output points to the shell of the user, in this case `/usr/local/bin/bash`.
23-
2420
!!! tip
2521
If you would like to change your shell, for example to [zsh](https://www.zsh.org), you have to open a [service desk](https://jira.cscs.ch/plugins/servlet/desk) ticket to request the change. You can't make the change yourself.
2622

2723

2824
!!! warning
29-
Because `bash` is used by all CSCS staff and the overwhelming majority of users, it is the best tested, and safest default.
30-
25+
If you are comfortable with another shell (like Zsh or Fish), you are welcome to switch.
26+
Just keep in mind that some tools and instructions might not work the same way outside of `bash`.
27+
Since our support and documentation are based on the default setup, using a different shell might make it harder to follow along or get help.
28+
3129
We strongly recommend against using cshell - tools like uenv are not tested against it.
3230

3331
[](){#ref-guides-terminal-arch}
@@ -76,3 +74,49 @@ export PATH=$xdgbase/bin:$PATH
7674
!!! note "XDG what?"
7775
The [XDG base directory specification](https://specifications.freedesktop.org/basedir-spec/latest/) is used by most applications to determine where to look for configurations, and where to store data and temporary files.
7876

77+
[](){#ref-guides-terminal-bashrc}
78+
## Modifying bashrc
79+
80+
The `~/.bashrc` in your home directory is executed __every time__ you log in, and there is no way to log in without executing it.
81+
82+
We strongly recommend keeping customization in `~/.bashrc` to the bare minimum:
83+
84+
1. It sets a fixed set of environment options every time you log in, and all downstream scripts and Slurm batch jobs might assume that these commands have run, so that later modifications to `~/.bashrc` can break workflows in ways that are difficult to debug.
85+
* If a script or batch job requires environment modifications, implement them there.
86+
* In other words, move the definition of environment used by a workflow to the workflow definition.
87+
1. It makes it difficult for CSCS to provide support, because it is difficult for support staff to reproduce your environment, and it can take a lot of back and forth before we determine that the root cause of an issue is a command in `~/.bashrc`.
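The recommendation above, moving the environment definition into the workflow itself, can be sketched as a job script that sets what it needs instead of relying on `~/.bashrc`. This is a minimal illustration, not a CSCS template: the function name `setup_workflow_env`, the variable `WORKFLOW_DATA_DIR`, the chosen values, and the commented-out `srun` line are all hypothetical.

```bash
#!/bin/bash
# Hypothetical job script: the environment the workflow needs is defined here,
# so the job does not depend on anything set in ~/.bashrc.

setup_workflow_env() {
    # Illustrative settings only; replace them with what your workflow needs.
    export OMP_NUM_THREADS=4
    export WORKFLOW_DATA_DIR="${HOME}/workflow-data"   # hypothetical variable
}

setup_workflow_env
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
echo "WORKFLOW_DATA_DIR=${WORKFLOW_DATA_DIR}"

# srun ./my_app   # hypothetical executable, launched after the environment is set
```

Keeping these settings in the script means the script documents its own requirements, and later edits to `~/.bashrc` cannot silently change the job's behavior.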
88+
89+
90+
!!! warning "Do not call `module` in bashrc"
91+
Calls to `module use` and `module load` in `~/.bashrc` are possible, but avoid them for the reasons above.
92+
If there are module commands in your `~/.bashrc`, remember to provide a full copy of `~/.bashrc` with support tickets.
93+
94+
!!! danger "Do not call `uenv` in bashrc"
95+
The `uenv` command is designed for creating isolated environments, and calling it in `~/.bashrc` will not work as expected.
96+
See the [uenv docs][ref-uenv-customenv] for more information about how to create bespoke uenv environments that can be started with a single command.
97+
98+
??? note "Help, I broke bashrc!"
99+
It is possible to add commands to bashrc that will stop you from being able to log in.
100+
The author of these docs has done it more than once, after ignoring their own advice.
101+
102+
For example, if the command `exit` is added to `~/.bashrc` you will be logged out every time you log in.
103+
104+
The first thing to try is to execute a command that will back up `~/.bashrc`, and remove `~/.bashrc`:
105+
```bash
106+
ssh eiger.cscs.ch 'bash --norc --noprofile -c "mv ~/.bashrc ~/.bashrc.back"'
107+
```
108+
If this works, you can then log in normally, and edit the backup and copy it back to `~/.bashrc`.
109+
110+
If there is a critical error, like calling `exit`, the approach above won't work.
111+
In such cases, the only solution that doesn't require root permissions is to log in and hit `<ctrl-c>` during the log in.
112+
With luck, this will cancel the login process before `~/.bashrc` is executed, and you will be able to edit and fix `~/.bashrc`.
113+
Note that you might have to try a few times to get the timing right.
114+
115+
If this does not work, create a [service desk ticket][ref-get-in-touch] with the following message:
116+
117+
!!! example "Help request"
118+
My bashrc has been modified, and I can't log in any longer to `insert-system-name`.
119+
My username is `insert-cscs-username`.
120+
Can you please make a backup copy of my bashrc, i.e. `mv ~/.bashrc ~/.bashrc.back`,
121+
so that I can log in and fix the issue.
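One way to lower the risk of locking yourself out is to check an edited file before ending your current session. The sketch below uses `bash -n`, which parses a file without executing it. It is only a partial safeguard: it reports syntax errors such as an unterminated `if`, but it cannot detect a syntactically valid command like `exit`. The function name `check_rc` and the temporary demo file are illustrative assumptions.

```bash
#!/bin/bash
# check_rc FILE: parse FILE with `bash -n` (syntax check only, nothing is executed).
# Returns 0 if the file parses, non-zero otherwise.
check_rc() {
    local rc_file="$1"
    if bash -n "$rc_file" 2>/dev/null; then
        echo "syntax ok: $rc_file"
    else
        echo "syntax error: $rc_file" >&2
        return 1
    fi
}

# Demonstration on a temporary file rather than the real ~/.bashrc.
demo="$(mktemp)"
printf 'export EDITOR=vim\n' > "$demo"
check_rc "$demo"

printf 'if true; then\n' > "$demo"    # unterminated `if`
check_rc "$demo" || echo "fix the file before logging out"
rm -f "$demo"
```

In practice you would run `check_rc ~/.bashrc` after editing, and keep a second terminal logged in until a fresh login has succeeded.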
122+

docs/services/cicd.md

Lines changed: 15 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -216,19 +216,21 @@ It is the same thing.
216216
[](){#ref-cicd-pipeline-triggers-api}
217217
#### API call triggering
218218
- It is possible to trigger a pipeline via an API call
219-
- Create a file named `data.yaml`, with the content
220-
```yaml
221-
ref: main
222-
pipeline: pipeline_name
223-
variables:
224-
MY_VARIABLE: some_value
225-
ANOTHER_VAR: other_value
226-
```
227-
Send a POST request to the middleware
228-
```bash
229-
curl -X POST -u 'repository_id:webhook_secret' --data-binary @data.yaml https://cicd-ext-mw.cscs.ch/ci/pipeline/trigger
230-
```
231-
- replace repository_id and webhook_secret with your repository id and the webhook secret.
219+
- Create a file `data.yaml`, with the content
220+
```yaml title="data.yaml"
221+
ref: main
222+
pipeline: pipeline_name
223+
variables:
224+
MY_VARIABLE: some_value
225+
ANOTHER_VAR: other_value
226+
```
227+
- Send a POST request to the middleware (replace `repository_id` and `webhook_secret`)
228+
```console
229+
$ curl -X POST -u 'repository_id:webhook_secret' --data-binary @data.yaml https://cicd-ext-mw.cscs.ch/ci/pipeline/trigger
230+
```
231+
- To trigger a pipeline for a pull request, use `ref: 'pr:<pr-number>'`
232+
- To trigger a pipeline for a tag, use `ref: 'tag:<tag-name>'`
233+
- To trigger on a specific commit SHA use `ref: 'sha:<commit-sha>'`
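The `ref` variants above can be combined with the payload format shown earlier. The following sketch writes a `data.yaml` for a pull request and prints the request that would send it. The helper `write_trigger_payload` and the pull-request number `42` are hypothetical; the pipeline name, variable, and credentials are the placeholders used in this section.

```bash
#!/bin/bash
# write_trigger_payload REF PIPELINE: print a payload in the format shown above.
# REF is one of main, pr:<pr-number>, tag:<tag-name> or sha:<commit-sha>.
write_trigger_payload() {
    local ref="$1" pipeline="$2"
    printf "ref: '%s'\n" "$ref"
    printf "pipeline: %s\n" "$pipeline"
    printf "variables:\n"
    printf "  MY_VARIABLE: some_value\n"
}

# Payload for (hypothetical) pull request 42.
write_trigger_payload "pr:42" "pipeline_name" > data.yaml
cat data.yaml

# The request is printed rather than executed; replace the placeholders first.
echo "curl -X POST -u 'repository_id:webhook_secret' --data-binary @data.yaml https://cicd-ext-mw.cscs.ch/ci/pipeline/trigger"
```

Quoting the `ref` value keeps the colon inside a single YAML string.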
232234
233235
### Understanding the underlying workflow
234236
Typical users do not need to know the underlying workflow behind the scenes, so you can stop reading here.

docs/software/container-engine/run.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ There are three ways to do so:
1515
$ srun --environment=./.edf/ubuntu.toml echo "Hello" # from ${HOME}.
1616
```
1717

18-
3. **From EDF search paths**: the name of EDF in the [EDF search path][ref-ce-edf-search-path]. `--environment` also accepts the EDF filename without the `.toml` extension:
18+
3. **From EDF search paths**: the name of an EDF in the [EDF search path][ref-ce-edf-search-path]. Note that in this form, `--environment` takes the EDF filename **without** the `.toml` extension.
1919

2020
```console
2121
$ srun --environment=ubuntu echo "Hello"

docs/software/devtools/index.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,6 @@ This ensures that computational resources are utilized to their fullest potentia
2828
Learning to analyze the performance of an application effectively is crucial to building a deeper understanding of how your code interacts with the underlying hardware.
2929
In this section we introduce the various performance analysis solutions available at CSCS.
3030

31-
* [Linaro Forge MAP][ref-devtools-map]
3231
* [NVIDIA Nsight Developer Tools][ref-devtools-nsight]
33-
32+
* [Linaro Forge MAP][ref-devtools-map]
33+
* [VI-HPS Tools][ref-devtools-vihps]

docs/software/devtools/vihps.md

Lines changed: 136 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,136 @@
1+
[](){#ref-devtools-vihps}
2+
# VI-HPS tools
3+
4+
The [VI-HPS](https://www.vi-hps.org/tools) Institute (Virtual Institute for High Productivity Supercomputing) provides tools that help developers of simulation codes analyze the performance of their applications.
5+
6+
## [Score-P](https://www.vi-hps.org/projects/score-p/overview/overview.html)
7+
8+
[Score-P](https://www.vi-hps.org/projects/score-p/overview/overview.html)
9+
is a highly scalable instrumentation and measurement infrastructure for profiling, event tracing, and online analysis. It supports a wide range of HPC platforms and programming models. Score-P provides core measurement services for a range of specialized analysis tools, such as Vampir, Scalasca and others.
10+
11+
## [Vampir](https://www.vi-hps.org/tools/vampir.html)
12+
13+
[Vampir](https://www.vi-hps.org/tools/vampir.html)
14+
is a performance visualizer that lets you quickly study a program's runtime behavior at a fine level of detail. This includes the display of detailed performance event recordings over time in timelines and aggregated profiles. Interactive navigation and zooming are the key features of the tool, which help to quickly identify inefficient or faulty parts of a program.
15+
16+
!!! info
17+
While Score-P does not require a license, [Vampir](https://vampir.eu/licensing) does. The CSCS standard license allows reading trace files with up to 256 concurrent threads of execution.
18+
19+
The Vampir GUI is currently available only on `x86-64` CPU based systems and is not provided via a uenv (more details in the Quickstart guide below).
20+
You can use Score-P to generate OTF2 trace files on Alps compute nodes and then visualize the results with Vampir on an x86-64 CPU based system (for instance Eiger, LUMI or using your own license).
21+
22+
## [Cube and Scalasca](http://www.vi-hps.org/tools/scalasca.html)
23+
24+
[Cube and Scalasca](http://www.vi-hps.org/tools/scalasca.html)
25+
support the performance optimization of parallel programs with a collection of scalable trace-based tools for in-depth analyses of concurrent behavior. The analysis identifies potential performance bottlenecks - in particular those concerning communication and synchronization - and offers guidance in exploring their causes.
26+
27+
## Quickstart guide
28+
29+
The VI-HPS uenv is named `scorep` and it can be loaded into your environment as explained here and in the [uenv documentation][ref-uenv].
30+
31+
!!! example "Finding and pulling available `scorep` versions"
32+
33+
```console
34+
uenv image find scorep
35+
# uenv arch system id size(MB) date
36+
# scorep/9.2-gcc12:v1 gh200 daint bfd3b46d30404f2c 7,602 2025-07-14
37+
# scorep/9.2-gcc13:v1 gh200 daint 3c0357a490c81f32 7,642 2025-07-14
38+
39+
uenv image pull scorep/9.2-gcc13:v1
40+
# pulling 3c0357a490c81f32 100.00%
41+
```
42+
43+
This uenv is configured to be mounted in the `/user-environment` path.
44+
45+
!!! example "Start the `scorep` uenv"
46+
47+
```bash
48+
uenv start scorep/9.2-gcc13:v1 -v default
49+
50+
uenv status # (1)!
51+
scorep --version # 9.2
52+
scalasca --version # 2.6.2
53+
cubelib-config --version # 4.9
54+
otf2-print --version # 3.1.1
55+
56+
find /user-environment/ -name scorep.pdf # (2)!
57+
```
58+
59+
1. Test that everything has been mounted correctly and that the tools are in the PATH
60+
2. A PDF version of the user guide is available in the uenv.
61+
62+
!!! example "Recompile your code with `scorep` on Alps"
63+
64+
```bash
65+
# Building with CMake requires the following steps
66+
## Invoke cmake with the scorep wrapper disabled:
67+
SCOREP_WRAPPER=OFF \
68+
cmake -S src -B build \
69+
-DCMAKE_CXX_COMPILER=scorep-mpic++ \
70+
-DCMAKE_C_COMPILER=scorep-mpicc \
71+
-DCMAKE_CUDA_COMPILER=scorep-nvcc \
72+
-DCMAKE_CUDA_ARCHITECTURES=90 # [...]
73+
74+
## Then build with the scorep wrapper enabled:
75+
SCOREP_WRAPPER=ON \
76+
cmake --build build
77+
```
78+
79+
!!! example "Run your application with `scorep` on Alps"
80+
81+
Pick one of the report types in your jobscript before running the executable compiled with `scorep`:
82+
83+
=== "Profiling"
84+
85+
- Profiling gives an overview of the performance of your simulation on Alps
86+
87+
```bash
88+
export SCOREP_ENABLE_PROFILING=true
89+
# Call-path profiling: CUBE4 data format (profile.cubex)
90+
```
91+
92+
- Then run your job as usual with `srun` or `sbatch` on Alps,
93+
- Copy the generated profile `profile.cubex` to your laptop,
94+
- Install the [Cube](https://www.scalasca.org/scalasca/software) tool on your laptop,
95+
- Analyze the results with the GUI:
96+
- performance metric (left panel)
97+
- call path (middle panel)
98+
- system resource (right panel)
99+
100+
```bash
101+
/Applications/Cube/4.9/Cube.app/Contents/MacOS/maccubegui.sh \
102+
./profile.cubex
103+
```
104+
105+
![sphexa_cube](../../images/devtools/vihps/sphexa_cube.png){ width="90%"}
106+
107+
=== "Tracing"
108+
109+
- Tracing allows a detailed analysis of the performance of your simulation on Alps
110+
111+
```bash
112+
export SCOREP_ENABLE_TRACING=true
113+
# Event-based tracing: OTF2 data format (traces.otf2)
114+
```
115+
116+
- Then run your job as usual with `srun` or `sbatch` on Alps,
117+
- Analyze the results with the GUI:
118+
119+
```bash
120+
ssh -X eiger.cscs.ch # Vampir GUI requires x86_64 ⚠️
121+
/capstor/store/cscs/userlab/vampir/10.6.1/bin/vampir \
122+
./traces.otf2
123+
```
124+
125+
!!! info
126+
- Tracing allows more detailed analysis but will also make your simulation run longer than with profiling,
127+
- `scorep-score` estimates the size of an OTF2 tracefile from a CUBE profile,
128+
it can also help to reduce the overhead of tracing via filtering:
129+
```bash
130+
scorep-score -g profile.cubex # generate filter file
131+
scorep-score -f initial_scorep.filter profile.cubex
132+
export SCOREP_FILTERING_FILE='initial_scorep.filter'
133+
```
134+
- The [user guide](https://perftools.pages.jsc.fz-juelich.de/cicd/scorep/tags/scorep-9.2/html/group__SCOREP__User.html#gaab4b3ccc2b169320c1d3bf7fe19165f9) provides more details about how to reduce overhead.
135+
136+
![sphexa_vampir](../../images/devtools/vihps/sphexa_vampir.png){ width="90%"}
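The choice between the two report types can be wrapped in a small helper for use in a jobscript. This is a sketch under assumptions: the function name `scorep_mode` is hypothetical, and explicitly setting the other variable to `false` is a choice made here to keep the two modes separate, not something the guide above prescribes. Only the two `SCOREP_ENABLE_*` variables shown in the tabs are used.

```bash
#!/bin/bash
# scorep_mode MODE: export the Score-P measurement settings for MODE.
# MODE is "profiling" or "tracing"; anything else is rejected.
scorep_mode() {
    case "$1" in
        profiling)
            export SCOREP_ENABLE_PROFILING=true
            export SCOREP_ENABLE_TRACING=false
            ;;
        tracing)
            export SCOREP_ENABLE_PROFILING=false
            export SCOREP_ENABLE_TRACING=true
            ;;
        *)
            echo "usage: scorep_mode profiling|tracing" >&2
            return 1
            ;;
    esac
}

scorep_mode profiling
echo "profiling=${SCOREP_ENABLE_PROFILING} tracing=${SCOREP_ENABLE_TRACING}"

# srun ./my_app   # hypothetical executable built with the scorep wrappers
```

Starting with a profiling run and switching to tracing afterwards matches the workflow above, where `scorep-score` uses the profile to estimate the trace size.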

docs/software/sciapps/cp2k.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,9 @@ transition state optimization using NEB or dimer method. See [CP2K Features] for
1515
[CP2K] is provided on [ALPS][platforms-on-alps] via [uenv][ref-uenv].
1616
Please have a look at the [uenv documentation][ref-uenv] for more information about uenvs and how to use them.
1717

18+
!!! warning "Known issues"
19+
Please check CP2K's [known issues](#known-issues) and whether they are relevant to your work. They may impact your calculations in subtle ways, potentially leading to a waste of resources.
20+
1821
??? note "Changelog"
1922

2023
??? note "2025.1"
@@ -422,6 +425,20 @@ See [manual.cp2k.org/CMake] for more details.
422425

423426
## Known issues
424427

428+
### Older uenv versions on Eiger
429+
430+
After the migration to Eiger.Alps, calculations relying on older uenv versions (`2024.1:v1`, `2024.2:v1`, `2024.3:v1`) sometimes crash unexpectedly with a segmentation fault. The problem has been identified as coming from the `libxsmm` library, used as a backend in DBCSR. To avoid this issue, it is recommended to upgrade to a newer uenv.
431+
432+
If a specific `2024.x` version of CP2K is required, crashes can be avoided by switching to the `BLAS` backend of DBCSR. This can be done by adding the following to the `&GLOBAL` section of the input file:
433+
434+
```bash
435+
&GLOBAL
436+
&DBCSR
437+
MM_DRIVER BLAS
438+
&END DBCSR
439+
&END GLOBAL
440+
```
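If you maintain several input files, a quick text search can show which ones already select the `BLAS` driver. The sketch below is an assumption-laden convenience, not a CP2K tool: the function name `has_blas_driver` and the demo file are hypothetical, and the check is a plain pattern match that does not parse the input or verify that the keyword sits inside `&GLOBAL`/`&DBCSR`.

```bash
#!/bin/bash
# has_blas_driver FILE: succeed if FILE contains a line setting MM_DRIVER to BLAS.
# The match is case-insensitive and ignores leading whitespace.
has_blas_driver() {
    grep -Eiq '^[[:space:]]*MM_DRIVER[[:space:]]+BLAS' "$1"
}

# Demonstration on a temporary file containing the fragment shown above.
demo="$(mktemp)"
printf '&GLOBAL\n  &DBCSR\n    MM_DRIVER BLAS\n  &END DBCSR\n&END GLOBAL\n' > "$demo"

if has_blas_driver "$demo"; then
    echo "BLAS backend selected"
else
    echo "BLAS backend not selected"
fi
rm -f "$demo"
```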
441+
425442
### DLA-Future
426443

427444
The `cp2k/2025.1:v2` uenv provides CP2K with [DLA-Future] support enabled, in the `cp2k-dlaf` view.
