Commit b489566

Merge branch 'main' into expand-communication
2 parents: 4b9a49c + dbb877a

File tree

18 files changed: +290 / -188 lines changed


.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions

@@ -1,5 +1,6 @@
 * @bcumming @msimberg @RMeli
 docs/services/firecrest @jpdorsch @ekouts
 docs/software/communication @Madeeks @msimberg
+docs/software/devtools/linaro @jgphpc
 docs/software/prgenv/linalg.md @finkandreas @msimberg
 docs/software/sciapps/cp2k.md @abussy @RMeli

docs/access/mfa.md

Lines changed: 2 additions & 2 deletions

@@ -50,8 +50,8 @@ Before starting, ensure that the following pre-requisites are satisfied
 
 !!! warning
     If you try to SSH to CSCS systems without setting up MFA, you will receive a permission denied error, for example:
-    ```
-    > ssh ela.cscs.ch
+    ```console
+    $ ssh ela.cscs.ch
     [email protected]: Permission denied (publickey).
     Connection closed by UNKNOWN port 65535
     ```

docs/clusters/clariden.md

Lines changed: 2 additions & 2 deletions

@@ -84,8 +84,8 @@ See the SLURM documentation for instructions on how to run jobs on the [Grace-Ho
 
 ??? example "how to check the number of nodes on the system"
     You can check the size of the system by running the following command in the terminal:
-    ```terminal
-    > sinfo --format "| %20R | %10D | %10s | %10l | %10A |"
+    ```console
+    $ sinfo --format "| %20R | %10D | %10s | %10l | %10A |"
     | PARTITION | NODES | JOB_SIZE | TIMELIMIT | NODES(A/I) |
     | debug | 32 | 1-2 | 30:00 | 3/29 |
     | normal | 1266 | 1-infinite | 1-00:00:00 | 812/371 |
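As an aside, the `sinfo` table shown in this hunk is plain pipe-separated text, so it is easy to post-process. The sketch below (the `parse_sinfo` helper and the embedded sample are ours, not part of the CSCS docs) parses it and totals the node counts:

```python
# Sample output of `sinfo --format "| %20R | %10D | %10s | %10l | %10A |"`,
# copied from the hunk above (column padding collapsed).
sample = """\
| PARTITION | NODES | JOB_SIZE | TIMELIMIT | NODES(A/I) |
| debug | 32 | 1-2 | 30:00 | 3/29 |
| normal | 1266 | 1-infinite | 1-00:00:00 | 812/371 |
"""

def parse_sinfo(table: str) -> list[dict]:
    """Split the pipe-separated sinfo table into one dict per partition."""
    rows = [[cell.strip() for cell in line.strip().strip("|").split("|")]
            for line in table.strip().splitlines()]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]

partitions = parse_sinfo(sample)
print(sum(int(p["NODES"]) for p in partitions))  # 1298
```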

docs/clusters/santis.md

Lines changed: 2 additions & 2 deletions

@@ -98,8 +98,8 @@ See the SLURM documentation for instructions on how to run jobs on the [Grace-Ho
 
 ??? example "how to check the number of nodes on the system"
     You can check the size of the system by running the following command in the terminal:
-    ```terminal
-    > sinfo --format "| %20R | %10D | %10s | %10l | %10A |"
+    ```console
+    $ sinfo --format "| %20R | %10D | %10s | %10l | %10A |"
     | PARTITION | NODES | JOB_SIZE | TIMELIMIT | NODES(A/I) |
     | debug | 32 | 1-2 | 30:00 | 3/29 |
     | normal | 1266 | 1-infinite | 1-00:00:00 | 812/371 |

docs/contributing/index.md

Lines changed: 61 additions & 8 deletions

@@ -13,20 +13,20 @@ We use the GitHub fork and pull request model for development:
 Clone your fork repository on your PC/laptop:
 ```bash
 # clone your fork of the repository
-> git clone git@github.com:${githubusername}/cscs-docs.git
-> cd cscs-docs
-> git switch -c 'fix/ssh-alias'
+git clone git@github.com:${githubusername}/cscs-docs.git
+cd cscs-docs
+git switch -c 'fix/ssh-alias'
 # ... make your edits ...
 # add and commit your changes
-> git add <files>
-> git commit -m 'update the ssh docs with aliases for all user lab vclusters'
-> git push origin 'fix/ssh-alias'
+git add <files>
+git commit -m 'update the ssh docs with aliases for all user lab vclusters'
+git push origin 'fix/ssh-alias'
 ```
 Then navigate to GitHub, and create a pull request.
 
 The `serve` script in the root path of the repository can be used to view the docs locally:
-```
-> ./serve
+```bash
+./serve
 ...
 INFO - [08:33:34] Serving on http://127.0.0.1:8000/
 ```

@@ -228,3 +228,56 @@ They stand out better from the main text, and can be collapsed by default if nee
     This note is collapsed, because it uses `???`.
 
 If an admonition is collapsed by default, it should have a title.
+
+### Code blocks
+
+Use [code blocks](https://squidfunk.github.io/mkdocs-material/reference/code-blocks/) when you want to display monospace text in a programming language, terminal output, configuration files etc.
+The documentation uses [pygments](https://pygments.org) for highlighting.
+See the [list of available lexers](https://pygments.org/docs/lexers/#) for the languages that you can use for code blocks.
+
+Use [`console`](https://pygments.org/docs/lexers/#pygments.lexers.shell.BashSessionLexer) for interactive sessions with prompt-output pairs:
+
+=== "Markdown"
+
+    ````markdown
+    ```console title="Hello, world!"
+    $ echo "Hello, world!"
+    Hello, world!
+    ```
+    ````
+
+=== "Rendered"
+
+    ```console title="Hello, world!"
+    $ echo "Hello, world!"
+    Hello, world!
+    ```
+
+!!! warning
+    `terminal` is not a valid lexer, but MkDocs or pygments will not warn about using it as a language.
+    The text will be rendered without highlighting.
+
+!!! warning
+    Use `$` as the prompt character, optionally preceded by text.
+    `>` as the prompt character will not be highlighted correctly.
+
+Note the use of `title=...`, which will give the code block a heading.
+
+!!! tip
+    Include a title whenever possible to describe what the code block does or is.
+
+If you want to display commands without output that can easily be copied, use `bash` as the language:
+
+=== "Markdown"
+
+    ````markdown
+    ```bash title="Hello, world!"
+    echo "Hello, world!"
+    ```
+    ````
+
+=== "Rendered"
+
+    ```bash title="Hello, world!"
+    echo "Hello, world!"
+    ```

docs/guides/gb2025.md

Lines changed: 24 additions & 29 deletions

@@ -1,7 +1,7 @@
 [](){#ref-gb2025}
 # Gordon Bell and HPL runs 2025
 
-For Gordon Bell and HPL runs in March-April 2025, CSCS has created a reservation on Santis with 1333 nodes (12 cabinets).
+For Gordon Bell and HPL runs in March-April 2025, CSCS has expanded Santis to 1333 nodes (12 cabinets).
 
 For the runs, CSCS has applied some updates and changes that aim to improve performance and scaling, particularly for NCCL.
 If you are already familiar with running on Daint, you might have to make some small changes to your current job scripts and parameters, which will be documented here.

@@ -27,6 +27,18 @@ Host santis
 
 The `normal` partition is used with no reservation, which means that jobs can be submitted without `--partition` and `--reservation` flags.
 
+Timeline:
+
+1. Friday 4th April:
+    * HPE finish HPL runs at 10:30am
+    * CSCS performs testing on the reconfigured system for ~1 hour on the `GB_TESTING_2` reservation
+    * The reservation is removed and all GB teams have access to test and tune applications.
+2. Monday 7th April:
+    * at 4pm the runs will start for the first team
+
+!!! note
+    There will be no special reservation during the open testing and tuning between Friday and Monday.
+
 ### Storage
 
 Your data sets from Daint are available on Santis

@@ -37,51 +49,34 @@ Your data sets from Daint are available on Santis
 
 ## Low Noise Mode
 
-Low noise mode (LNM) is now enabled.
-This confines system processes and operations to the first core of each of the four NUMA regions in a node (i.e., cores 0, 72, 144, 216).
+!!! note
+    Low noise mode has been relaxed, so the previous requirement that you set `OMP_PLACES` and `OMP_PROC_BIND` no longer applies.
+    One core per module is still reserved for system processes.
+
+Santis uses low noise mode, which reserves one core per Grace-Hopper module (i.e. per 72 cores) for system processes.
+This mode is intended to reduce performance variability caused by system processes interfering with application threads and processes.
+This means that SLURM job scripts must be updated to account for the reserved cores.
 
-The consequence of this setting is that only 71 cores per socket can be requested by an application (for a total of 284 cores instead of 288 cores per node).
+### SLURM
 
-!!! warning "Unable to allocate resources: Requested node configuration is not availabl"
+!!! warning "Unable to allocate resources: Requested node configuration is not available"
     If you try to use all 72 cores on each socket, SLURM will give a hard error, because only 71 are available:
 
-    ```
+    ```console
     # try to run 4 ranks per node, with 72 cores each
-    > srun -n4 -N1 -c72 --reservation=reshuffling ./build/affinity.mpi
+    $ srun -n4 -N1 -c72 ./build/affinity.mpi
     srun: error: Unable to allocate resources: Requested node configuration is not available
     ```
 
-One consequence of this change is that thread affinity and OpenMP settings that worked on Daint might cause large slowdown in the new configuration.
-
-### SLURM
-
 Explicitly set the number of cores per task using the `--cpus-per-task/-c` flag, e.g.:
 ```
 #SBATCH --cpus-per-task=64
-#SBATCH --cpus-per-task=71
 ```
 or
 ```
 srun -N1 -n4 -c71 ...
 ```
 
-**Do not** use the `--cpu-bind` flag to control affinity
-
-* this can cause large slowdown, particularly with `--cpu-bind=socket`. We are investigating how to fix this.
-
-If you see significant slowdown and you want to report it, please provide the output of using the `--cpu-bind=verbose` flag.
-
-### OpenMP
-
-If your application uses OpenMP, try setting the following in your job script:
-
-```bash
-export OMP_PLACES=cores
-export OMP_PROC_BIND=close
-```
-
-Without these settings, we have observed application slowdown due to poor thread placement.
-
 ## NCCL
 
 !!! todo
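The low-noise-mode core accounting in the hunk above (72 cores per Grace-Hopper module, one reserved for system processes, four modules per node) can be checked with a few lines; the variable names below are ours, for illustration only:

```python
# Low noise mode on a quad-module Grace-Hopper node:
# each 72-core module loses one core to system processes.
modules_per_node = 4
cores_per_module = 72
reserved_per_module = 1

usable_per_module = cores_per_module - reserved_per_module
usable_per_node = modules_per_node * usable_per_module
print(usable_per_module, usable_per_node)  # 71 284
```

This is why the diff replaces `-c72` (a hard Slurm error) with `-c71`, and why a full node offers 284 rather than 288 application cores.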

docs/guides/terminal.md

Lines changed: 1 addition & 1 deletion

@@ -14,7 +14,7 @@ At CSCS the vast majority of users stick with the default `bash`: at the time of
 
 Run the following command after logging in:
 
-```terminal
+```console
 $ getent passwd | grep $USER
 bcumming:*:22008:1000:Benjamin Cumming, CSCS:/users/bcumming:/usr/local/bin/bash
 ```
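The `getent passwd` line shown in this hunk is a standard colon-separated passwd entry, with the login shell in the last field. A quick sketch of pulling the fields apart (the entry string is copied from the hunk; the variable names are ours):

```python
# A passwd entry as printed by `getent passwd`, copied from the hunk above.
entry = "bcumming:*:22008:1000:Benjamin Cumming, CSCS:/users/bcumming:/usr/local/bin/bash"

# passwd format: name:password:UID:GID:GECOS:home:shell
user, _, uid, gid, gecos, home, shell = entry.split(":")
print(shell)  # /usr/local/bin/bash
```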

docs/software/container-engine.md

Lines changed: 6 additions & 6 deletions

@@ -50,13 +50,13 @@ Since the `ubuntu.toml` file is located in the [EDF search path][ref-ce-edf-sear
 The above terminal snippet demonstrates how to launch a containerized environment using Slurm with the `--environment` option.
 Click on the :fontawesome-solid-circle-plus: icon for information on each command.
 
-```bash
-daint-ln002 > srun --environment=ubuntu --pty bash # (1)
+```console
+[daint-ln002]$ srun --environment=ubuntu --pty bash # (1)
 
-nid005333 > pwd # (2)
+[nid005333]$ pwd # (2)
 /capstor/scratch/cscs/<username>
 
-nid005333 > cat /etc/os-release # (3)
+[nid005333]$ cat /etc/os-release # (3)
 PRETTY_NAME="Ubuntu 24.04 LTS"
 NAME="Ubuntu"
 VERSION_ID="24.04"

@@ -71,8 +71,8 @@ Since the `ubuntu.toml` file is located in the [EDF search path][ref-ce-edf-sear
 UBUNTU_CODENAME=noble
 LOGO=ubuntu-logo
 
-nid005333 > exit # (4)
-daint-ln002 >
+[nid005333]$ exit # (4)
+[daint-ln002]$
 ```
 
 1. Starting an interactive shell session within the Ubuntu 24.04 container deployed on a compute node using `srun --environment=ubuntu --pty bash`.

docs/software/devtools/index.md

Lines changed: 4 additions & 4 deletions

@@ -1,12 +1,12 @@
 [](){#ref-software-devtools}
 # Debugging and Performance Analysis tools
 
-Debugging and Performance Analysis tools can assist users in developing and optimizing scientific parallel applications, especially in a high-performance computing (HPC) environment.
-Efficient tools can significantly improve workflows and save valuable computational resources.
+Debugging and performance analysis tools can assist users in developing and optimizing scientific parallel applications, especially in a high-performance computing (HPC) environment.
+These tools can significantly improve workflows and save valuable computational resources.
 
-CSCS provides debuggers and performance analysis tools on Alps Clusters.
+CSCS provides debuggers and performance analysis tools on [Alps][ref-alps] clusters.
 
-!!! note "get in touch"
+!!! note "Get in touch"
     If you have issues or questions about debugging or performance analysis tools, please do not hesitate to [contact us][ref-get-in-touch].
 
 [](){#ref-devtools-debug}

docs/software/devtools/linaro-ddt.md

Lines changed: 24 additions & 18 deletions

@@ -2,12 +2,12 @@
 # Linaro DDT
 
 DDT allows source-level debugging of Fortran, C, C++ and Python codes.
-It can be used for debugging serial, multi-threaded (OpenMP), multi-process (MPI) and accelerated (CUDA, OpenACC) programs running on research and production systems, including the CSCS Alps system.
+It can be used for debugging serial, multi-threaded (OpenMP), multi-process (MPI), and accelerated (CUDA, OpenACC) programs running on research and production systems, including the CSCS [Alps][ref-alps] system.
 DDT can be executed either with its graphical user interface or from the command-line.
 
 !!! note
-    Linaro DDT is provided in the `linaro-forge` uenv.
-    Before using DDT, please read the [`linaro-forge` documentation][ref-uenv-linaro], which explains how to download and set up the latest version and set it up.
+    Linaro DDT is provided in the `linaro-forge` [uenv][ref-uenv].
+    Before using DDT, please read the [`linaro-forge` uenv documentation][ref-uenv-linaro], which explains how to download and set up the latest version.
 
 ## User guide
 

@@ -18,7 +18,7 @@ The following guide will walk through the steps required to build and debug an a
 Once the uenv is loaded and activated, the program to debug must be compiled with the `-g` (for CPU) and `-G` (for GPU) debugging flags.
 For example, we can build a CUDA test with a user environment:
 
-```terminal
+```bash
 uenv start prgenv-gnu:24.11:v1 --view=default
 nvcc -c -arch=sm_90 -g -G test_gpu.cu
 mpicxx -g test_cpu.cpp test_gpu.o -o myexe

@@ -27,29 +27,33 @@ mpicxx -g test_cpu.cpp test_gpu.o -o myexe
 ### Launch Linaro DDT
 
 To use the DDT client with uenv, it must be launched in `Manual Launch` mode
-(assuming that it is connected to Alps via `Remote Launch`):
+(assuming that it is connected to [Alps][ref-alps] via `Remote Launch`):
 
-=== "on local machine"
+=== "On local machine"
 
     Start DDT, and connect to the target cluster using the drop down menu for `Remote Launch`.
+    If you don't have a target cluster,
+    the [`linaro-forge` uenv documentation][ref-uenv-linaro] explains how to set up the connection the first time.
 
-    Click on `Manual launch`, set the number of processes to listen to, then wait for the slurm job to start (see the "on Alps" tab).
+    Click on `Manual launch`, set the number of processes to listen to, then wait for the Slurm job to start
+    (see the "on Alps" tab for how to start the Slurm job).
 
     <img src="https://raw.githubusercontent.com/jgphpc/cornerstone-octree/ddt/scripts/img/ddt/0.png" width="600" />
 
-=== "on Alps"
+=== "On Alps"
 
     Log into the system and launch with the `srun` command:
 
-    ```terminal
-    # start a session with both the PE used to build your application
-    # and the linaro-forge uenv mounted
-    > uenv start prgenv-gnu/24.11:v1,linaro-forge/24.1.1:v1 --view=prgenv-gnu:default
-    > source /user-tools/activate
-
-    > srun -N1 -n4 -t15 -pdebug ./cuda_visible_devices.sh ddt-client ./myexe
+    ```console
+    $ uenv start prgenv-gnu/24.11:v1,linaro-forge/24.1.1:v1 --view=prgenv-gnu:default # (1)!
+    $ source /user-tools/activate
+    $ srun -N1 -n4 -t15 -pdebug ./cuda_visible_devices.sh ddt-client ./myexe
     ```
 
+    1. Start a session with both the uenv used to build your application and the `linaro-forge` uenv mounted.
+
 ### Start debugging
 
 By default, DDT will pause execution on the call to `MPI_Init`:

@@ -65,14 +69,16 @@ There are two mechanisms for controlling program execution:
 
 === "Stop at"
 
-    Execution can be paused in every CUDA kernel launch by activating the default breakpoints from the Control menu:
+    Execution can be paused in every CUDA kernel launch by activating the default breakpoints from the `Control` menu:
 
     <img src="https://raw.githubusercontent.com/jgphpc/cornerstone-octree/ddt/scripts/img/ddt/4.png" width="400" />
 
-    This screenshot shows a debugging session on 128 gpus:
+    ??? example "Debugging with 128 GPUs"
+
+        This screenshot shows a debugging session on 128 GPUs:
 
-    ![DDTgpus](https://raw.githubusercontent.com/jgphpc/cornerstone-octree/ddt/scripts/img/ddt/5.png)
+        ![DDTgpus](https://raw.githubusercontent.com/jgphpc/cornerstone-octree/ddt/scripts/img/ddt/5.png)
 
 More information regarding how to use Linaro DDT is provided in the Forge [User Guide](https://docs.linaroforge.com/latest/html/forge/index.html).
