From 8add4cbf55079e6bb6038af54c91c8029e464925 Mon Sep 17 00:00:00 2001
From: bcumming
Date: Tue, 1 Jul 2025 08:44:58 +0200
Subject: [PATCH 1/3] link eiger docs to slurm docs

---
 docs/clusters/eiger.md | 21 +++++++--------------
 docs/running/slurm.md  |  3 ++-
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/docs/clusters/eiger.md b/docs/clusters/eiger.md
index ff7fd777..7121ee6c 100644
--- a/docs/clusters/eiger.md
+++ b/docs/clusters/eiger.md
@@ -3,11 +3,10 @@
 Eiger is an Alps cluster that provides compute nodes and file systems designed to meet the needs of CPU-only workloads for the [HPC Platform][ref-platform-hpcp].
 
-!!! under-construction
-    This documentation is for the updated cluster `Eiger.Alps` reachable at `eiger.alps.cscs.ch`, that has replaced the former cluster as of June 30 2025.
-    The previous [Eiger User Guide](https://confluence.cscs.ch/spaces/KB/pages/284426490/Alps+Eiger+User+Guide) is still available on the legacy Knowledge Base.
+!!! note
+    This documentation is for the updated cluster `Eiger.Alps`, reachable at `eiger.alps.cscs.ch`, which replaced the former cluster as of July 1, 2025.
 
-!!! change "Important changes"
+??? change "Important changes from Eiger"
     The redeployment of `eiger.cscs.ch` as `eiger.alps.cscs.ch` has introduced changes that may affect some users.
 
     ### Breaking changes
 
@@ -42,16 +41,10 @@ Eiger is an Alps cluster that provides compute nodes and file systems designed t
 
 ### Compute nodes
 
-Eiger consists of multicore [AMD Epyc Rome][ref-alps-zen2-node] compute nodes: please note that the total number of available compute nodes on the system might vary over time, therefore you might want to check them with the Slurm command `sinfo -s`.
-```
-PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
-debug up 30:00 0/12/0/12 nid[002236-002247]
-xfer up 1-00:00:00 0/4/0/4 nid[002232-002235]
-prepost up 30:00 0/560/0/560 nid[001000-001023,001028-001031,001064-001127,001160-001191,001256-001267,001272-001287,001320-001447,001504-001539,001541-001543,001573-001599,001640-001767,001797-001799,001829-001831,002152-002231]
-normal* up 1-00:00:00 0/560/0/560 nid[001000-001023,001028-001031,001064-001127,001160-001191,001256-001267,001272-001287,001320-001447,001504-001539,001541-001543,001573-001599,001640-001767,001797-001799,001829-001831,002152-002231]
-low up 1-00:00:00 0/560/0/560 nid[001000-001023,001028-001031,001064-001127,001160-001191,001256-001267,001272-001287,001320-001447,001504-001539,001541-001543,001573-001599,001640-001767,001797-001799,001829-001831,002152-002231]
-```
-Additionally, there are four login nodes with hostnames `eiger-ln00[1-4]`: .
+Eiger consists of multicore [AMD Epyc Rome][ref-alps-zen2-node] compute nodes. Note that the total number of available compute nodes on the system may vary over time.
+See the [Slurm documentation][ref-slurm-partitions-nodecount] for information on how to check the number of nodes.
+
+Additionally, there are four login nodes with hostnames `eiger-ln00[1-4]`.
 
 ### Storage and file systems
 
diff --git a/docs/running/slurm.md b/docs/running/slurm.md
index 2aef9c37..98b03b09 100644
--- a/docs/running/slurm.md
+++ b/docs/running/slurm.md
@@ -43,7 +43,7 @@
 $ sbatch --account=g123 ./job.sh
 ```
 
 !!! note
-    The flags `--account` and `-Cmc` that were required on the old Eiger cluster are no longer required.
+    The flags `--account` and `-Cmc`, which were required on the old [Eiger][ref-cluster-eiger] cluster, are no longer needed.
 
 ## Prioritization and scheduling
 
@@ -66,6 +66,7 @@ Each type of node has different resource constraints and capabilities, which Slu
 For example, CPU-only nodes may have configurations optimized for multi-threaded CPU workloads, while GPU nodes require additional parameters to allocate GPU resources efficiently.
 Slurm ensures that user jobs request and receive the appropriate resources while preventing conflicts or inefficient utilization.
 
+[](){#ref-slurm-partitions-nodecount}
 !!! example "How to check the partitions and number of nodes therein?"
     You can check the size of the system by running the following command in the terminal:
     ```console

From 5efe381805bda81983cefa03957970392c22dfa9 Mon Sep 17 00:00:00 2001
From: lucamar
Date: Tue, 1 Jul 2025 09:14:50 +0200
Subject: [PATCH 2/3] Update eiger.md

Fix information on FirecREST

---
 docs/clusters/eiger.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/clusters/eiger.md b/docs/clusters/eiger.md
index 7121ee6c..5cbe8195 100644
--- a/docs/clusters/eiger.md
+++ b/docs/clusters/eiger.md
@@ -28,10 +28,10 @@ Eiger is an Alps cluster that provides compute nodes and file systems designed t
 
     ### Unimplemented features
 
-    !!! under-construction "Jupyter and FirecREST is not yet available"
-        [Jupyter and FirecREST][ref-firecrest] have not been configured on `Eiger.Alps`.
+    !!! under-construction "Jupyter is not yet available"
+        [Jupyter][ref-jupyter] has not yet been configured on `Eiger.Alps`.
 
-        **They will be deployed as soon as possible and this documentation will be updated accordingly**
+        **It will be deployed as soon as possible, and this documentation will be updated accordingly.**
 
     ### Minor changes
 
@@ -161,9 +161,9 @@ See the Slurm documentation for instructions on how to run jobs on the [AMD CPU
 
 ### Jupyter and FirecREST
 
-!!! under-construction "FirecREST is not yet available"
+!!! under-construction "Jupyter is not yet available"
-    [Jupyter and FirecREST][ref-firecrest] have not been configured on `Eiger.Alps`.
+    [Jupyter][ref-jupyter] has not yet been configured on `Eiger.Alps`.
 
-    **They will be deployed as soon as possible and this documentation will be updated accordingly**
+    **It will be deployed as soon as possible, and this documentation will be updated accordingly.**
 
 ## Maintenance and status

From d64d925d63b155e04839a308e13dbbeab142371d Mon Sep 17 00:00:00 2001
From: bcumming
Date: Tue, 1 Jul 2025 09:33:11 +0200
Subject: [PATCH 3/3] link daint docs to slurm docs

---
 docs/clusters/daint.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/clusters/daint.md b/docs/clusters/daint.md
index 26aae50a..d7b84541 100644
--- a/docs/clusters/daint.md
+++ b/docs/clusters/daint.md
@@ -10,6 +10,7 @@ Daint is the main [HPC Platform][ref-platform-hpcp] cluster that provides comput
 
 Daint consists of around 800-1000 [Grace-Hopper nodes][ref-alps-gh200-node].
 The number of nodes can vary as nodes are added or removed from other clusters on Alps.
+See the [Slurm documentation][ref-slurm-partitions-nodecount] for information on how to check the number of nodes.
 
 There are four login nodes, `daint-ln00[1-4]`.
 You will be assigned to one of the four login nodes when you ssh onto the system, from where you can edit files, compile applications and launch batch jobs.
@@ -112,8 +113,6 @@ There are four [Slurm partitions][ref-slurm-partitions] on the system:
 * the `xfer` partition is for [internal data transfer][ref-data-xfer-internal].
 * the `low` partition is a low-priority partition, which may be enabled for specific projects at specific times.
 
-
-
 | name | nodes | max nodes per job | time limit |
 | -- | -- | -- | -- |
 | `normal` | unlim | - | 24 hours |
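
---

Note for reviewers, not part of the patches: the series replaces the hard-coded `sinfo -s` listing with a cross-reference, and the linked docs still rely on readers decoding the `NODES(A/I/O/T)` column, which reports Allocated/Idle/Other/Total node counts per partition. A minimal sketch of pulling the per-partition totals out of that column — the sample rows are copied from the listing removed in patch 1/3, and on a live cluster you would pipe in `sinfo -s` instead of `printf`:

```shell
# Extract the total node count (the T in A/I/O/T) for each partition.
# Sample input mirrors the old Eiger `sinfo -s` output; replace the
# printf with a real `sinfo -s` invocation on a cluster.
printf '%s\n' \
  'PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST' \
  'debug up 30:00 0/12/0/12 nid[002236-002247]' \
  'xfer up 1-00:00:00 0/4/0/4 nid[002232-002235]' |
awk 'NR > 1 { split($4, n, "/"); printf "%s: %s nodes total\n", $1, n[4] }'
# prints:
# debug: 12 nodes total
# xfer: 4 nodes total
```

The same one-liner works unchanged against any partition set, so it stays valid even as nodes are moved between Alps clusters.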