Merged
Changes from 3 commits
41 changes: 32 additions & 9 deletions docs/alps/hardware.md
@@ -40,20 +40,41 @@ Alps was installed in phases, starting with the installation of 1024 AMD Rome du

There are currently four node types in Alps, with another becoming available in 2025:

| type | blades | nodes | CPU sockets | GPU devices |
| ---- | ------:| -----:| -----------:| -----------:|
| NVIDIA GH200 | 1344 | 2688 | 10,752 | 10,752 |
| AMD Rome | 256 | 1024 | 2,048 | -- |
| NVIDIA A100 | 72 | 144 | 144 | 576 |
| AMD MI250x | 12 | 24 | 24 | 96 |
| AMD MI300A | 64 | 128 | 512 | 512 |
| type | abbreviation | blades | nodes | CPU sockets | GPU devices |
| ---- | ------- | ------:| -----:| -----------:| -----------:|
| NVIDIA GH200 | gh200 | 1344 | 2688 | 10,752 | 10,752 |
| AMD Rome | zen2 | 256 | 1024 | 2,048 | -- |
| NVIDIA A100 | a100 | 72 | 144 | 144 | 576 |
| AMD MI250x | mi200 | 12 | 24 | 24 | 96 |
| AMD MI300A | mi300 | 64 | 128 | 512 | 512 |

[](){#ref-alps-gh200-node}
### NVIDIA GH200 GPU Nodes

!!! todo
!!! under-construction
The description of the GH200 nodes is incomplete.

Let us know if there is missing information.

There are 24 cabinets, in 4 rows with 6 cabinets per row:

* 8 chassis per cabinet
* 7 blades per chassis
* a chassis can contain up to 8 blades, but Alps' gh200 chassis are underpopulated so that more power can be delivered to each node.
* 2 nodes per blade

Blanca Peak
Each node contains four Grace-Hopper modules and four corresponding network interface cards (NICs), as illustrated below:

![](../images/alps/gh200-schematic.svg)
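
As a quick cross-check, the layout figures above can be multiplied out to recover the GH200 row of the node table. This is only an illustrative sketch: the constant names are made up for this example, and the values are the ones quoted in this section.

```python
# Layout figures quoted in this section (names are illustrative only).
CABINETS = 24               # 4 rows with 6 cabinets per row
CHASSIS_PER_CABINET = 8
BLADES_PER_CHASSIS = 7      # a chassis could hold 8, but is underpopulated
NODES_PER_BLADE = 2
MODULES_PER_NODE = 4        # Grace-Hopper modules per node

blades = CABINETS * CHASSIS_PER_CABINET * BLADES_PER_CHASSIS
nodes = blades * NODES_PER_BLADE
modules = nodes * MODULES_PER_NODE

print(blades, nodes, modules)  # 1344 2688 10752
```

These values match the 1344 blades, 2688 nodes and 10,752 GPU devices listed for `gh200` in the table.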

??? info "node xnames"
There are two boards per blade with one node per board.
This differs from the `zen2` CPU-only nodes (used for example in Eiger), which have two nodes per board for a total of four nodes per blade.
As such, there are no `n1` nodes in the xname list, e.g.:
```
x1100c0s6b0n0
x1100c0s6b1n0
```
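
The xname structure can be illustrated with a small parser. This is a sketch, not an official tool: `parse_xname` is a hypothetical helper, and the field names (`cabinet`, `chassis`, `slot`, `board`, `node`) are an assumed reading of the `x`/`c`/`s`/`b`/`n` letters.

```python
import re

# Assumed field meaning for each letter in a node xname (not confirmed here).
FIELDS = ("cabinet", "chassis", "slot", "board", "node")
XNAME_RE = re.compile(r"x(\d+)c(\d+)s(\d+)b(\d+)n(\d+)")


def parse_xname(xname: str) -> dict[str, int]:
    """Split a node xname such as 'x1100c0s6b0n0' into integer fields."""
    match = XNAME_RE.fullmatch(xname)
    if match is None:
        raise ValueError(f"not a node xname: {xname!r}")
    return dict(zip(FIELDS, map(int, match.groups())))


# The two xnames from the example above: one node per board, so node is always 0.
parsed = [parse_xname(name) for name in ("x1100c0s6b0n0", "x1100c0s6b1n0")]
for fields in parsed:
    print(fields)
```

For the two GH200 xnames shown, only the `board` field differs (0 and 1), while `node` stays at 0.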

[](){#ref-alps-zen2-node}
### AMD Rome CPU Nodes
@@ -79,6 +100,8 @@ Bard Peak
[](){#ref-alps-mi300-node}
### AMD MI300A GPU Nodes

![](../images/alps/mi300-schematic.svg)

!!! todo

Parry Peak
60 changes: 60 additions & 0 deletions docs/contributing/index.md
@@ -229,6 +229,66 @@ They stand out better from the main text, and can be collapsed by default if nee

If an admonition is collapsed by default, it should have a title.

We provide some custom admonitions.

#### Change

For adding information about a change, originally designed for recording updates to clusters.

=== "Rendered"
!!! change "2025-04-17"
* SLURM was upgraded to version 25.1.
* uenv was upgraded to v0.8

Old changes can be folded:

??? change "2025-02-04"
* The new Scratch cleanup policy was implemented
* NVIDIA driver was updated

=== "Markdown"
```
!!! change "2025-04-17"
* SLURM was upgraded to version 25.1.
* uenv was upgraded to v0.8
```

Old changes can be folded:

```
??? change "2025-02-04"
* The new Scratch cleanup policy was implemented
* NVIDIA driver was updated
```

#### Under construction

For marking incomplete sections.

=== "Rendered"
!!! under-construction
This is not finished yet!

=== "Markdown"
```
!!! under-construction
This is not finished yet!
```

#### Todo

As a placeholder for documentation that needs to be written.

=== "Rendered"
!!! todo
Add some common error messages and how to fix them.

=== "Markdown"
```
!!! todo
Add some common error messages and how to fix them.
```

### Code blocks

Use [code blocks](https://squidfunk.github.io/mkdocs-material/reference/code-blocks/) when you want to display monospace text in a programming language, terminal output, configuration files etc.