docs/alps/hardware.md: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ There are 24 cabinets, in 4 rows with 6 cabinets per row, and each cabinet conta
 !!! info "Why 7 blades per chassis?"
     A chassis can contain up to 8 blades, however Alps' gh200 chassis are underpopulated so that we can increase the amount of power delivered to each GPU.

-Each node contains four Grace-Hopper modules and four corresponding network interface cards (NICS) per blade, as illustrated below:
+Each node contains four Grace-Hopper modules and four corresponding network interface cards (NICs) per blade, as illustrated below:
docs/guides/mlp_tutorials/llm-inference.md: 7 additions & 7 deletions
@@ -12,7 +12,7 @@ The model we will be running is Google's [Gemma-7B](https://huggingface.co/googl

 ## Gemma-7B Inference using NGC PyTorch

-### Prequisites
+### Prerequisites

 This tutorial assumes you are able to access the cluster via SSH. To set up access to CSCS systems, follow the guide [here][ref-ssh], and read through the documentation about the [ML Platform][ref-platform-mlp].
where you should replace `<ACCOUNT>` with your project account ID.
 At this point, you can exit the Slurm allocation by typing `exit`.
-You should be able to see a new squashfile next to your Dockerfile:
+You should be able to see a new squashfs file next to your Dockerfile:

 ```console
 $ ls
 Dockerfile pytorch-24.01-py3-ven.sqsh
 ```

-This squashfile is essentially a compressed container image, which can be run directly by the container engine.
+This squashfs file is essentially a compressed container image, which can be run directly by the container engine.
 We will use our freshly-built container `pytorch-24.01-py3-venv.sqsh` in the following steps to run a PyTorch script that loads the Google Gemma-7B model and performs some inference with it.
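For context around the `ls` output in this hunk: the squashfs sits next to the Dockerfile it was built from. The tutorial's actual Dockerfile is not part of this diff; as a minimal sketch only, assuming the NGC PyTorch base image referenced elsewhere in the tutorial, such a Dockerfile might look like:

```dockerfile
# Illustrative sketch only; the real Dockerfile is defined in the tutorial,
# not in this diff.
# Base image: the NGC PyTorch release referenced by the tutorial's EDF.
FROM nvcr.io/nvidia/pytorch:24.01-py3

# Create a virtual environment that can still see the base image's
# site-packages, so extra inference packages can be layered on top.
RUN python -m venv --system-site-packages /workspace/venv
```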
88
88
89
89
### Set up an EDF
@@ -109,7 +109,7 @@ Make sure to replace `<USER>` with your actual CSCS username.
 If you've decided to build the container somewhere else, make sure to supply the correct path to the `image` variable.

 The `image` variable defines which container we want to load.
-This could either be a container from an online docker repository, like `nvcr.io/nvidia/pytorch:24.01-py3`, or in our case, a local squashfile which we built ourselves.
+This could either be a container from an online docker repository, like `nvcr.io/nvidia/pytorch:24.01-py3`, or in our case, a local squashfs file which we built ourselves.

 The `mounts` variable defines which directories we want to mount where in our container.
 In general, it's a good idea to use the scratch directory to store outputs from any scientific software.
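Putting the two variables discussed in this hunk together, a minimal EDF sketch might look like the following. The paths are illustrative placeholders, not taken from this diff, and `workdir` is an assumed extra field:

```toml
# Illustrative EDF sketch; replace <USER> with your CSCS username.
# image: the locally built squashfs, or a registry reference such as
# nvcr.io/nvidia/pytorch:24.01-py3.
image = "/capstor/scratch/cscs/<USER>/pytorch-24.01-py3-venv.sqsh"

# mounts: "host-path:container-path" pairs; scratch is mounted so that
# outputs land there.
mounts = ["/capstor/scratch/cscs/<USER>:/capstor/scratch/cscs/<USER>"]

# workdir: directory the container starts in (an assumption, not shown
# in this hunk).
workdir = "/capstor/scratch/cscs/<USER>"
```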
@@ -278,7 +278,7 @@ Move on to the next tutorial or try the challenge.

 ### Challenge

-Using the same approach as in the latter half of step 4, use pip to install the package `nvitop`. This is a tool that shows you a concise real-time summary of GPU activity. Then, run Gemma and launch nvitop at the same time:
+Using the same approach as in the latter half of step 4, use pip to install the package `nvitop`. This is a tool that shows you a concise real-time summary of GPU activity. Then, run Gemma and launch `nvitop` at the same time: