Then open the following URL in your web browser: `http://localhost:8008/sdk-gui/`
**Install SDK Packages:** Install the `cerebras_appliance` and `cerebras_sdk` Python packages in the virtual environment, specifying the appropriate Cerebras Software release:
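A minimal sketch of that flow (the venv path is a placeholder, and the actual `pip install` of the Cerebras packages requires network or proxy access, so it is left as a comment):

```bash
# Hypothetical setup sketch: create and activate a fresh virtual environment,
# then install the SDK packages into it.
python3 -m venv /tmp/venv_cerebras_sdk
. /tmp/venv_cerebras_sdk/bin/activate
# With network/proxy access configured, then run:
#   pip install --upgrade pip
#   pip install cerebras_appliance cerebras_sdk
test -x /tmp/venv_cerebras_sdk/bin/python && echo "venv ready"
```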
Note: to access any external web resources from a Cerebras user node, you will need to have a proxy environment variable set (or equivalent). `wget` needs the lower-case proxy environment variable.
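For example (the proxy URL below is a placeholder, not a real ALCF proxy address; substitute the value appropriate for your site):

```bash
# Set the lower-case proxy variables that wget (and most CLI tools) honor.
# proxy.example.com:3128 is a placeholder proxy address.
export http_proxy=http://proxy.example.com:3128
export https_proxy=$http_proxy
echo "proxy set to $http_proxy"
```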
First, source a Cerebras PyTorch virtual environment.
```console
source ~/R_2.9.0/venv_cerebras_pt/bin/activate
```
Then change to the BERT model directory:
```console
cd ~/R_2.9.0/modelzoo/src/cerebras/modelzoo/models/nlp/bert
```

Note: the vocabulary file referenced in `/software/cerebras/dataset/bert_large/bert_large_MSL128_sampleds.yaml` is the same as the one at `/home/$(whoami)/R_2.9.0/modelzoo/src/cerebras/modelzoo/models/vocab/google_research_uncased_L-12_H-768_A-12.txt`.

The last part of the output should resemble the following (messages about CUDA, which should be ignored, are not shown).

Note: the validation has been commented out of the yaml to decrease the run time of this sample. To run validation, uncomment the validation sections at the end of `configs/params_llama2_7b.yaml`.

Note: the validation has been commented out of the yaml to decrease the run time of this sample. To run validation, uncomment the validation sections at the end of `configs/params_esm2_t12_35M_UR50D_modified.yaml`.

```console
2025-10-10 23:46:01,812 INFO: Training completed successfully!
2025-10-10 23:46:01,861 INFO: Processed 819200 training sample(s) in 4049.286902367 seconds
```
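As a side note, the training throughput implied by the log lines above can be checked with a one-liner (the figures are from this sample run; your numbers will differ):

```bash
# Samples per second from the sample log: 819200 samples in ~4049.29 seconds.
awk 'BEGIN { printf "%.1f samples/sec\n", 819200/4049.286902367 }'
# prints: 202.3 samples/sec
```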
## Vision Transformer
The Cerebras Transformer-based vision classifier model implementation can be found at `modelzoo/models/vision/vision_transformer`. Configs for the base and huge variants of the Vision Transformer can be found at `modelzoo/models/vision/vision_transformer/configs`. This example uses the ImageNet dataset preprocessed at `/software/datasets/imagenet/`.
First, source a Cerebras PyTorch virtual environment.
```bash
source ~/R_2.9.0/venv_cerebras_pt/bin/activate
```
Instructions for training (for 400 steps):
```bash
cd ~/R_2.9.0/modelzoo/src/cerebras/modelzoo/models/vision/vision_transformer
```
Note: the validation has been commented out of the yaml to decrease the run time of this sample. To run validation, uncomment the validation sections at the end of `configs/params_vit_base_patch_16_imagenet_1k.yaml`.
First, source a Cerebras PyTorch virtual environment.
```bash
source ~/R_2.9.0/venv_cerebras_pt/bin/activate
```
Instructions for training (for 400 steps):
```bash
cd ~/R_2.9.0/modelzoo/src/cerebras/modelzoo/models/vision/dit
```
`docs/ai-testbed/cerebras/index.md`
The Cerebras Wafer-Scale cluster is run as an appliance: a user submits a job to the appliance, which manages data preprocessing and streaming, IO, and device orchestration internally. It provides programming via PyTorch. This installation supports Weight Streaming execution for models being pre-trained or fine-tuned.
The public Cerebras documentation is available [here](https://training-docs.cerebras.ai/rel-2.9.0/getting-started/overview).
A typical Cerebras Wafer-Scale Cluster is shown in the figure below. Users connect via SSH to the login node, `cerebras.alcf.anl.gov`, and then SSH to a user node, either `cer-usn-01` or `cer-usn-02`.
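If you connect frequently, the two-hop login can be captured in `~/.ssh/config` (a sketch, assuming standard OpenSSH; replace `ALCFUserID` with your ALCF username):

```
# Hypothetical ~/.ssh/config fragment: jump through the login node so that
# a single "ssh cer-usn-01" reaches the user node directly.
Host cer-usn-01 cer-usn-02
    User ALCFUserID
    ProxyJump ALCFUserID@cerebras.alcf.anl.gov
```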
<!--- The rest of the nodes in the cluster infrastructure are not directly accessible, except by admins.-->
The trees `/home`, `/projects`, and `/software` are shared across the login nodes and user nodes, the relevant cluster infrastructure nodes, and all ALCF AI testbed platforms.
Figure: topology of CS-3 cluster ([source](https://training-docs.cerebras.ai/rel-2.9.0/concepts/cerebras-wafer-scale-cluster))
///
As indicated in the figure, which represents a CS-3 cluster with 4 CS-3 WSEs, each of the CS-3 engines (at the right of the figure) is responsible only for running and accelerating the computations for training and prediction with the model. The other work, including compilation, is performed on the input nodes; the MemoryX nodes are used for weight storage and broadcast, and the SwarmX nodes are used for gradient accumulation.
`docs/ai-testbed/cerebras/miscellaneous.md`
## Porting applications to the CS-3
Cerebras documentation for porting code to run on a Cerebras CS-3 system:<br>
[Port Pytorch Models to Cerebras](https://training-docs.cerebras.ai/rel-2.9.0/model-zoo/migration/porting-pytorch-models-to-cerebras#port-pytorch-models-to-cerebras)
## Finetuning a model using CS-3s
The Cerebras tutorial for finetuning a model:<br>
[Fine-Tune Your First Model](https://training-docs.cerebras.ai/rel-2.9.0/getting-started/fine-tune-your-first-model)