@@ -425,24 +426,32 @@ will fail the cluster creation process because Vertex AI Tensorboard is not supp
--tpu-type=v5litepod-16
```
-## Provisioning A3-Ultra and A3-Mega clusters (GPU machines)
-To create a cluster with A3 machines, run the below command. To create workloads on these clusters see [here](#workloads-for-a3-ultra-and-a3-mega-clusters-gpu-machines).
-* For A3-Ultra: --device-type=h200-141gb-8
-* For A3-Mega: --device-type=h100-mega-80gb-8
+## Provisioning A3 Ultra, A3 Mega and A4 clusters (GPU machines)
+To create a cluster with A3 or A4 machines, run the command below with the selected device type. To create workloads on these clusters see [here](#workloads-for-a3-ultra-a3-mega-and-a4-clusters-gpu-machines).
Currently, the below flags/arguments are supported for A3 Mega, A3 Ultra and A4 machines:
+* `--num-nodes`
+* `--default-pool-cpu-machine-type`
+* `--default-pool-cpu-num-nodes`
+* `--reservation`
+* `--spot`
+* `--on-demand` (A3 Mega only)
## Storage
@@ -662,21 +671,27 @@ increase this to a large number, say 50. Real jobs can be interrupted due to
hardware failures and software updates. We assume your job has implemented
checkpointing so the job restarts near where it was interrupted.
-### Workloads for A3-Ultra and A3-Mega clusters (GPU machines)
-To submit jobs on a cluster with A3 machines, run the below command. To create a cluster with A3 machines see [here](#provisioning-a3-ultra-and-a3-mega-clusters-gpu-machines).
-* For A3-Ultra: --device-type=h200-141gb-8
-* For A3-Mega: --device-type=h100-mega-80gb-8
+### Workloads for A3 Ultra, A3 Mega and A4 clusters (GPU machines)
+To submit jobs on a cluster with A3 or A4 machines, run the command with the selected device type. To create a cluster with A3 or A4 machines see [here](#provisioning-a3-ultra-a3-mega-and-a4-clusters-gpu-machines).
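
The diff names a device type for A3 Ultra and A3 Mega but the full command is not visible in these hunks. As a rough sketch only, the helper below maps a machine family to the `--device-type` value shown in the removed lines and prints (does not run) a cluster-create command using flags from the supported list. The function names, the placeholder variables `CLUSTER_NAME`, `NUM_NODES` and `RESERVATION_ID`, and the `--cluster` flag are assumptions that do not appear in this diff and should be checked against the README. A4 is left out because its device-type string is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a machine family to the --device-type value
# that appears in this diff. A4 is omitted on purpose (value not shown).
device_type_for() {
  case "$1" in
    a3-ultra) echo "h200-141gb-8" ;;
    a3-mega)  echo "h100-mega-80gb-8" ;;
    *)        echo "unknown machine family: $1" >&2; return 1 ;;
  esac
}

# Print the command instead of executing it, so it can be reviewed first.
# CLUSTER_NAME, NUM_NODES and RESERVATION_ID are placeholders (assumptions).
print_create_command() {
  local device_type
  device_type="$(device_type_for "$1")" || return 1
  echo "xpk cluster create --cluster ${CLUSTER_NAME:-my-cluster} --device-type=${device_type} --num-nodes=${NUM_NODES:-2} --reservation=${RESERVATION_ID:-my-reservation}"
}
```

Calling `print_create_command a3-ultra` echoes a command line that can be inspected and adjusted (for example, swapping `--reservation` for `--spot`) before it is run by hand.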