Commit d55e7e7

Merge pull request #42408 from jeana-redhat/OSDOCS-3278_GPU_on_GCP
OSDOCS-3278: GPU support for GCP
2 parents bc8f400 + cce0799 commit d55e7e7

2 files changed: +112 -0 lines changed

machine_management/creating_machinesets/creating-machineset-gcp.adoc

Lines changed: 2 additions & 0 deletions
@@ -21,3 +21,5 @@ include::modules/machineset-non-guaranteed-instance.adoc[leveloffset=+1]
2121
include::modules/machineset-creating-non-guaranteed-instances.adoc[leveloffset=+1]
2222

2323
include::modules/machineset-enabling-customer-managed-encryption.adoc[leveloffset=+1]
24+
25+
include::modules/machineset-gcp-enabling-gpu-support.adoc[leveloffset=+1]
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * machine_management/creating_machinesets/creating-machineset-gcp.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="machineset-gcp-enabling-gpu-support_{context}"]
7+
= Enabling GPU support for a machine set
8+
9+
Google Cloud Platform (GCP) Compute Engine enables users to add GPUs to VM instances. Workloads that benefit from access to GPU resources can perform better on compute machines with this feature enabled. {product-title} on GCP supports NVIDIA GPU models in the A2 and N1 machine series.
10+
11+
.Supported GPU configurations
12+
|====
13+
|Model name |GPU type |Machine types ^[1]^
14+
15+
|NVIDIA A100
16+
|`nvidia-tesla-a100`
17+
a|* `a2-highgpu-1g`
18+
* `a2-highgpu-2g`
19+
* `a2-highgpu-4g`
20+
* `a2-highgpu-8g`
21+
* `a2-megagpu-16g`
22+
23+
|NVIDIA K80
24+
|`nvidia-tesla-k80`
25+
.5+a|* `n1-standard-1`
26+
* `n1-standard-2`
27+
* `n1-standard-4`
28+
* `n1-standard-8`
29+
* `n1-standard-16`
30+
* `n1-standard-32`
31+
* `n1-standard-64`
32+
* `n1-standard-96`
33+
* `n1-highmem-2`
34+
* `n1-highmem-4`
35+
* `n1-highmem-8`
36+
* `n1-highmem-16`
37+
* `n1-highmem-32`
38+
* `n1-highmem-64`
39+
* `n1-highmem-96`
40+
* `n1-highcpu-2`
41+
* `n1-highcpu-4`
42+
* `n1-highcpu-8`
43+
* `n1-highcpu-16`
44+
* `n1-highcpu-32`
45+
* `n1-highcpu-64`
46+
* `n1-highcpu-96`
47+
48+
|NVIDIA P100
49+
|`nvidia-tesla-p100`
50+
51+
|NVIDIA P4
52+
|`nvidia-tesla-p4`
53+
54+
|NVIDIA T4
55+
|`nvidia-tesla-t4`
56+
57+
|NVIDIA V100
58+
|`nvidia-tesla-v100`
59+
60+
|====
61+
[.small]
62+
--
63+
1. For more information about machine types, including specifications, compatibility, regional availability, and limitations, see the GCP Compute Engine documentation about link:https://cloud.google.com/compute/docs/general-purpose-machines#n1_machines[N1 machine series], link:https://cloud.google.com/compute/docs/accelerator-optimized-machines#a2_vms[A2 machine series], and link:https://cloud.google.com/compute/docs/gpus/gpu-regions-zones#gpu_regions_and_zones[GPU regions and zones availability].
64+
--
65+
66+
You can define which supported GPU to use for an instance by using the Machine API.
67+
68+
You can configure machines in the N1 machine series to deploy with one of the supported GPU types. Machines in the A2 machine series come with associated GPUs and cannot use guest accelerators.
69+
70+
[NOTE]
71+
====
72+
GPUs for graphics workloads are not supported.
73+
====
74+
75+
.Procedure
76+
77+
. In a text editor, open the YAML file for an existing machine set or create a new one.
78+
79+
. Specify a GPU configuration under the `providerSpec` field in your machine set YAML file. See the following examples of valid configurations:
80+
+
81+
.Example configuration for the A2 machine series:
82+
[source,yaml]
83+
----
providerSpec:
  value:
    machineType: a2-highgpu-1g <1>
    onHostMaintenance: Terminate <2>
    restartPolicy: Always <3>
----
90+
<1> Specify the machine type. Ensure that the machine type is included in the A2 machine series.
91+
<2> When using GPU support, you must set `onHostMaintenance` to `Terminate`.
92+
<3> Specify the restart policy for machines deployed by the machine set. Allowed values are `Always` or `Never`.
93+
+
94+
.Example configuration for the N1 machine series:
95+
[source,yaml]
96+
----
providerSpec:
  value:
    gpus:
    - count: 1 <1>
      type: nvidia-tesla-p100 <2>
    machineType: n1-standard-1 <3>
    onHostMaintenance: Terminate <4>
    restartPolicy: Always <5>
----
106+
<1> Specify the number of GPUs to attach to the machine.
107+
<2> Specify the type of GPUs to attach to the machine. Ensure that the machine type and GPU type are compatible.
108+
<3> Specify the machine type. Ensure that the machine type and GPU type are compatible.
109+
<4> When using GPU support, you must set `onHostMaintenance` to `Terminate`.
110+
<5> Specify the restart policy for machines deployed by the machine set. Allowed values are `Always` or `Never`.
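
For orientation, the following sketch shows where the `providerSpec` stanza from the N1 example sits inside a compute machine set manifest. It is illustrative only and not part of this commit: the machine set name `example-gpu-machineset` and the replica count are hypothetical placeholders, and the selector, labels, and other provider fields that a real machine set needs (disks, network interfaces, and so on) are omitted.

[source,yaml]
----
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: example-gpu-machineset # hypothetical name, replace with your own
  namespace: openshift-machine-api
spec:
  replicas: 1 # placeholder value
  # selector and template.metadata labels omitted for brevity
  template:
    spec:
      providerSpec:
        value:
          # GPU-related fields from the N1 example above
          gpus:
          - count: 1
            type: nvidia-tesla-p100
          machineType: n1-standard-1
          onHostMaintenance: Terminate # required when using GPU support
          restartPolicy: Always
          # remaining provider fields omitted
----

For an A2 machine type, the same layout would apply without the `gpus` list, because A2 machines come with their associated GPUs.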
