-GPU-accelerated workloads on an Azure Stack Edge Pro GPU device require a GPU virtual machine. This article provides an overview of GPU VMs, including supported OSs, GPU drivers, and VM sizes. Deployment options for GPU VMs used with Kubernetes clusters also are discussed.
+GPU-accelerated workloads on an Azure Stack Edge Pro GPU device require a GPU VM (virtual machine). This article provides an overview of GPU VMs, including supported OSs, GPU drivers, and VM sizes. Deployment options for GPU VMs used with Kubernetes clusters also are discussed.
## About GPU VMs
@@ -26,15 +26,15 @@ To take advantage of the GPU capabilities of Azure N-series VMs, Nvidia GPU driv
You can [install and manage the extension using the Azure Resource Manager templates](azure-stack-edge-gpu-deploy-virtual-machine-install-gpu-extension.md) after VM deployment. In the Azure portal, you can install the GPU extension during or after you deploy a VM; for instructions, see [Deploy GPU VMs on your Azure Stack Edge device](azure-stack-edge-gpu-deploy-gpu-virtual-machine.md).
-If your device will have a Kubernetes cluster configured, be sure to review [deployment considerations for Kubernetes clusters](#gpu-vms-and-kubernetes) before you deploy GPU VMs.
+If your device has a Kubernetes cluster configured, be sure to review [deployment considerations for Kubernetes clusters](#gpu-vms-and-kubernetes) before you deploy GPU VMs.
## Supported OS and GPU drivers
The Nvidia GPU driver extensions for Windows and Linux support the following OS versions.
### Supported OS for GPU extension for Windows
-This extension supports the following operating systems (OSs). Other versions may work but have not been tested in-house on GPU VMs running on Azure Stack Edge devices.
+This extension supports the following operating systems (OSs). Other versions may work but haven't been tested in-house on GPU VMs running on Azure Stack Edge devices.
| Distribution | Version |
|---|---|
@@ -43,13 +43,15 @@ This extension supports the following operating systems (OSs). Other versions ma
### Supported OS for GPU extension for Linux
-This extension supports the following OS distros, depending on the driver support for specific OS version. Other versions may work but have not been tested in-house on GPU VMs running on Azure Stack Edge devices.
+This extension supports the following OS distro, depending on the driver support for specific OS version. Other versions may work but haven't been tested in-house on GPU VMs running on Azure Stack Edge devices.
| Distribution | Version |
|---|---|
-| Ubuntu | 18.04 LTS |
| Red Hat Enterprise Linux | 7.4 |
+> [!NOTE]
+> Ubuntu 18.04 LTS GPU extension has been deprecated. The GPU extension is no longer supported on Ubuntu 18.04 GPU VMs running on Azure Stack Edge devices. If you plan to utilize the Ubuntu version 18.04 LTS distro, see steps for manual GPU driver installation at [CUDA Toolkit 12.1 Update 1 Downloads](https://developer.nvidia.com/cuda-12-1-1-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=18.04&target_type=deb_local). You may need to download the CUDA signing key before the installation. For an example of installing the signing key, see [Troubleshoot GPU extension issues for GPU VMs on Azure Stack Edge Pro GPU](azure-stack-edge-gpu-troubleshoot-virtual-machine-gpu-extension-installation.md#in-versions-lower-than-2205-linux-gpu-extension-installs-old-signing-keys-signature-andor-required-key-missing).
+
## GPU VM deployment
You can deploy a GPU VM via the Azure portal or using Azure Resource Manager templates. The GPU extension is installed after VM creation.<!--Wording still needs work!-->
@@ -65,17 +67,17 @@ Before you deploy GPU VMs on your device, review the following considerations if
#### For 1-GPU device:
-- **Create a GPU VM followed by Kubernetes configuration on your device**: In this scenario, the GPU VM creation and Kubernetes configuration will both be successful. Kubernetes will not have access to the GPU in this case.
+- **Create a GPU VM followed by Kubernetes configuration on your device**: In this scenario, the GPU VM creation and Kubernetes configuration will both be successful. Kubernetes won't have access to the GPU in this case.

-- **Configure Kubernetes on your device followed by creation of a GPU VM**: In this scenario, the Kubernetes will claim the GPU on your device and the VM creation will fail as there are no GPU resources available.
+- **Configure Kubernetes on your device followed by creation of a GPU VM**: In this scenario, the Kubernetes claims the GPU on your device and the VM creation will fail as there are no GPU resources available.

#### For 2-GPU device

- **Create a GPU VM followed by Kubernetes configuration on your device**: In this scenario, the GPU VM that you create will claim one GPU on your device and Kubernetes configuration will also be successful and claim the remaining one GPU.

-- **Create two GPU VMs followed by Kubernetes configuration on your device**: In this scenario, the two GPU VMs will claim the two GPUs on the device and the Kubernetes is configured successfully with no GPUs.
+- **Create two GPU VMs followed by Kubernetes configuration on your device**: In this scenario, the two GPU VMs claim the two GPUs on the device and the Kubernetes is configured successfully with no GPUs.

-- **Configure Kubernetes on your device followed by creation of a GPU VM**: In this scenario, the Kubernetes will claim both the GPUs on your device and the VM creation will fail as no GPU resources are available.
+- **Configure Kubernetes on your device followed by creation of a GPU VM**: In this scenario, the Kubernetes claims both the GPUs on your device and the VM creation will fail as no GPU resources are available.

<!--Li indicated that this is fixed. If you have GPU VMs running on your device and Kubernetes is also configured, then anytime the VM is deallocated (when you stop or remove a VM using Stop-AzureRmVM or Remove-AzureRmVM), there is a risk that the Kubernetes cluster will claim all the GPUs available on the device. In such an instance, you will not be able to restart the GPU VMs deployed on your device or create GPU VMs. -->
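
The 1-GPU and 2-GPU scenarios above all follow from one ordering rule, which the following minimal Python sketch tries to make explicit. It is an illustration only, not an Azure API: the `Device` class and its methods are hypothetical names, and the model assumes that each GPU VM claims exactly one GPU and that Kubernetes configuration always succeeds and claims whatever GPUs are still free at that moment.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical model of GPU ownership on a 1-GPU or 2-GPU device."""

    total_gpus: int
    vm_gpus: int = 0          # GPUs held by GPU VMs (assumed: one per VM)
    k8s_gpus: int = 0         # GPUs held by the Kubernetes cluster
    k8s_configured: bool = False

    @property
    def free_gpus(self) -> int:
        return self.total_gpus - self.vm_gpus - self.k8s_gpus

    def create_gpu_vm(self) -> bool:
        """Return True if the VM gets a GPU, False if creation fails."""
        if self.free_gpus < 1:
            return False
        self.vm_gpus += 1
        return True

    def configure_kubernetes(self) -> bool:
        """Configuration succeeds and claims every GPU that is still free."""
        self.k8s_gpus += self.free_gpus
        self.k8s_configured = True
        return True


if __name__ == "__main__":
    # 2-GPU device: one VM first, then Kubernetes takes the remaining GPU.
    device = Device(total_gpus=2)
    print(device.create_gpu_vm(), device.configure_kubernetes(), device.k8s_gpus)

    # 2-GPU device: Kubernetes first, so the later VM creation fails.
    device = Device(total_gpus=2)
    print(device.configure_kubernetes(), device.create_gpu_vm())
```

Under these assumptions, the practical takeaway matches the list: create every GPU VM you need before configuring Kubernetes, because a cluster configured first leaves no GPU for later VMs.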