Commit 1e55d65

Merge pull request #233729 from GitHubber17/refresh-79519-nvidia-gpu-ext
Freshness Pass for User Story: 79519
2 parents 21ba8c4 + 3fa436b commit 1e55d65

6 files changed: 88 additions and 61 deletions
Original file line number | Diff line number | Diff line change
@@ -1,44 +1,50 @@
11
---
22
title: NVIDIA GPU Driver Extension - Azure Windows VMs
3-
description: Azure extension for installing NVIDIA GPU drivers on N-series compute VMs running Windows.
3+
description: Learn how to install the NVIDIA GPU Driver Extension on N-series virtual machines running Windows from the Azure portal or by using an ARM template.
44
services: virtual-machines
55
ms.service: virtual-machines
66
ms.subservice: hpc
77
ms.collection: windows
88
ms.topic: article
99
ms.tgt_pltfrm: vm-windows
1010
ms.workload: infrastructure-services
11-
ms.date: 10/14/2021
11+
ms.date: 04/06/2023
1212
ms.author: mamccrea
1313
author: mamccrea
1414
---
15+
1516
# NVIDIA GPU Driver Extension for Windows
1617

17-
This extension installs NVIDIA GPU drivers on Windows N-series virtual machines (VMs). Depending on the VM family, the extension installs CUDA or GRID drivers. When you install NVIDIA drivers by using this extension, you're accepting and agreeing to the terms of the [NVIDIA End-User License Agreement](https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/). During the installation process, the VM might reboot to complete the driver setup.
18+
The NVIDIA GPU Driver Extension for Windows installs NVIDIA GPU drivers on Windows N-series virtual machines (VMs). Depending on the VM family, the extension installs CUDA or GRID drivers. When you install NVIDIA drivers by using this extension, you accept and agree to the terms of the [NVIDIA End-User License Agreement](https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/). During the installation process, the VM might reboot to complete the driver setup.
19+
20+
The instructions for manual installation of the drivers, and the list of current supported versions are available for review. For more information, see [Install NVIDIA GPU drivers on N-series VMs running Windows](/azure/virtual-machines/windows/n-series-driver-setup).
1821

19-
Instructions on manual installation of the drivers and the current supported versions are available. For more information, see [Azure N-series NVIDIA GPU driver setup for Windows](../windows/n-series-driver-setup.md).
20-
An extension is also available to install NVIDIA GPU drivers on [Linux N-series VMs](hpccompute-gpu-linux.md).
22+
The NVIDIA GPU Driver Extension can also be deployed on Linux N-series VMs. For more information, see [NVIDIA GPU Driver Extension for Linux](hpccompute-gpu-linux.md).
2123

2224
## Prerequisites
2325

24-
### Operating system
26+
Confirm your virtual machine satisfies the prerequisites for using the NVIDIA GPU Driver Extension.
2527

26-
This extension supports the following OSs:
28+
### Operating system support
29+
30+
The NVIDIA GPU Driver Extension supports the following Windows versions:
2731

2832
| Distribution | Version |
29-
|---|---|
33+
| --- | --- |
34+
| Windows 11 | Core |
3035
| Windows 10 | Core |
36+
| Windows Server 2022 | Core |
3137
| Windows Server 2019 | Core |
3238
| Windows Server 2016 | Core |
3339
| Windows Server 2012 R2 | Core |
3440

35-
### Internet connectivity
41+
### Internet connection required
3642

37-
The Microsoft Azure Extension for NVIDIA GPU Drivers requires that the target VM is connected to the internet and has access.
43+
The NVIDIA GPU Driver Extension requires that the target VM is connected to the internet and has access.
3844

39-
## Extension schema
45+
## Review the extension schema
4046

41-
The following JSON shows the schema for the extension:
47+
The following JSON snippet shows the schema for the extension:
4248

4349
```json
4450
{
@@ -62,59 +68,66 @@ The following JSON shows the schema for the extension:
6268

6369
### Properties
6470

71+
The JSON schema includes values for the following parameters.
72+
6573
| Name | Value/Example | Data type |
66-
| ---- | ---- | ---- |
67-
| apiVersion | 2015-06-15 | date |
68-
| publisher | Microsoft.HpcCompute | string |
69-
| type | NvidiaGpuDriverWindows | string |
70-
| typeHandlerVersion | 1.4 | int |
74+
| --- | --- | --- |
75+
| `apiVersion` | 2015-06-15 | date |
76+
| `publisher` | Microsoft.HpcCompute | string |
77+
| `type` | NvidiaGpuDriverWindows | string |
78+
| `typeHandlerVersion` | 1.4 | int |
7179

72-
## Deployment
80+
## Deploy the extension
81+
82+
Azure VM extensions can be managed by using the Azure CLI, PowerShell, Azure Resource Manager (ARM) templates, and the Azure portal.
83+
84+
> [!Note]
85+
> Some of the following examples use `<placeholder>` parameter values in the commands. Before you run each command, make sure to replace any placeholder values with specific values for your configuration.
7386
7487
### Azure portal
7588

76-
You can deploy Azure NVIDIA VM extensions in the Azure portal.
89+
To install the NVIDIA GPU Driver Extension in the Azure portal, follow these steps:
7790

78-
1. In a browser, go to the [Azure portal](https://portal.azure.com).
91+
1. In the [Azure portal](https://portal.azure.com), go to the virtual machine on which you want to install the extension.
7992

80-
1. Go to the virtual machine on which you want to install the driver.
93+
1. Under **Settings**, select **Extensions + Applications**.
8194

82-
1. On the left menu, select **Extensions**.
95+
:::image type="content" source="./media/nvidia-ext-portal/extensions-menu.png" alt-text="Screenshot that shows how to select Extensions + Applications for a virtual machine in the Azure portal." border="false":::
8396

84-
:::image type="content" source="./media/nvidia-ext-portal/extensions-menu.png" alt-text="Screenshot that shows selecting Extensions in the Azure portal menu.":::
97+
1. Under **Extensions**, select **+ Add**.
8598

86-
1. Select **Add**.
99+
:::image type="content" source="./media/nvidia-ext-portal/add-extension.png" alt-text="Screenshot that shows how to add an extension for a virtual machine in the Azure portal." border="false":::
87100

88-
:::image type="content" source="./media/nvidia-ext-portal/add-extension.png" alt-text="Screenshot that shows adding a V M extension for the selected V M.":::
101+
1. Locate and select **NVIDIA GPU Driver Extension**, then select **Next**.
89102

90-
1. Scroll to find and select **NVIDIA GPU Driver Extension**, and then select **Next**.
103+
:::image type="content" source="./media/nvidia-ext-portal/select-nvidia-extension.png" alt-text="Screenshot that shows how to locate and select the NVIDIA GPU Driver Extension for a virtual machine in the Azure portal." border="false":::
91104

92-
:::image type="content" source="./media/nvidia-ext-portal/select-nvidia-extension.png" alt-text="Screenshot that shows selecting NVIDIA G P U Driver Extension.":::
105+
1. Select **Review + create**. Confirm the deployment action, and select **Create**.
93106

94-
1. Select **Review + create**, and select **Create**. Wait a few minutes for the driver to deploy.
107+
Wait a few minutes for the extension to deploy.
95108

96-
:::image type="content" source="./media/nvidia-ext-portal/create-nvidia-extension.png" alt-text="Screenshot that shows selecting the Review + create button.":::
109+
:::image type="content" source="./media/nvidia-ext-portal/create-nvidia-extension.png" alt-text="Screenshot that shows how to create the NVIDIA GPU Driver Extension on the selected virtual machine in the Azure portal." border="false":::
97110

98-
1. Verify that the extension was added to the list of installed extensions.
111+
1. Confirm the extension is listed as an installed extension for the virtual machine.
99112

100-
:::image type="content" source="./media/nvidia-ext-portal/verify-extension.png" alt-text="Screenshot that shows the new extension in the list of extensions for the V M.":::
113+
:::image type="content" source="./media/nvidia-ext-portal/verify-extension.png" alt-text="Screenshot that shows the NVIDIA GPU Driver Extension in the list of extensions for the virtual machine in the Azure portal." border="false":::
101114

102-
### Azure Resource Manager template
115+
### ARM template
103116

104-
You can use Azure Resource Manager templates to deploy Azure VM extensions. Templates are ideal when you deploy one or more virtual machines that require post-deployment configuration.
117+
ARM templates are ideal when you deploy one or more virtual machines that require post-deployment configuration.
105118

106-
The JSON configuration for a virtual machine extension can be nested inside the virtual machine resource or placed at the root or top level of a Resource Manager JSON template. The placement of the JSON configuration affects the value of the resource name and type. For more information, see [Set name and type for child resources](../../azure-resource-manager/templates/child-resource-name-type.md).
119+
The JSON configuration for a virtual machine extension can be nested inside the virtual machine resource or placed at the root or top level of a JSON ARM template. The placement of the JSON configuration affects the value of the resource `name` and `type`. For more information, see [Set name and type for child resources](/azure/azure-resource-manager/templates/child-resource-name-type).
107120

108121
The following example assumes the extension is nested inside the virtual machine resource. When the extension resource is nested, the JSON is placed in the `"resources": []` object of the virtual machine.
109122

110123
```json
111124
{
112-
"name": "myExtensionName",
125+
"name": "<myExtensionName>",
113126
"type": "extensions",
114-
"location": "[resourceGroup().location]",
127+
"location": "[<resourceGroup().location>]",
115128
"apiVersion": "2015-06-15",
116129
"dependsOn": [
117-
"[concat('Microsoft.Compute/virtualMachines/', myVM)]"
130+
"[concat('Microsoft.Compute/virtualMachines/', <myVM>)]"
118131
],
119132
"properties": {
120133
"publisher": "Microsoft.HpcCompute",
@@ -129,11 +142,13 @@ The following example assumes the extension is nested inside the virtual machine
129142

130143
### PowerShell
131144

145+
Use the following PowerShell command to deploy the NVIDIA GPU Driver Extension to a virtual machine.
146+
132147
```powershell
133148
Set-AzVMExtension `
134-
-ResourceGroupName "myResourceGroup" `
135-
-VMName "myVM" `
136-
-Location "southcentralus" `
149+
-ResourceGroupName "<myResourceGroup>" `
150+
-VMName "<myVM>" `
151+
-Location "<location>" `
137152
-Publisher "Microsoft.HpcCompute" `
138153
-ExtensionName "NvidiaGpuDriverWindows" `
139154
-ExtensionType "NvidiaGpuDriverWindows" `
@@ -144,53 +159,65 @@ Set-AzVMExtension
144159

145160
### Azure CLI
146161

162+
Run the following command in the Azure CLI to deploy the NVIDIA GPU Driver Extension to a virtual machine.
163+
147164
```azurecli
148165
az vm extension set \
149-
--resource-group myResourceGroup \
150-
--vm-name myVM \
166+
--resource-group <myResourceGroup> \
167+
--vm-name <myVM> \
151168
--name NvidiaGpuDriverWindows \
152169
--publisher Microsoft.HpcCompute \
153170
--version 1.4 \
154171
--settings '{}'
156173
```
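
The command above pins `--version 1.4`. As a minimal sketch, assuming the Azure CLI is installed and signed in, the following wrapper lists the extension versions published in a region so you can compare them with the pinned value. The function name `list_extension_versions` is hypothetical, and the location you pass is your own choice.

```bash
# Hypothetical helper (not part of the article): list the published versions
# of the NVIDIA GPU Driver Extension for Windows in one Azure region.
# Usage: list_extension_versions <location>
list_extension_versions() {
  local location="$1"
  az vm extension image list-versions \
    --publisher Microsoft.HpcCompute \
    --name NvidiaGpuDriverWindows \
    --location "$location" \
    --output table
}
```

Call it with a region name, for example `list_extension_versions southcentralus`.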
157174

158-
## Troubleshoot and support
175+
## <a name="troubleshoot-and-support"></a> Troubleshoot issues
176+
177+
Here are some suggestions for how to troubleshoot deployment issues.
178+
179+
### Check extension status
159180

160-
### Troubleshoot
181+
Check the status of your extension deployment in the Azure portal, or by using PowerShell or the Azure CLI.
161182

162-
You can retrieve data about the state of extension deployments from the Azure portal and by using Azure PowerShell and the Azure CLI. To see the deployment state of extensions for a given VM, run the following command:
183+
To see the deployment state of extensions for a given VM, run the following commands:
163184

164185
```powershell
165-
Get-AzVMExtension -ResourceGroupName myResourceGroup -VMName myVM -Name myExtensionName
186+
Get-AzVMExtension -ResourceGroupName <myResourceGroup> -VMName <myVM> -Name <myExtensionName>
166187
```
167188

168189
```azurecli
169-
az vm extension list --resource-group myResourceGroup --vm-name myVM -o table
190+
az vm extension list --resource-group <myResourceGroup> --vm-name <myVM> -o table
170191
```
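
To check a single extension from a script instead of reading the table, one option is to query only its provisioning state. This is a sketch under stated assumptions: the function name `extension_state` is hypothetical, and it assumes the command output exposes a `provisioningState` field that reads `Succeeded` after a successful deployment.

```bash
# Hypothetical helper (not part of the article): print the provisioning state
# of one extension on a VM.
# Usage: extension_state <resource-group> <vm-name> <extension-name>
extension_state() {
  local resource_group="$1" vm_name="$2" extension_name="$3"
  # --query takes a JMESPath expression; tsv output prints the bare value.
  az vm extension show \
    --resource-group "$resource_group" \
    --vm-name "$vm_name" \
    --name "$extension_name" \
    --query provisioningState \
    --output tsv
}
```

For example, `extension_state <myResourceGroup> <myVM> NvidiaGpuDriverWindows` prints one word that a deployment script can compare against the state it expects.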
192+
### Review output logs
171193

172-
Extension execution output is logged to the following directory:
194+
View output logs for the NVIDIA GPU Driver Extension deployment under
195+
`C:\WindowsAzure\Logs\Plugins\Microsoft.HpcCompute.NvidiaGpuDriverWindows\`.
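
If you can't sign in to the VM, one possible way to see which log files exist is to run a directory listing remotely with `az vm run-command invoke`. This is a sketch, not a documented step for this extension: the function name `list_extension_logs` is hypothetical, and it assumes the VM agent is healthy enough to accept run commands.

```bash
# Hypothetical helper (not part of the article): list the extension's log files
# on a Windows VM by running a PowerShell directory listing through the VM agent.
# Usage: list_extension_logs <resource-group> <vm-name>
list_extension_logs() {
  local resource_group="$1" vm_name="$2"
  # Single quotes keep the backslashes in the Windows path literal.
  local log_dir='C:\WindowsAzure\Logs\Plugins\Microsoft.HpcCompute.NvidiaGpuDriverWindows'
  az vm run-command invoke \
    --resource-group "$resource_group" \
    --name "$vm_name" \
    --command-id RunPowerShellScript \
    --scripts "Get-ChildItem -Path '${log_dir}' -Recurse | Select-Object FullName, LastWriteTime"
}
```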
173196

174-
```cmd
175-
C:\WindowsAzure\Logs\Plugins\Microsoft.HpcCompute.NvidiaGpuDriverWindows\
176-
```
197+
### Respond to error codes
177198

178-
### Error codes
199+
The following table lists common error codes for deployment and potential follow-up actions.
179200

180-
| Error Code | Meaning | Possible action |
201+
| Error | Description | Action |
181202
| :---: | --- | --- |
182-
| 0 | Operation successful. |
183-
| 1 | Operation successful. Reboot required. |
184-
| 100 | Operation not supported or couldn't be completed. | Possible causes are that the PowerShell version isn't supported, the VM size isn't an N-series VM, or a failure occurred in downloading data. Check the log files to determine the cause of the error. |
203+
| 0 | Operation successful. | No required action. |
204+
| 1 | Operation successful. | Reboot. |
205+
| 100 | Operation not supported or couldn't be completed. | Check log files to determine cause of error, such as: <br>- PowerShell version isn't supported. <br> - VM size isn't an N-series VM. <br> - Failure during data download. |
185206
| 240, 840 | Operation timeout. | Retry operation. |
186-
| -1 | Exception occurred. | Check the log files to determine the cause of the exception. |
187-
| -5x | Operation interrupted due to pending reboot. | Reboot VM. Installation continues after the reboot. Uninstall should be invoked manually. |
207+
| -1 | Exception occurred. | Check log files to determine cause of exception. |
208+
| -5x | Operation interrupted due to pending reboot. | Reboot the VM. Installation continues after reboot. <br> Uninstall should be invoked manually. |
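
The table above can be encoded as a small lookup for use in deployment scripts. The following is a minimal sketch: the function name `suggest_action` is hypothetical, and it assumes that `-5x` means any two-digit code from -50 to -59, which the table doesn't state explicitly.

```bash
# Hypothetical helper (not part of the article): map an extension exit code to
# the follow-up action listed in the error-code table.
# Usage: suggest_action <code>
suggest_action() {
  local code="$1"
  case "$code" in
    0)       echo "No required action." ;;
    1)       echo "Reboot." ;;
    100)     echo "Check log files: unsupported PowerShell version, non-N-series VM size, or failed data download." ;;
    240|840) echo "Retry operation." ;;
    -1)      echo "Check log files to determine cause of exception." ;;
    -5[0-9]) echo "Reboot the VM. Installation continues after reboot." ;;  # assumed reading of -5x
    *)       echo "Unlisted code. Check log files." ;;
  esac
}
```

Codes that aren't in the table fall through to the last branch, so the lookup never reports a listed action for a code it doesn't recognize.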
209+
210+
### Get support
211+
212+
Here are some other options to help you resolve deployment issues:
213+
214+
- For assistance, contact the Azure experts on the [Q&A and Stack Overflow forums](https://azure.microsoft.com/support/community/).
188215

189-
### Support
216+
- If you don't find an answer on the site, you can post a question for input from Microsoft or other members of the community.
190217

191-
If you need more help at any point in this article, contact the Azure experts on the [MSDN Azure and Stack Overflow forums](https://azure.microsoft.com/support/community/). Alternatively, you can file an Azure support incident. Go to [Azure support](https://azure.microsoft.com/support/options/) and select **Get support**. For information about using Azure support, read the [Azure support FAQ](https://azure.microsoft.com/support/faq/).
218+
- You can also [Contact Microsoft Support](https://support.microsoft.com/contactus/). For information about using Azure support, read the [Azure support FAQ](https://azure.microsoft.com/support/legal/faq/).
192219

193220
## Next steps
194221

195222
- For more information about extensions, see [Virtual machine extensions and features for Windows](features-windows.md).
196-
- For more information about N-series VMs, see [GPU optimized virtual machine sizes](../sizes-gpu.md).
223+
- For more information about N-series VMs, see [GPU optimized virtual machine sizes](/azure/virtual-machines/sizes-gpu).