Each GPU VM can only use GPUs from a single node, and GPU VMs are not automatically load balanced. For example, you have 4 nodes and 2 GPUs on each node, and you create 4 VMs with 1 GPU for each VM. Each VM can exist on a different node. If that happens, any single node only has 1 available GPU left. From the portal, you can see that there are 4 GPUs available. However, if you try to create a VM with 2 GPUs, it fails with insufficient GPU capacity, because no single node has 2 GPUs available. The solution is to create the VMs with 2 GPUs first.
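The capacity check described above can be sketched in a few lines of PowerShell. This is only an illustrative model of the reasoning, not an Azure Stack Hub API: the function name `Test-GpuPlacement` and the array of free GPUs per node are hypothetical, and the sketch assumes the only placement rule is that a single node must supply all of a VM's GPUs.

```powershell
# Hypothetical helper (not part of any Azure Stack Hub module):
# a VM fits only if one node alone has enough free GPUs.
function Test-GpuPlacement {
    param(
        [int[]] $FreeGpusPerNode,   # free GPU count on each node
        [int]   $RequestedGpus      # GPUs the new VM needs
    )
    foreach ($free in $FreeGpusPerNode) {
        if ($free -ge $RequestedGpus) { return $true }
    }
    return $false
}

# 4 nodes with 2 GPUs each; four 1-GPU VMs landed on four different nodes,
# so every node has exactly 1 free GPU left.
$freeAfterSpread = @(1, 1, 1, 1)

# The portal-style total still reports 4 free GPUs...
$totalFree = ($freeAfterSpread | Measure-Object -Sum).Sum

# ...but no single node can host a 2-GPU VM.
$canPlace = Test-GpuPlacement -FreeGpusPerNode $freeAfterSpread -RequestedGpus 2

"Total free GPUs: $totalFree; 2-GPU VM fits: $canPlace"
```

In this model, the same call with `@(2, 1, 1, 0)` succeeds, which is why creating the 2-GPU VMs before the 1-GPU VMs avoids the failure.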
27
+
26
28
## Upgrade GPUs or add to an existing node
27
29
28
30
The following section provides a high-level overview of the process to add a GPU.
@@ -36,9 +38,9 @@ The following section provides a high-level overview of the process to add a GPU
36
38
37
39
## Change GPU partition size
38
40
39
-
Azure Stack Hub supports GPU partitioning for the AMD MI25. With GPU partitioning, you can increase the density of virtual machines using a virtual GPU instance. You can change the partition size to meet specific workload requirements. By default, Azure Stack Hub uses the largest partition size (1/8) to provide the highest possible density with a 2 GB frame buffer. This is useful for workloads that require accelerated graphics applications and virtual desktops.
41
+
Azure Stack Hub supports GPU partitioning for the AMD MI25. With GPU partitioning, you can increase the density of virtual machines using a virtual GPU instance. You can change the partition size to meet specific workload requirements. By default, Azure Stack Hub uses the largest partition size (1/8) to provide the highest possible density with a 2 GB frame buffer. This partitioning is useful for workloads that require accelerated graphics applications and virtual desktops.
40
42
41
-
To change the partition size, do the following:
43
+
To change the partition size, perform the following steps:
42
44
43
45
1. Deallocate all VMs that are currently using a GPU.
44
46
1. Ensure that the [PowerShell Az module](powershell-install-az-module.md) for Azure Stack Hub is installed.
@@ -50,6 +52,7 @@ To change the partition size, do the following:
50
52
```powershell
51
53
Get-AzsScaleUnit # Returns a list of information about scale units in your stamp
52
54
```
55
+
53
56
Update the following `$partitionSize` and `$scaleUnitName` variables using the "**name**" value returned in the previous step, then run the following to update the scale unit partition size:
54
57
55
58
```powershell
@@ -67,6 +70,9 @@ To change the partition size, do the following:
67
70
| 2 | 1/2 of a physical GPU. |
68
71
| 1 | Entire physical GPU. |
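
As a rough illustration of how the partition size value relates to density, the following sketch computes the GPU fraction and frame buffer for each partition. It assumes that each physical AMD MI25 has 16 GB of frame buffer, which is not stated in this article but is consistent with the 2 GB figure given for the 1/8 partition. The variable names are hypothetical and only the partition values mentioned in this article are listed.

```powershell
# Assumption: 16 GB of frame buffer per physical AMD MI25 (not stated in this article).
$physicalFrameBufferGB = 16

# Partition size values mentioned in this article: 8 (1/8), 2 (1/2), and 1 (entire GPU).
$rows = foreach ($partitionSize in 8, 2, 1) {
    [pscustomobject]@{
        PartitionSize    = $partitionSize
        GpuFraction      = "1/$partitionSize"
        PartitionsPerGpu = $partitionSize
        FrameBufferGB    = $physicalFrameBufferGB / $partitionSize
    }
}

$rows | Format-Table
```

A larger partition size value gives more partitions per physical GPU, and each partition gets a proportionally smaller share of the frame buffer.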
69
72
73
+
> [!NOTE]
74
+
> Resizing GPU VMs is not supported.
75
+
70
76
## Next steps
71
77
72
78
-[Manage storage accounts in Azure Stack Hub](azure-stack-manage-storage-accounts.md).
azure-stack/user/gpu-vms-about.md
24 additions & 22 deletions
@@ -5,8 +5,8 @@ author: sethmanheim
5
5
ms.author: sethm
6
6
ms.service: azure-stack
7
7
ms.topic: reference
8
-
ms.date: 10/24/2024
9
-
ms.reviewer: unknown
8
+
ms.date: 04/21/2025
9
+
ms.reviewer: rtibi
10
10
ms.lastreviewed: 4/28/2021
11
11
12
12
# Intent: As a developer on Azure Stack Hub, I want to use a machine with a Graphics Processing Unit (GPU) in order to deliver a processing-intensive visualization application.
@@ -40,9 +40,9 @@ NCv3-series VMs are powered by NVIDIA Tesla V100 GPUs. Customers can take advant
40
40
41
41
## NVv4
42
42
43
-
The NVv4-series virtual machines are powered by AMD Radeon Instinct MI25 GPUs. With the NVv4-series, Azure Stack Hub introduces virtual machines with partial GPUs. This size can be used for GPU accelerated graphics applications and virtual desktops. NVv4 virtual machines currently support only the Windows guest operating system.
43
+
The NVv4-series virtual machines are powered by AMD Radeon Instinct MI25 GPUs. With the NVv4-series, Azure Stack Hub introduces virtual machines with partial GPUs. This size can be used for GPU accelerated graphics applications and virtual desktops. NVv4 virtual machines currently support only the Windows guest operating system.
44
44
45
-
| Size | vCPU | Memory: GiB | Temp storage (SSD) GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
45
+
| Size | vCPU | Memory: GiB | Temp storage (SSD) GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
@@ -80,7 +80,10 @@ The NC_A100 series VMs are powered by NVIDIA Ampere A100 GPUs, the successor of
80
80
- Number of GPUs per server supported (1, 2, 3, 4). Preferred are: 1, 2, and 4.
81
81
- All GPUs must be of the exact same SKU throughout the scale unit.
82
82
- All GPU quantities per server must be the same throughout the scale unit.
83
-
- GPU partition size (for AMD Mi25) needs to be the same throughout all GPU VMs on the scale unit.
83
+
- GPU partition size (for AMD Mi25) needs to be the same for all GPU VMs on the scale unit.
84
+
85
+
> [!NOTE]
86
+
> Resizing GPU VMs is not supported.
84
87
85
88
## Capacity planning
86
89
@@ -94,14 +97,14 @@ Azure Stack Hub now supports adding GPUs to any existing system. To add a GPU, r
94
97
95
98
GPU VMs undergo downtime during operations such as patch and update (PnU) and hardware replacement (FRU) of Azure Stack Hub. The following table covers the state of the VM as observed during these activities and the manual action you can do to make these VMs available after the operation.
| VM state | Unavailable during update. Can be made available with manual operation. VM is automatically online post update. | Unavailable during FRU. Can be made available with manual operation. VM needs to be brought back up after FRU|
| VM state | Unavailable during update. Can be made available with manual operation. VM is automatically online post update. | Unavailable during FRU. Can be made available with manual operation. VM needs to be brought back up after FRU|
100
103
| Manual operation | If the VM needs to be made available during the update, if there are available GPU partitions, the VM can be restarted from the portal by clicking the **Restart** button. VM automatically comes back up post update. | VM is not available during FRU. If there are available GPUs, VM may be stop-deallocated and restarted during FRU. Post FRU completion, the VM must be `stop-deallocated` using the **Stop** button, then restarted using the **Start** button.|
101
104
102
105
## Guest driver installation
103
106
104
-
The following PowerShell cmdlets can be used for driver installation:
107
+
You can use the [Set-AzVMExtension](/powershell/module/az.compute/set-azvmextension) PowerShell cmdlet for driver installation:
-Settings $Settings ` # If no settings are set, omit this parameter
123
-
-Verbose
118
+
Set-AzVMExtension -Location $Location `
119
+
-Publisher $driverPublisher `
120
+
-ExtensionType $driverType `
121
+
-TypeHandlerVersion $driverVersion `
122
+
-VMName $VmName `
123
+
-ResourceGroupName $ResourceGroupName `
124
+
-Name $driverName `
125
+
-Settings $Settings <# If no settings are set, omit this parameter #> `
126
+
-Verbose
124
127
```
125
128
126
-
Depending on the OS, type and connectivity of your Azure Stack Hub GPU VM, you must replace these values with the settings below.
129
+
Depending on the OS, type, and connectivity of your Azure Stack Hub GPU VM, you must replace these values with the following settings.
127
130
128
131
### AMD MI25
129
132
@@ -157,7 +160,7 @@ NVIDIA drivers must be installed inside the virtual machine for CUDA or GRID wor
157
160
158
161
#### Use case: graphics/visualization GRID
159
162
160
-
This scenario requires the use of GRID drivers. GRID drivers can be downloaded through the NVIDIA Application Hub provided you have the required licenses. The GRID drivers also require a GRID license server with appropriate GRID licenses before using the GRID drivers on the VM.
163
+
This scenario requires the use of GRID drivers. GRID drivers can be downloaded through the NVIDIA Application Hub provided you have the required licenses. The GRID drivers also require a GRID license server with appropriate GRID licenses before using the GRID drivers on the VM.
161
164
162
165
```powershell
163
166
$Settings = @{
@@ -172,8 +175,7 @@ CUDA drivers don't need a license server and don't need modified settings.
172
175
173
176
### Use case: compute/CUDA - Disconnected
174
177
175
-
Links to NVIDIA CUDA drivers can be obtained using the link: