Commit 14e08c3

committed
Incorporated final feedback
1 parent 9ced3e1 commit 14e08c3

1 file changed, +9 -8 lines changed
articles/aks/gpu-cluster.md

Lines changed: 9 additions & 8 deletions
@@ -25,11 +25,12 @@ To view supported GPU-enabled VMs, see [GPU-optimized VM sizes in Azure][gpu-sku
 * If you're using an Azure Linux GPU-enabled node pool, automatic security patches aren't applied, and the default behavior for the cluster is *Unmanaged*. For more information, see [auto-upgrade](./auto-upgrade-node-image.md).
 * [NVadsA10](../virtual-machines/nva10v5-series.md) v5-series are *not* a recommended SKU for GPU VHD.
 * AKS doesn't support Windows GPU-enabled node pools.
+* Updating an existing node pool to add GPU isn't supported.

 ## Before you begin

 * This article assumes you have an existing AKS cluster. If you don't have a cluster, create one using the [Azure CLI][aks-quickstart-cli], [Azure PowerShell][aks-quickstart-powershell], or the [Azure portal][aks-quickstart-portal].
-* You need the Azure CLI version 1.0.0b2 or later installed and configured. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI][install-azure-cli].
+* You need the Azure CLI version 2.0.64 or later installed and configured. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI][install-azure-cli].

 ## Get the credentials for your cluster

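The updated prerequisite above asks for Azure CLI 2.0.64 or later. As a rough sketch of how that check could be scripted, the helper below compares two dotted version strings. The function name `version_ge` is hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS do); the installed version is passed in by hand after reading it from `az --version`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when version $1 >= version $2.
# Relies on `sort -V` (version sort), which is an assumption about the host.
version_ge() {
  # The smaller of the two versions sorts first; if that is the required
  # version, the installed one is equal or newer.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.0.64"
# Installed version as the first argument, for example the number that
# `az --version` prints for azure-cli. Defaults to the minimum for illustration.
installed="${1:-2.0.64}"

if version_ge "$installed" "$required"; then
  echo "Azure CLI $installed meets the $required minimum"
else
  echo "Azure CLI $installed is older than $required; upgrade before continuing"
fi
```

Saved as a script, this might be invoked with the version read from `az --version`, for example `./check-cli.sh 2.51.0` (the file name is illustrative).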
@@ -67,7 +68,7 @@ AKS has automatic GPU driver installation enabled by default. In some cases, suc
 --cluster-name myAKSCluster \
 --name gpunp \
 --node-count 1 \
---skip-gpu-install\
+--skip-gpu-driver-install \
 --node-vm-size Standard_NC6s_v3 \
 --node-taints sku=gpu:NoSchedule \
 --enable-cluster-autoscaler \
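The hunk above corrects the flag name to `--skip-gpu-driver-install`. One way to confirm that an installed CLI recognizes a flag before relying on it is to search the command's help text. The sketch below assumes the flag is listed in the output of `az aks nodepool add --help`; `has_flag` is a hypothetical helper, not part of the Azure CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads help text on stdin and succeeds when the
# given flag appears in it as a literal string.
has_flag() {
  # -F: match literally, -q: quiet, -e: treat the next argument as the
  # pattern even though it starts with dashes.
  grep -qF -e "$1"
}

# Example with inline text standing in for real help output.
if printf '%s\n' '    --skip-gpu-driver-install' | has_flag --skip-gpu-driver-install; then
  echo "flag is listed"
fi
```

Against a real installation, the same helper could be fed from the CLI, for example `az aks nodepool add --help | has_flag --skip-gpu-driver-install`.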
@@ -113,7 +114,7 @@ To use the default OS SKU, you create the node pool without specifying an OS SKU
 * `--max-count`: Configures the cluster autoscaler to maintain a maximum of three nodes in the node pool.

 > [!NOTE]
-> Taints and VM sizes can only be set for node pools during node pool creation, but you can update autoscaler settings at any time.
+> Taints and VM sizes can only be set for node pools during node pool creation, but you can update autoscaler settings at any time.

 ##### [Azure Linux node pool](#tab/add-azure-linux-gpu-node-pool)

@@ -144,17 +145,17 @@ To use Azure Linux, you specify the OS SKU by setting `os-sku` to `AzureLinux` d
 * `--max-count`: Configures the cluster autoscaler to maintain a maximum of three nodes in the node pool.

 > [!NOTE]
-> Taints and VM sizes can only be set for node pools during node pool creation, but you can update autoscaler settings at any time.
+> Taints and VM sizes can only be set for node pools during node pool creation, but you can update autoscaler settings at any time. Certain SKUs, including A100 and H100 VM SKUs, aren't available for Azure Linux. For more information, see [GPU-optimized VM sizes in Azure][gpu-skus].

 ---

-2. Create a namespace using the [`kubectl create namespace`][kubectl-create] command.
+1. Create a namespace using the [`kubectl create namespace`][kubectl-create] command.

 ```bash
 kubectl create namespace gpu-resources
 ```

-3. Create a file named *nvidia-device-plugin-ds.yaml* and paste the following YAML manifest provided as part of the [NVIDIA device plugin for Kubernetes project][nvidia-github]:
+2. Create a file named *nvidia-device-plugin-ds.yaml* and paste the following YAML manifest provided as part of the [NVIDIA device plugin for Kubernetes project][nvidia-github]:

 ```yaml
 apiVersion: apps/v1
@@ -206,13 +207,13 @@ To use Azure Linux, you specify the OS SKU by setting `os-sku` to `AzureLinux` d
 path: /var/lib/kubelet/device-plugins
 ```

-4. Create the DaemonSet and confirm the NVIDIA device plugin is created successfully using the [`kubectl apply`][kubectl-apply] command.
+3. Create the DaemonSet and confirm the NVIDIA device plugin is created successfully using the [`kubectl apply`][kubectl-apply] command.

 ```bash
 kubectl apply -f nvidia-device-plugin-ds.yaml
 ```

-5. Now that you successfully installed the NVIDIA device plugin, you can check that your [GPUs are schedulable](#confirm-that-gpus-are-schedulable) and [run a GPU workload](#run-a-gpu-enabled-workload).
+4. Now that you successfully installed the NVIDIA device plugin, you can check that your [GPUs are schedulable](#confirm-that-gpus-are-schedulable) and [run a GPU workload](#run-a-gpu-enabled-workload).

 ### Use NVIDIA GPU Operator with AKS
