
Commit 7824e4b

Incorp last feedback
1 parent 1e6f33c commit 7824e4b

1 file changed

+13
-14
lines changed

AKS-Arc/deploy-ai-model.md

@@ -1,20 +1,20 @@
 ---
-title: Deploy an AI model on AKS Arc with the Kubernetes AI toolchain operator (preview)
-description: Learn how to deploy an AI model on AKS Arc with the Kubernetes AI toolchain operator (KAITO).
+title: Deploy an AI model on AKS enabled by Azure Arc with the Kubernetes AI toolchain operator (preview)
+description: Learn how to deploy an AI model on AKS enabled by Azure Arc with the Kubernetes AI toolchain operator (KAITO).
 author: sethmanheim
 ms.author: sethm
 ms.topic: how-to
-ms.date: 05/19/2025
+ms.date: 05/20/2025
 ms.reviewer: haojiehang
-ms.lastreviewed: 05/14/2025
+ms.lastreviewed: 05/20/2025
 
 ---
 
-# Deploy an AI model on AKS Arc with the Kubernetes AI toolchain operator (preview)
+# Deploy an AI model on AKS enabled by Azure Arc with the Kubernetes AI toolchain operator (preview)
 
 [!INCLUDE [hci-applies-to-23h2](includes/hci-applies-to-23h2.md)]
 
-This article describes how to deploy an AI model on AKS Arc with the *Kubernetes AI toolchain operator* (KAITO). The AI toolchain operator runs as a cluster extension in AKS Arc and makes it easier to deploy and run open source LLM models on your AKS Arc cluster. To enable this feature, follow this workflow:
+This article describes how to deploy an AI model on AKS enabled by Azure Arc with the *Kubernetes AI toolchain operator* (KAITO). The AI toolchain operator runs as a cluster extension in AKS enabled by Azure Arc and makes it easier to deploy and run open source LLM models on your AKS enabled by Azure Arc cluster. To enable this feature, follow this workflow:
 
 1. Create a cluster with KAITO.
 1. Add a GPU node pool.
@@ -24,15 +24,15 @@ This article describes how to deploy an AI model on AKS Arc with the *Kubernetes
 1. Troubleshoot as needed.
 
 > [!IMPORTANT]
-> The KAITO Extension for AKS on Azure Local is currently in PREVIEW.
+> The KAITO Extension for AKS enabled by Azure Arc on Azure Local is currently in PREVIEW.
 > See the [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
 
 ## Prerequisites
 
 Before you begin, make sure you have the following prerequisites:
 
 - Make sure the Azure Local cluster has a supported GPU, such as A2, A16, or T4.
-- Make sure the AKS Arc cluster can deploy GPU node pools with the corresponding GPU VM SKU. For more information, see [use GPU for compute-intensive workloads](deploy-gpu-node-pool.md).
+- Make sure the AKS enabled by Azure Arc cluster can deploy GPU node pools with the corresponding GPU VM SKU. For more information, see [use GPU for compute-intensive workloads](deploy-gpu-node-pool.md).
 - Make sure that **kubectl** is installed on your local machine. If you need to install **kubectl**, see [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/).
 - Install the **aksarc** extension, and make sure the version is at least 1.5.37. To get the list of installed CLI extensions, run `az extension list -o table`.
 - If you use a PowerShell terminal, make sure the version is at least 7.4.
@@ -43,7 +43,7 @@ The AI toolchain operator extension currently supports KAITO version 0.4.5. Make
 
 ## Create a cluster with KAITO
 
-To create an AKS Arc cluster on Azure Local with KAITO, follow these steps:
+To create an AKS enabled by Azure Arc cluster on Azure Local with KAITO, follow these steps:
 
 1. Gather [all required parameters](aks-create-clusters-cli.md) and include the `--enable-ai-toolchain-operator` parameter to enable KAITO as part of the cluster creation.
 
@@ -55,7 +55,7 @@ To create an AKS Arc cluster on Azure Local with KAITO, follow these steps:
 
 ## Update an existing cluster with KAITO
 
-If you want to enable KAITO on an existing AKS Arc cluster with a GPU, you can run the following command to install the KAITO operator on the existing node pool:
+If you want to enable KAITO on an existing AKS enabled by Azure Arc cluster with a GPU, you can run the following command to install the KAITO operator on the existing node pool:
 
 ```azurecli
 az aksarc update --resource-group <Resource_Group_name> --name <Cluster_Name> --enable-ai-toolchain-operator
@@ -67,7 +67,7 @@ az aksarc update --resource-group <Resource_Group_name> --name <Cluster_Name> --
 
 ### [Azure portal](#tab/portal)
 
-Sign in to the Azure portal and find your AKS Arc cluster. Under **Settings > Node pools**, select **Add**. Fill in the other required fields, then create the node pool.
+Sign in to the Azure portal and find your AKS enabled by Azure Arc cluster. Under **Settings > Node pools**, select **Add**. Fill in the other required fields, then create the node pool.
 
 :::image type="content" source="media/deploy-ai-model/add-gpu-node-pool.png" alt-text="Screenshot of portal showing add GPU node pool." lightbox="media/deploy-ai-model/add-gpu-node-pool.png":::
 
@@ -198,16 +198,15 @@ The following table shows the supported GPU models and their corresponding VM SK
 | phi-3-mini-128k-instruct | N | Y | Y |
 | phi-3.5-mini-instruct | N | Y | Y |
 | phi-4-mini-instruct | N | N | Y |
-| deepseek-r1-distill-llama-8b | N | N | Y |
 | mistral-7b/mistral-7b-instruct | N | N | Y |
 | qwen2.5-coder-7b-instruct | N | N | Y |
 
 ## Troubleshooting
 
 1. If you want to deploy an LLM and see the error **OutOfMemoryError: CUDA out of memory**, please raise an issue in the [KAITO repo](https://github.com/kaito-project/kaito/).
-1. If you see the error **(ExtensionOperationFailed) The extension operation failed with the following error: Unable to get a response from the Agent in time** during extension installation, [see this TSG](/troubleshoot/azure/azure-kubernetes/extensions/cluster-extension-deployment-errors#error-unable-to-get-a-response-from-the-agent-in-time) and ensure the extension agent in the AKS Arc cluster can connect to Azure.
+1. If you see the error **(ExtensionOperationFailed) The extension operation failed with the following error: Unable to get a response from the Agent in time** during extension installation, [see this TSG](/troubleshoot/azure/azure-kubernetes/extensions/cluster-extension-deployment-errors#error-unable-to-get-a-response-from-the-agent-in-time) and ensure the extension agent in the AKS enabled by Azure Arc cluster can connect to Azure.
 1. If you see an error during prompt testing such as **{"detail":[{"type":"json_invalid","loc":["body",1],"msg":"JSON decode error","input":{},"ctx":{"error":"Expecting property name enclosed in double quotes"}}]}**, it's possible that your PowerShell terminal version is 5.1. Make sure the terminal version is at least 7.4.
 
 ## Next steps
 
-In this article, you learned how to deploy an AI model on AKS Arc with the Kubernetes AI toolchain operator (KAITO). For more information about the KAITO project, see the [KAITO GitHub repo](https://github.com/kaito-project/kaito).
+In this article, you learned how to deploy an AI model on AKS enabled by Azure Arc with the Kubernetes AI toolchain operator (KAITO). For more information about the KAITO project, see the [KAITO GitHub repo](https://github.com/kaito-project/kaito).
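
The prerequisites in this file require the **aksarc** CLI extension to be at least version 1.5.37. The following is a minimal sketch of how that check might be scripted; it is not part of the article or of this commit. The `version_at_least` helper is a hypothetical name introduced here, and the comparison assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). The `az extension show` call runs only when the Azure CLI is on the path.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the article): succeeds when version $1 >= version $2.
# Assumes a sort implementation that supports -V (version sort).
version_at_least() {
  local lowest
  lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
  [ "$lowest" = "$2" ]
}

required="1.5.37"  # minimum aksarc extension version stated in the prerequisites

if command -v az >/dev/null 2>&1; then
  # Query the installed aksarc extension version; stays empty if the extension is missing.
  installed="$(az extension show --name aksarc --query version --output tsv 2>/dev/null || true)"
  if [ -n "$installed" ] && version_at_least "$installed" "$required"; then
    echo "aksarc $installed meets the minimum ($required)."
  else
    echo "aksarc is missing or older than $required (found: ${installed:-none})."
  fi
else
  echo "Azure CLI not found; skipping the aksarc version check."
fi
```

Version sort is used instead of a plain string comparison so that a release such as 1.10.0 is correctly treated as newer than 1.5.37.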
