-This article describes how to deploy an AI model on AKS Arc with the *Kubernetes AI toolchain operator* (KAITO). The AI toolchain operator runs as a cluster extension in AKS Arc and makes it easier to deploy and run open source LLM models on your AKS Arc cluster. To enable this feature, follow this workflow:
+This article describes how to deploy an AI model on AKS enabled by Azure Arc with the *Kubernetes AI toolchain operator* (KAITO). The AI toolchain operator runs as a cluster extension in AKS enabled by Azure Arc and makes it easier to deploy and run open source LLM models on your AKS enabled by Azure Arc cluster. To enable this feature, follow this workflow:
1. Create a cluster with KAITO.
1. Add a GPU node pool.
@@ -24,15 +24,15 @@ This article describes how to deploy an AI model on AKS Arc with the *Kubernetes
1. Troubleshoot as needed.
> [!IMPORTANT]
-> The KAITO Extension for AKS on Azure Local is currently in PREVIEW.
+> The KAITO Extension for AKS enabled by Azure Arc on Azure Local is currently in PREVIEW.
> See the [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
## Prerequisites
Before you begin, make sure you have the following prerequisites:
- Make sure the Azure Local cluster has a supported GPU, such as A2, A16, or T4.
-- Make sure the AKS Arc cluster can deploy GPU node pools with the corresponding GPU VM SKU. For more information, see [use GPU for compute-intensive workloads](deploy-gpu-node-pool.md).
+- Make sure the AKS enabled by Azure Arc cluster can deploy GPU node pools with the corresponding GPU VM SKU. For more information, see [use GPU for compute-intensive workloads](deploy-gpu-node-pool.md).
- Make sure that **kubectl** is installed on your local machine. If you need to install **kubectl**, see [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/).
- Install the **aksarc** extension, and make sure the version is at least 1.5.37. To get the list of installed CLI extensions, run `az extension list -o table`.
- If you use a PowerShell terminal, make sure the version is at least 7.4.
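
The **aksarc** version requirement above can be checked from a script. The following is a minimal sketch, not part of the original article: `version_ge` is a hypothetical helper name, the comparison assumes `sort -V` (GNU coreutils) is available, and the `az` query runs only when the Azure CLI is found on the machine.

```bash
#!/usr/bin/env bash
# Sketch: compare the installed aksarc extension version with the 1.5.37
# minimum listed in the prerequisites. version_ge is a hypothetical helper.

# Succeeds (exit status 0) when version $1 is greater than or equal to $2.
# Uses version-aware sorting, so 1.5.4 is treated as older than 1.5.37.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="1.5.37"

# Query the CLI only if az is installed on this machine.
if command -v az >/dev/null 2>&1; then
  installed="$(az extension show --name aksarc --query version -o tsv 2>/dev/null || true)"
  if [ -z "$installed" ]; then
    echo "aksarc extension not found; install it with: az extension add --name aksarc"
  elif version_ge "$installed" "$required"; then
    echo "aksarc $installed meets the minimum ($required)"
  else
    echo "aksarc $installed is older than $required; run: az extension update --name aksarc"
  fi
fi
```

A plain string comparison would rank `1.5.4` above `1.5.37`, which is why the sketch relies on version sorting instead.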
@@ -43,7 +43,7 @@ The AI toolchain operator extension currently supports KAITO version 0.4.5. Make
## Create a cluster with KAITO
-To create an AKS Arc cluster on Azure Local with KAITO, follow these steps:
+To create an AKS enabled by Azure Arc cluster on Azure Local with KAITO, follow these steps:
1. Gather [all required parameters](aks-create-clusters-cli.md) and include the `--enable-ai-toolchain-operator` parameter to enable KAITO as part of the cluster creation.
@@ -55,7 +55,7 @@ To create an AKS Arc cluster on Azure Local with KAITO, follow these steps:
## Update an existing cluster with KAITO
-If you want to enable KAITO on an existing AKS Arc cluster with a GPU, you can run the following command to install the KAITO operator on the existing node pool:
+If you want to enable KAITO on an existing AKS enabled by Azure Arc cluster with a GPU, you can run the following command to install the KAITO operator on the existing node pool:
```azurecli
az aksarc update --resource-group <Resource_Group_name> --name <Cluster_Name> --enable-ai-toolchain-operator
```
-Sign in to the Azure portal and find your AKS Arc cluster. Under **Settings > Node pools**, select **Add**. Fill in the other required fields, then create the node pool.
+Sign in to the Azure portal and find your AKS enabled by Azure Arc cluster. Under **Settings > Node pools**, select **Add**. Fill in the other required fields, then create the node pool.
@@ -198,16 +198,15 @@ The following table shows the supported GPU models and their corresponding VM SK
| phi-3-mini-128k-instruct | N | Y | Y |
| phi-3.5-mini-instruct | N | Y | Y |
| phi-4-mini-instruct | N | N | Y |
-| deepseek-r1-distill-llama-8b | N | N | Y |
| mistral-7b/mistral-7b-instruct | N | N | Y |
| qwen2.5-coder-7b-instruct | N | N | Y |
## Troubleshooting
1. If you try to deploy an LLM and see the error **OutOfMemoryError: CUDA out of memory**, raise an issue in the [KAITO repo](https://github.com/kaito-project/kaito/).
-1. If you see the error **(ExtensionOperationFailed) The extension operation failed with the following error: Unable to get a response from the Agent in time** during extension installation, [see this TSG](/troubleshoot/azure/azure-kubernetes/extensions/cluster-extension-deployment-errors#error-unable-to-get-a-response-from-the-agent-in-time) and ensure the extension agent in the AKS Arc cluster can connect to Azure.
+1. If you see the error **(ExtensionOperationFailed) The extension operation failed with the following error: Unable to get a response from the Agent in time** during extension installation, [see this TSG](/troubleshoot/azure/azure-kubernetes/extensions/cluster-extension-deployment-errors#error-unable-to-get-a-response-from-the-agent-in-time) and ensure the extension agent in the AKS enabled by Azure Arc cluster can connect to Azure.
1. If you see an error during prompt testing such as **{"detail":[{"type":"json_invalid","loc":["body",1],"msg":"JSON decode error","input":{},"ctx":{"error":"Expecting property name enclosed in double quotes"}}]}**, it's possible that your PowerShell terminal version is 5.1. Make sure the terminal version is at least 7.4.
## Next steps
-In this article, you learned how to deploy an AI model on AKS Arc with the Kubernetes AI toolchain operator (KAITO). For more information about the KAITO project, see the [KAITO GitHub repo](https://github.com/kaito-project/kaito).
+In this article, you learned how to deploy an AI model on AKS enabled by Azure Arc with the Kubernetes AI toolchain operator (KAITO). For more information about the KAITO project, see the [KAITO GitHub repo](https://github.com/kaito-project/kaito).