Commit 3553031

Merge pull request #267330 from tamram/tamram-263328
new PR: Added AI toolchain operator doc back into AKS documentation
2 parents a0591c7 + 97af8f4 commit 3553031

4 files changed

+251
-6
lines changed

.openpublishing.redirection.azure-kubernetes-service.json

Lines changed: 0 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -461,11 +461,6 @@
461461
"source_path_from_root": "/articles/aks/command-invoke.md",
462462
"redirect_url": "/azure/aks/access-private-cluster",
463463
"redirect_document_id": false
464-
},
465-
{
466-
"source_path_from_root": "/articles/aks/ai-toolchain-operator.md",
467-
"redirect_url": "https://azure.microsoft.com/updates/preview-ai-toolchain-operator-addon-for-aks/",
468-
"redirect_document_id": false
469464
}
470465
]
471466
}

articles/aks/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -783,6 +783,8 @@
783783
href: open-ai-quickstart.md
784784
- name: Secure access to Azure OpenAI from Azure Kubernetes Service (AKS)
785785
href: open-ai-secure-access-quickstart.md
786+
- name: Deploy an AI model with the AI toolchain operator
787+
href: ai-toolchain-operator.md
786788
- name: DevOps
787789
items:
788790
- name: Azure DevOps Project

articles/aks/ai-toolchain-operator.md

Lines changed: 246 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,246 @@
1+
---
2+
title: Deploy an AI model on Azure Kubernetes Service (AKS) with the AI toolchain operator (preview)
3+
description: Learn how to enable the AI toolchain operator add-on on Azure Kubernetes Service (AKS) to simplify OSS AI model management and deployment.
4+
ms.topic: article
5+
ms.custom: azure-kubernetes-service
6+
ms.date: 02/28/2024
7+
---
8+
9+
# Deploy an AI model on Azure Kubernetes Service (AKS) with the AI toolchain operator (preview)
10+
11+
The AI toolchain operator (KAITO) is a managed add-on for AKS that simplifies running OSS AI models on your AKS clusters. The AI toolchain operator automatically provisions the necessary GPU nodes and sets up the associated inference server as an endpoint for your AI models. Using this add-on reduces your onboarding time and lets you focus on AI model usage and development rather than infrastructure setup.
12+
13+
This article shows you how to enable the AI toolchain operator add-on and deploy an AI model on AKS.
14+
15+
[!INCLUDE [preview features callout](./includes/preview/preview-callout.md)]
16+
17+
## Before you begin
18+
19+
* This article assumes a basic understanding of Kubernetes concepts. For more information, see [Kubernetes core concepts for AKS](./concepts-clusters-workloads.md).
20+
* For ***all hosted model inference images*** and recommended infrastructure setup, see the [KAITO GitHub repository](https://github.com/Azure/kaito).
21+
22+
## Prerequisites
23+
24+
* If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
25+
* If you have multiple Azure subscriptions, make sure you select the correct subscription in which the resources will be created and charged using the [az account set][az-account-set] command.
26+
27+
> [!NOTE]
28+
> The subscription you use must have GPU VM quota.
29+
30+
* Azure CLI version 2.47.0 or later installed and configured. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI](/cli/azure/install-azure-cli).
31+
* The Kubernetes command-line client, kubectl, installed and configured. For more information, see [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/).
32+
* [Install the Azure CLI AKS preview extension](#install-the-azure-cli-preview-extension).
33+
* [Register the AI toolchain operator add-on feature flag](#register-the-ai-toolchain-operator-add-on-feature-flag).
34+
35+
### Install the Azure CLI preview extension
36+
37+
1. Install the Azure CLI preview extension using the [az extension add][az-extension-add] command.
38+
39+
```azurecli-interactive
40+
az extension add --name aks-preview
41+
```
42+
43+
2. Update the extension to make sure you have the latest version using the [az extension update][az-extension-update] command.
44+
45+
```azurecli-interactive
46+
az extension update --name aks-preview
47+
```
48+
49+
### Register the AI toolchain operator add-on feature flag
50+
51+
1. Register the AIToolchainOperatorPreview feature flag using the [az feature register][az-feature-register] command.
52+
53+
```azurecli-interactive
54+
az feature register --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview"
55+
```
56+
57+
It takes a few minutes for the registration to complete.
58+
59+
2. Verify the registration using the [az feature show][az-feature-show] command.
60+
61+
```azurecli-interactive
62+
az feature show --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview"
63+
```
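Because registration completes asynchronously, it can help to poll until the reported state reads `Registered`. The following is a minimal sketch rather than part of the documented procedure: `wait_for_state` is a hypothetical helper, and the commented usage assumes that `az feature show` reports the state under `properties.state`.

```shell
# Hypothetical helper: run a command repeatedly until it prints the wanted state.
# Usage: wait_for_state <wanted-state> <max-attempts> <delay-seconds> <command...>
wait_for_state() {
  want="$1"; attempts="$2"; delay="$3"
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Capture whatever the command prints; errors are ignored and retried.
    state="$("$@" 2>/dev/null)"
    if [ "$state" = "$want" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Assumed usage (requires the Azure CLI and a signed-in session):
# wait_for_state Registered 30 20 \
#   az feature show --namespace "Microsoft.ContainerService" \
#     --name "AIToolchainOperatorPreview" --query "properties.state" -o tsv
```

The helper takes the command as arguments so the same loop can be reused for other state checks.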
64+
65+
### Export environment variables
66+
67+
* To simplify the configuration steps in this article, you can define environment variables using the following commands. Make sure to replace the placeholder values with your own.
68+
69+
```azurecli-interactive
70+
export AZURE_SUBSCRIPTION_ID="mySubscriptionID"
71+
export AZURE_RESOURCE_GROUP="myResourceGroup"
72+
export AZURE_LOCATION="myLocation"
73+
export CLUSTER_NAME="myClusterName"
74+
```
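Later commands expand these variables directly, so an unset or empty value tends to surface as a confusing CLI error. As a rough sketch (the `require_vars` helper is hypothetical and not part of the Azure CLI), you could check them up front:

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage with the variables defined in this article:
# require_vars AZURE_SUBSCRIPTION_ID AZURE_RESOURCE_GROUP AZURE_LOCATION CLUSTER_NAME
```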
75+
76+
## Enable the AI toolchain operator add-on on an AKS cluster
77+
78+
The following sections describe how to create an AKS cluster with the AI toolchain operator add-on enabled and deploy a default hosted AI model.
79+
80+
### Create an AKS cluster with the AI toolchain operator add-on enabled
81+
82+
1. Create an Azure resource group using the [az group create][az-group-create] command.
83+
84+
```azurecli-interactive
85+
az group create --name ${AZURE_RESOURCE_GROUP} --location ${AZURE_LOCATION}
86+
```
87+
88+
2. Create an AKS cluster with the AI toolchain operator add-on enabled using the [az aks create][az-aks-create] command with the `--enable-ai-toolchain-operator` and `--enable-oidc-issuer` flags.
89+
90+
```azurecli-interactive
91+
az aks create --location ${AZURE_LOCATION} \
92+
--resource-group ${AZURE_RESOURCE_GROUP} \
93+
--name ${CLUSTER_NAME} \
94+
--enable-oidc-issuer \
95+
--enable-ai-toolchain-operator
96+
```
97+
98+
> [!NOTE]
99+
> AKS creates a managed identity once you enable the AI toolchain operator add-on. The managed identity is used to create GPU node pools in the managed AKS cluster. You must grant it the required permissions manually by following the steps in the sections below.
100+
>
101+
> Enabling the AI toolchain operator requires the OIDC issuer to be enabled.
102+
103+
3. On an existing AKS cluster, you can enable the AI toolchain operator add-on using the [az aks update][az-aks-update] command.
104+
105+
```azurecli-interactive
106+
az aks update --name ${CLUSTER_NAME} \
107+
--resource-group ${AZURE_RESOURCE_GROUP} \
108+
--enable-oidc-issuer \
109+
--enable-ai-toolchain-operator
110+
```
111+
112+
## Connect to your cluster
113+
114+
1. Configure `kubectl` to connect to your cluster using the [az aks get-credentials][az-aks-get-credentials] command.
115+
116+
```azurecli-interactive
117+
az aks get-credentials --resource-group ${AZURE_RESOURCE_GROUP} --name ${CLUSTER_NAME}
118+
```
119+
120+
2. Verify the connection to your cluster using the `kubectl get` command.
121+
122+
```azurecli-interactive
123+
kubectl get nodes
124+
```
125+
126+
## Export environment variables
127+
128+
* Export environment variables for the MC (node) resource group, the principal ID of the managed identity, and the KAITO identity name using the following commands:
129+
130+
```azurecli-interactive
131+
export MC_RESOURCE_GROUP=$(az aks show --resource-group ${AZURE_RESOURCE_GROUP} \
132+
--name ${CLUSTER_NAME} \
133+
--query nodeResourceGroup \
134+
-o tsv)
135+
export PRINCIPAL_ID=$(az identity show --name "ai-toolchain-operator-${CLUSTER_NAME}" \
136+
--resource-group "${MC_RESOURCE_GROUP}" \
137+
--query 'principalId' \
138+
-o tsv)
139+
export KAITO_IDENTITY_NAME="ai-toolchain-operator-${CLUSTER_NAME}"
140+
```
141+
142+
## Get the AKS OpenID Connect (OIDC) Issuer
143+
144+
* Get the AKS OIDC Issuer URL and export it as an environment variable:
145+
146+
```azurecli-interactive
147+
export AKS_OIDC_ISSUER=$(az aks show --resource-group "${AZURE_RESOURCE_GROUP}" \
148+
--name "${CLUSTER_NAME}" \
149+
--query "oidcIssuerProfile.issuerUrl" \
150+
-o tsv)
151+
```
152+
153+
## Create role assignment for the service principal
154+
155+
* Create a new role assignment for the service principal using the [az role assignment create][az-role-assignment-create] command.
156+
157+
```azurecli-interactive
158+
az role assignment create --role "Contributor" \
159+
--assignee "${PRINCIPAL_ID}" \
160+
--scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
161+
```
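The `--scope` value is a resource ID assembled from the subscription ID and resource group name. The sketch below only illustrates that string's shape: `rg_scope` is a hypothetical helper, and the commented check assumes you want to list the assignments at that scope afterwards with `az role assignment list`.

```shell
# Hypothetical helper: build the resource-group scope string used above.
# Usage: rg_scope <subscription-id> <resource-group>
rg_scope() {
  printf '/subscriptions/%s/resourcegroups/%s\n' "$1" "$2"
}

# Assumed follow-up check (requires the Azure CLI and a signed-in session):
# az role assignment list --assignee "${PRINCIPAL_ID}" \
#   --scope "$(rg_scope "${AZURE_SUBSCRIPTION_ID}" "${AZURE_RESOURCE_GROUP}")" -o table
```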
162+
163+
## Establish a federated identity credential
164+
165+
* Create the federated identity credential between the managed identity, AKS OIDC issuer, and subject using the [az identity federated-credential create][az-identity-federated-credential-create] command.
166+
167+
```azurecli-interactive
168+
az identity federated-credential create --name "kaito-federated-identity" \
169+
--identity-name "${KAITO_IDENTITY_NAME}" \
170+
-g "${MC_RESOURCE_GROUP}" \
171+
--issuer "${AKS_OIDC_ISSUER}" \
172+
--subject "system:serviceaccount:kube-system:kaito-gpu-provisioner" \
173+
--audience api://AzureADTokenExchange
174+
```
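The `--subject` value follows the Kubernetes service account naming pattern `system:serviceaccount:<namespace>:<name>`. As a small illustration, the hypothetical `sa_subject` helper below (not part of any CLI) composes that string from its two parts:

```shell
# Hypothetical helper: compose a service account subject string.
# Usage: sa_subject <namespace> <service-account-name>
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Prints the subject used for the KAITO GPU provisioner in this article.
sa_subject kube-system kaito-gpu-provisioner
```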
175+
176+
## Verify that your deployment is running
177+
178+
1. Restart the KAITO GPU provisioner deployment using the `kubectl rollout restart` command:
179+
180+
```azurecli-interactive
181+
kubectl rollout restart deployment/kaito-gpu-provisioner -n kube-system
182+
```
183+
184+
2. Verify that the deployment is running using the `kubectl get` command:
185+
186+
```azurecli-interactive
187+
kubectl get deployment -n kube-system | grep kaito
188+
```
189+
190+
## Deploy a default hosted AI model
191+
192+
1. Deploy the Falcon 7B model YAML file from the GitHub repository using the `kubectl apply` command.
193+
194+
```azurecli-interactive
195+
kubectl apply -f https://raw.githubusercontent.com/Azure/kaito/main/examples/kaito_workspace_falcon_7b.yaml
196+
```
197+
198+
2. Track the live resource changes in your workspace using the `kubectl get` command.
199+
200+
```azurecli-interactive
201+
kubectl get workspace workspace-falcon-7b -w
202+
```
203+
204+
> [!NOTE]
205+
> As you track the live resource changes in your workspace, note that machine readiness can take up to 10 minutes, and workspace readiness up to 20 minutes.
206+
207+
3. Check your service and get the service IP address using the `kubectl get svc` command.
208+
209+
```azurecli-interactive
210+
export SERVICE_IP=$(kubectl get svc workspace-falcon-7b -o jsonpath='{.spec.clusterIP}')
211+
```
212+
213+
4. Run the Falcon 7B model with a sample input of your choice using the following `curl` command:
214+
215+
```azurecli-interactive
216+
kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- curl -X POST http://$SERVICE_IP/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"
217+
```
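The request body has to remain valid JSON after the shell has processed its quoting, which is easy to get wrong when the prompt itself contains double quotes. The following sketch uses a hypothetical `json_prompt` helper to build the body; it escapes only backslashes and double quotes, so prompts containing newlines or other control characters would need a proper JSON tool.

```shell
# Hypothetical helper: wrap a prompt string in a {"prompt": ...} JSON body.
# Escapes backslashes and double quotes only.
json_prompt() {
  escaped="$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')"
  printf '{"prompt":"%s"}\n' "$escaped"
}

# Assumed usage with the service IP exported in the previous step:
# kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- \
#   curl -X POST "http://${SERVICE_IP}/chat" \
#     -H "accept: application/json" -H "Content-Type: application/json" \
#     -d "$(json_prompt 'YOUR QUESTION HERE')"
```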
218+
219+
## Clean up resources
220+
221+
If you no longer need these resources, you can delete them to avoid incurring extra Azure charges.
222+
223+
* Delete the resource group and its associated resources using the [az group delete][az-group-delete] command.
224+
225+
```azurecli-interactive
226+
az group delete --name "${AZURE_RESOURCE_GROUP}" --yes --no-wait
227+
```
228+
229+
## Next steps
230+
231+
For more inference model options, see the [KAITO GitHub repository](https://github.com/Azure/kaito).
232+
233+
<!-- LINKS -->
234+
[az-group-create]: /cli/azure/group#az_group_create
235+
[az-group-delete]: /cli/azure/group#az_group_delete
236+
[az-aks-create]: /cli/azure/aks#az_aks_create
237+
[az-aks-update]: /cli/azure/aks#az_aks_update
238+
[az-aks-get-credentials]: /cli/azure/aks#az_aks_get_credentials
239+
[az-role-assignment-create]: /cli/azure/role/assignment#az_role_assignment_create
240+
[az-identity-federated-credential-create]: /cli/azure/identity/federated-credential#az_identity_federated_credential_create
241+
[az-account-set]: /cli/azure/account#az_account_set
242+
[az-extension-add]: /cli/azure/extension#az_extension_add
243+
[az-extension-update]: /cli/azure/extension#az_extension_update
244+
[az-feature-register]: /cli/azure/feature#az_feature_register
245+
[az-feature-show]: /cli/azure/feature#az_feature_show
246+
[az-provider-register]: /cli/azure/provider#az_provider_register

articles/aks/index.yml

Lines changed: 3 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,7 @@ metadata:
1515

1616
landingContent:
1717
# Cards and links should be based on top customer tasks or top subjects
18-
# Start card title with a verb
18+
# Start card title with a verb
1919
# Card (optional)
2020
- title: About Azure Kubernetes Service (AKS)
2121
linkLists:
@@ -27,6 +27,8 @@ landingContent:
2727
url: /azure/architecture/reference-architectures/containers/aks-start-here
2828
- linkListType: whats-new
2929
links:
30+
- text: Deploy an AI model on AKS with the AI toolchain operator (Preview)
31+
url: ai-toolchain-operator.md
3032
- text: Upgrade multiple AKS clusters using Azure Kubernetes Fleet Manager
3133
url: /azure/kubernetes-fleet/update-orchestration
3234
- text: Reduce image pull time with Artifact Streaming on AKS (Preview)
