
Commit a7cc56c

Author: pjsingh28
Merge pull request #371 from MicrosoftDocs/test_postgres
Merge test_postgres to main
2 parents 1814cd1 + 378b9f7 commit a7cc56c

File tree

7 files changed: +472 −1 lines changed

scenarios/AksKaito/README.md

Lines changed: 162 additions & 0 deletions

---
title: Deploy an AI model on Azure Kubernetes Service (AKS) with the AI toolchain operator (preview)
description: Learn how to enable the AI toolchain operator add-on on Azure Kubernetes Service (AKS) to simplify OSS AI model management and deployment.
ms.topic: article
ms.custom: azure-kubernetes-service, devx-track-azurecli
ms.date: 02/28/2024
author: schaffererin
ms.author: schaffererin
---

## Deploy an AI model on Azure Kubernetes Service (AKS) with the AI toolchain operator (preview)

The AI toolchain operator (KAITO) is a managed add-on for AKS that simplifies the experience of running OSS AI models on your AKS clusters. The AI toolchain operator automatically provisions the necessary GPU nodes and sets up the associated inference server as an endpoint server for your AI models. Using this add-on reduces your onboarding time and enables you to focus on AI model usage and development rather than infrastructure setup.

This article shows you how to enable the AI toolchain operator add-on and deploy an AI model on AKS.

[!INCLUDE [preview features callout](~/reusable-content/ce-skilling/azure/includes/aks/includes/preview/preview-callout.md)]

## Before you begin

* This article assumes a basic understanding of Kubernetes concepts. For more information, see [Kubernetes core concepts for AKS](./concepts-clusters-workloads.md).
* For ***all hosted model inference images*** and recommended infrastructure setup, see the [KAITO GitHub repository](https://github.com/Azure/kaito).
* The AI toolchain operator add-on currently supports KAITO version **v0.1.0**. Keep this version in mind when choosing your model from the KAITO model repository.

## Prerequisites

* If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
* If you have multiple Azure subscriptions, make sure you select the correct subscription in which the resources will be created and charged using the [az account set](https://learn.microsoft.com/en-us/cli/azure/account?view=azure-cli-latest#az-account-set) command.

> [!NOTE]
> The subscription you use must have GPU VM quota. A quick way to check your regional GPU quota is shown after this list.

* Azure CLI version 2.47.0 or later installed and configured. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI](/cli/azure/install-azure-cli).
* The Kubernetes command-line client, kubectl, installed and configured. For more information, see [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/).
* [Install the Azure CLI AKS preview extension](#install-the-azure-cli-preview-extension).
* [Register the AI toolchain operator add-on feature flag](#register-the-ai-toolchain-operator-add-on-feature-flag).

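The following is a hedged sketch of one way to inspect GPU quota in a region with `az vm list-usage`. The quota family you actually need (for example, `Standard NCSv3 Family vCPUs`) depends on the GPU VM size KAITO provisions for your chosen model, so treat the filter and region below as illustrative assumptions.

```bash
# Illustrative check of GPU vCPU quota in the region used later in this article (centralus);
# adjust the region and the grep pattern to the VM family your chosen model needs.
az vm list-usage --location centralus -o table | grep -iE "NC|ND|NV"
```
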
## Set up resource group

Set up a resource group with a random ID suffix using the [az group create](https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest#az-group-create) command.

```bash
export RANDOM_ID="$(openssl rand -hex 3)"
export AZURE_RESOURCE_GROUP="myKaitoResourceGroup$RANDOM_ID"
export REGION="centralus"
export CLUSTER_NAME="myClusterName$RANDOM_ID"

az group create \
    --name $AZURE_RESOURCE_GROUP \
    --location $REGION
```

## Install the Azure CLI preview extension

Install the Azure CLI preview extension using the [az extension add](https://learn.microsoft.com/en-us/cli/azure/extension?view=azure-cli-latest#az-extension-add) command. Then update the extension to make sure you have the latest version using the [az extension update](https://learn.microsoft.com/en-us/cli/azure/extension?view=azure-cli-latest#az-extension-update) command.

```bash
az extension add --name aks-preview
az extension update --name aks-preview
```

## Register the AI toolchain operator add-on feature flag

Register the AIToolchainOperatorPreview feature flag using the `az feature register` command. It takes a few minutes for the registration to complete.

```bash
az feature register --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview"
```

## Verify the AI toolchain operator add-on registration

Verify the registration using the [az feature show](https://learn.microsoft.com/en-us/cli/azure/feature?view=azure-cli-latest#az-feature-show) command. The loop below polls until the feature state reports **Registered**.

```bash
while true; do
    status=$(az feature show --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview" --query "properties.state" -o tsv)
    if [ "$status" == "Registered" ]; then
        break
    else
        sleep 15
    fi
done
```

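Once the feature shows as registered, the usual flow for AKS preview features is to refresh the Microsoft.ContainerService resource provider registration so the newly registered feature takes effect; this is a standard preview-feature step rather than something specific to this add-on.

```bash
# Refresh the resource provider registration so the newly registered feature is picked up
az provider register --namespace Microsoft.ContainerService
```
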
## Create an AKS cluster with the AI toolchain operator add-on enabled

Create an AKS cluster with the AI toolchain operator add-on enabled using the [az aks create](https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-create) command with the `--enable-ai-toolchain-operator` and `--enable-oidc-issuer` flags.

```bash
az aks create --location ${REGION} \
    --resource-group ${AZURE_RESOURCE_GROUP} \
    --name ${CLUSTER_NAME} \
    --enable-oidc-issuer \
    --node-os-upgrade-channel SecurityPatch \
    --auto-upgrade-channel stable \
    --enable-ai-toolchain-operator \
    --generate-ssh-keys \
    --k8s-support-plan KubernetesOfficial
```

## Connect to your cluster

Configure `kubectl` to connect to your cluster using the [az aks get-credentials](https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-get-credentials) command.

```bash
az aks get-credentials --resource-group ${AZURE_RESOURCE_GROUP} --name ${CLUSTER_NAME}
```

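As a quick sanity check, not part of the original steps, you can confirm that the credentials work by listing the cluster nodes:

```bash
# Confirm that kubectl can reach the new cluster
kubectl get nodes -o wide
```
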
## Establish a federated identity credential

Create the federated identity credential between the managed identity, AKS OIDC issuer, and subject using the [az identity federated-credential create](https://learn.microsoft.com/en-us/cli/azure/identity/federated-credential?view=azure-cli-latest) command.

```bash
export MC_RESOURCE_GROUP=$(az aks show --resource-group ${AZURE_RESOURCE_GROUP} \
    --name ${CLUSTER_NAME} \
    --query nodeResourceGroup \
    -o tsv)
export KAITO_IDENTITY_NAME="ai-toolchain-operator-${CLUSTER_NAME}"
export AKS_OIDC_ISSUER=$(az aks show --resource-group "${AZURE_RESOURCE_GROUP}" \
    --name "${CLUSTER_NAME}" \
    --query "oidcIssuerProfile.issuerUrl" \
    -o tsv)

az identity federated-credential create --name "kaito-federated-identity" \
    --identity-name "${KAITO_IDENTITY_NAME}" \
    -g "${MC_RESOURCE_GROUP}" \
    --issuer "${AKS_OIDC_ISSUER}" \
    --subject system:serviceaccount:"kube-system:kaito-gpu-provisioner" \
    --audience api://AzureADTokenExchange
```

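If you want to confirm that the credential was created (an optional check, not in the original walkthrough), you can list the federated credentials on the add-on's managed identity:

```bash
# List federated credentials on the KAITO add-on managed identity
az identity federated-credential list \
    --identity-name "${KAITO_IDENTITY_NAME}" \
    --resource-group "${MC_RESOURCE_GROUP}" \
    -o table
```
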
## Verify that your deployment is running

Restart the KAITO GPU provisioner deployment using the `kubectl rollout restart` command:

```bash
kubectl rollout restart deployment/kaito-gpu-provisioner -n kube-system
```

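To confirm that the restarted deployment comes back up healthy, a reasonable follow-up that isn't spelled out in the original, wait for the rollout to finish:

```bash
# Wait until the GPU provisioner deployment reports a successful rollout
kubectl rollout status deployment/kaito-gpu-provisioner -n kube-system --timeout=5m
```
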
## Deploy a default hosted AI model

Deploy the Falcon 7B-instruct model from the KAITO model repository using the `kubectl apply` command.

```bash
kubectl apply -f https://raw.githubusercontent.com/Azure/kaito/main/examples/inference/kaito_workspace_falcon_7b-instruct.yaml
```

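For a sense of what that manifest contains, here is a rough sketch of a KAITO workspace resource. The field names and values below are recalled from the KAITO examples and may not match the pinned v0.1.0 schema exactly, so treat the upstream YAML as the source of truth.

```bash
# Hedged sketch of a KAITO workspace (fields and values approximate; prefer the upstream example YAML).
# This only prints the manifest for inspection; it does not apply it.
cat <<'EOF'
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b-instruct
resource:
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b-instruct
inference:
  preset:
    name: "falcon-7b-instruct"
EOF
```
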
## Ask a question

1. Verify that the deployment has finished: `kubectl get workspace workspace-falcon-7b-instruct -w`
2. Store the service IP address: `export SERVICE_IP=$(kubectl get svc workspace-falcon-7b-instruct -o jsonpath='{.spec.clusterIP}')`
3. Ask a question: `kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- curl -X POST http://$SERVICE_IP/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"`

```bash
echo "See last step for details on how to ask questions to the model."
```

## Next steps

For more inference model options, see the [KAITO GitHub repository](https://github.com/Azure/kaito).

Lines changed: 149 additions & 0 deletions

---
title: 'Quickstart: Deploy a Postgres vector database'
description: Set up a Postgres vector database and OpenAI resources to run a RAG-LLM model.
ms.topic: quickstart
ms.date: 09/06/2024
author: aamini7
ms.author: ariaamini
ms.custom: innovation-engine, linux-related-content
---

## Introduction

In this doc, we go over how to host the infrastructure required to run a basic LLM model with RAG capabilities on Azure. We first set up a Postgres database capable of storing vector embeddings for the documents/knowledge files that we want to use to augment our queries. We then create an Azure OpenAI deployment capable of generating embeddings and answering questions using the latest 'gpt-4-turbo' model.

We then use a Python script to fill our Postgres database with embeddings from a sample "knowledge.txt" file containing information about an imaginary resource called 'Zytonium'. Once the database is filled with those embeddings, we use the same Python script to answer any questions we have about 'Zytonium'. The script searches the database for information relevant to our query using an embeddings search, augments the query with that information, and then sends it to our LLM to answer.

## Set up resource group

Set up a resource group with a random ID.

```bash
export RANDOM_ID="$(openssl rand -hex 3)"
export RG_NAME="myPostgresResourceGroup$RANDOM_ID"
export REGION="centralus"

az group create \
    --name $RG_NAME \
    --location $REGION
```

## Create OpenAI resources

Create the Azure OpenAI resource.

```bash
export OPEN_AI_SERVICE_NAME="openai-service-$RANDOM_ID"
export EMBEDDING_MODEL="text-embedding-ada-002"
export CHAT_MODEL="gpt-4-turbo-2024-04-09"

az cognitiveservices account create \
    --name $OPEN_AI_SERVICE_NAME \
    --resource-group $RG_NAME \
    --location westus \
    --kind OpenAI \
    --sku s0
```

## Create OpenAI deployments

Create an embedding model deployment and a chat model deployment on the OpenAI resource.

```bash
export EMBEDDING_MODEL="text-embedding-ada-002"
export CHAT_MODEL="gpt-4"

az cognitiveservices account deployment create \
    --name $OPEN_AI_SERVICE_NAME \
    --resource-group $RG_NAME \
    --deployment-name $EMBEDDING_MODEL \
    --model-name $EMBEDDING_MODEL \
    --model-version "2" \
    --model-format OpenAI \
    --sku-capacity "1" \
    --sku-name "Standard"

az cognitiveservices account deployment create \
    --name $OPEN_AI_SERVICE_NAME \
    --resource-group $RG_NAME \
    --deployment-name $CHAT_MODEL \
    --model-name $CHAT_MODEL \
    --model-version "turbo-2024-04-09" \
    --model-format OpenAI \
    --sku-capacity "1" \
    --sku-name "Standard"
```

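If you want to double-check that both deployments exist before continuing (an optional step, not in the original flow), you can list them:

```bash
# List the deployments created on the OpenAI resource
az cognitiveservices account deployment list \
    --name $OPEN_AI_SERVICE_NAME \
    --resource-group $RG_NAME \
    -o table
```
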
## Create Database

Create an Azure Database for PostgreSQL flexible server and database.

```bash
export POSTGRES_SERVER_NAME="mydb$RANDOM_ID"
export PGHOST="${POSTGRES_SERVER_NAME}.postgres.database.azure.com"
export PGUSER="dbadmin$RANDOM_ID"
export PGPORT=5432
export PGDATABASE="azure-ai-demo"
export PGPASSWORD="$(openssl rand -base64 32)"

az postgres flexible-server create \
    --admin-password $PGPASSWORD \
    --admin-user $PGUSER \
    --location $REGION \
    --name $POSTGRES_SERVER_NAME \
    --database-name $PGDATABASE \
    --resource-group $RG_NAME \
    --sku-name Standard_B2s \
    --storage-auto-grow Disabled \
    --storage-size 32 \
    --tier Burstable \
    --version 16 \
    --yes -o JSON \
    --public-access 0.0.0.0
```

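Because the `PG*` environment variables above are the ones `psql` reads by default, you can optionally verify connectivity now. This check isn't in the original steps and assumes the shell you're working in can reach the server (the later steps run `psql` from this same shell).

```bash
# Optional connectivity check; psql picks up PGHOST, PGUSER, PGPASSWORD, and PGDATABASE from the environment
psql -c "SELECT version();"
```
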
## Enable the Postgres vector extension

Set up the vector extension for Postgres to allow storing vectors/embeddings, then create the table and index that the chatbot uses.

```bash
az postgres flexible-server parameter set \
    --resource-group $RG_NAME \
    --server-name $POSTGRES_SERVER_NAME \
    --name azure.extensions --value vector

psql -c "CREATE EXTENSION IF NOT EXISTS vector;"

psql \
    -c "CREATE TABLE embeddings(id int PRIMARY KEY, data text, embedding vector(1536));" \
    -c "CREATE INDEX ON embeddings USING hnsw (embedding vector_ip_ops);"
```

## Populate with data from knowledge file

The chatbot uses a local file called "knowledge.txt" as the sample document: it generates embeddings for the file and stores them in the newly created Postgres database. Any questions you ask are then augmented with the most relevant pieces of context from "knowledge.txt", found by searching the stored embeddings. The "knowledge.txt" file describes a fictional material called Zytonium. You can view the full knowledge.txt and the code for the chatbot in the "scenarios/PostgresRagLlmDemo" directory.

```bash
export ENDPOINT=$(az cognitiveservices account show --name $OPEN_AI_SERVICE_NAME --resource-group $RG_NAME | jq -r .properties.endpoint)
export API_KEY=$(az cognitiveservices account keys list --name $OPEN_AI_SERVICE_NAME --resource-group $RG_NAME | jq -r .key1)

cd ~/scenarios/PostgresRagLlmDemo
pip install -r requirements.txt
python chat.py --populate --api-key $API_KEY --endpoint $ENDPOINT --pguser $PGUSER --phhost $PGHOST --pgpassword $PGPASSWORD --pgdatabase $PGDATABASE
```

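Under the hood, retrieval against the populated table is an inner-product nearest-neighbour search (the HNSW index above was created with `vector_ip_ops`). The exact query chat.py issues isn't shown here, but a hedged, runnable sketch of that kind of search, using one stored embedding as a stand-in for a real query embedding, looks like this:

```bash
# Hedged sketch of a pgvector inner-product search ("<#>" is pgvector's negative inner product operator).
# A real query embedding would come from the OpenAI embeddings API; here one stored row stands in for it.
psql -c "SELECT id, left(data, 80) AS snippet
         FROM embeddings
         ORDER BY embedding <#> (SELECT embedding FROM embeddings ORDER BY id LIMIT 1)
         LIMIT 3;"
```
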
## Run the chatbot

This final step prints the command you can copy and paste into the terminal to run the chatbot: `cd ~/scenarios/PostgresRagLlmDemo && python chat.py --api-key $API_KEY --endpoint $ENDPOINT --pguser $PGUSER --phhost $PGHOST --pgpassword $PGPASSWORD --pgdatabase $PGDATABASE`

```bash
echo "
To run the chatbot, see the last step for more info.
"
```
