Commit 165f4df

Semantic kernel
1 parent 5727a0c commit 165f4df

File tree: 2 files changed (+238 −12 lines)
articles/ai-studio/how-to/develop/semantic-kernel.md

Lines changed: 222 additions & 0 deletions (new file)
---
title: Develop applications with Semantic Kernel and Azure AI Foundry
titleSuffix: Azure AI Foundry
description: Develop applications with Semantic Kernel and Azure AI Foundry.
author: lgayhardt
ms.author: lagayhar
ms.reviewer: taochen
ms.date: 12/04/2024
ms.topic: how-to
ms.service: azure-ai-studio
manager: scottpolly
---

# Develop applications with Semantic Kernel and Azure AI Foundry

In this article, you learn how to use [Semantic Kernel](/semantic-kernel/overview/) with models deployed from the Azure AI model catalog in Azure AI Foundry portal.

## Prerequisites

- An [Azure subscription](https://azure.microsoft.com).
- An Azure AI project as explained at [Create a project in Azure AI Foundry portal](../create-projects.md).
- A model supporting the [Azure AI model inference API](../../reference/reference-model-inference-api.md?tabs=python) deployed. This article uses a `Mistral-Large` deployment, but you can use any model of your preference. To use embedding capabilities in Semantic Kernel, you need an embedding model such as `cohere-embed-v3-multilingual`.

  - You can follow the instructions at [Deploy models as serverless APIs](../deploy-models-serverless.md).

- Python **3.10** or later installed, including pip.
- Semantic Kernel installed. You can install it with:

  ```bash
  pip install semantic-kernel
  ```

## Configure the environment

To use LLMs deployed in Azure AI Foundry portal, you need the endpoint and credentials to connect to it. Follow these steps to get the information you need from the model you want to use:

1. Go to the [Azure AI Foundry portal](https://ai.azure.com/).
1. Open the project where the model is deployed, if it isn't already open.
1. Go to **Models + endpoints** and select the model you deployed as indicated in the prerequisites.
1. Copy the endpoint URL and the key.

:::image type="content" source="../../media/how-to/inference/serverless-endpoint-url-keys.png" alt-text="Screenshot of the option to copy endpoint URI and keys from an endpoint." lightbox="../../media/how-to/inference/serverless-endpoint-url-keys.png":::

> [!TIP]
> If your model was deployed with Microsoft Entra ID support, you don't need a key.

In this scenario, we placed both the endpoint URL and key in the following environment variables:

```bash
export AZURE_AI_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
export AZURE_AI_INFERENCE_API_KEY="<your-key-goes-here>"
```

Once configured, create a client to connect to the endpoint:

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")
```

> [!TIP]
> The client automatically reads the environment variables `AZURE_AI_INFERENCE_ENDPOINT` and `AZURE_AI_INFERENCE_API_KEY` to connect to the model. However, you can also pass the endpoint and key directly to the client via the `endpoint` and `api_key` parameters on the constructor.
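
For example, here's a minimal sketch that passes both values explicitly through those constructor parameters (reading them from the same environment variables purely to show where the values come from):

```python
import os

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

# Pass the endpoint and key explicitly instead of relying on the
# AZURE_AI_INFERENCE_* environment variables being picked up automatically.
chat_completion_service = AzureAIInferenceChatCompletion(
    ai_model_id="<deployment-name>",
    endpoint=os.environ["AZURE_AI_INFERENCE_ENDPOINT"],
    api_key=os.environ["AZURE_AI_INFERENCE_API_KEY"],
)
```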

Alternatively, if your endpoint supports Microsoft Entra ID, you can use the following code to create the client:

```bash
export AZURE_AI_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
```

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")
```

> [!NOTE]
> When using Microsoft Entra ID, make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.

### Azure OpenAI models

If you're using an Azure OpenAI model, you can use the following code to create the client:

```python
from azure.ai.inference.aio import ChatCompletionsClient
from azure.identity.aio import DefaultAzureCredential

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(
    ai_model_id="<deployment-name>",
    client=ChatCompletionsClient(
        endpoint=f"{str(<your-azure-open-ai-endpoint>).strip('/')}/openai/deployments/{<deployment_name>}",
        credential=DefaultAzureCredential(),
        credential_scopes=["https://cognitiveservices.azure.com/.default"],
    ),
)
```

## Inference parameters

You can configure how inference is performed by using the `AzureAIInferenceChatPromptExecutionSettings` class:

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatPromptExecutionSettings

execution_settings = AzureAIInferenceChatPromptExecutionSettings(
    max_tokens=100,
    temperature=0.5,
    top_p=0.9,
    # extra_parameters={...}, # model-specific parameters
)
```
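
The `extra_parameters` argument carries model-specific parameters through to the deployment. As an illustrative sketch only (assuming your `Mistral-Large` deployment accepts Mistral's `safe_prompt` option; other models expose different parameters):

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatPromptExecutionSettings

# `safe_prompt` is a Mistral-specific option, used here purely as an example of
# a model-specific parameter; check your model's documentation for what it supports.
execution_settings = AzureAIInferenceChatPromptExecutionSettings(
    max_tokens=100,
    temperature=0.5,
    top_p=0.9,
    extra_parameters={"safe_prompt": True},
)
```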

## Calling the service

Let's first call the chat completion service with a simple chat history:

> [!TIP]
> Semantic Kernel is an asynchronous library, so you need to use the asyncio library to run the code.
>
> ```python
> import asyncio
>
> async def main():
>     ...
>
> if __name__ == "__main__":
>     asyncio.run(main())
> ```

```python
from semantic_kernel.contents.chat_history import ChatHistory

chat_history = ChatHistory()
chat_history.add_user_message("Hello, how are you?")

response = await chat_completion_service.get_chat_message_content(
    chat_history=chat_history,
    settings=execution_settings,
)
print(response)
```
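
Putting the pieces together, a minimal end-to-end script (assuming the environment variables from the previous section are set and `<deployment-name>` is replaced with your deployment) might look like this:

```python
import asyncio

from semantic_kernel.connectors.ai.azure_ai_inference import (
    AzureAIInferenceChatCompletion,
    AzureAIInferenceChatPromptExecutionSettings,
)
from semantic_kernel.contents.chat_history import ChatHistory


async def main():
    # The client reads AZURE_AI_INFERENCE_ENDPOINT and AZURE_AI_INFERENCE_API_KEY.
    chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")
    execution_settings = AzureAIInferenceChatPromptExecutionSettings(max_tokens=100)

    chat_history = ChatHistory()
    chat_history.add_user_message("Hello, how are you?")

    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(main())
```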

Alternatively, you can stream the response from the service:

```python
chat_history = ChatHistory()
chat_history.add_user_message("Hello, how are you?")

response = chat_completion_service.get_streaming_chat_message_content(
    chat_history=chat_history,
    settings=execution_settings,
)

chunks = []
async for chunk in response:
    chunks.append(chunk)
    print(chunk, end="")

# Streaming chunks support "+", so summing them reconstructs the full message.
full_response = sum(chunks[1:], chunks[0])
```

### Create a long-running conversation

You can create a long-running conversation by using a loop:

```python
while True:
    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)
    chat_history.add_message(response)
    chat_history.add_user_message(input("User:> "))
```
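
The loop above runs indefinitely. A variant with an explicit exit condition (the `"exit"` keyword is a hypothetical convention for this sketch, not part of Semantic Kernel) might look like this:

```python
while True:
    user_input = input("User:> ")
    if user_input == "exit":  # hypothetical convention to end the conversation
        break
    chat_history.add_user_message(user_input)

    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)
    chat_history.add_message(response)
```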

If you're streaming the response, you can use the following code:

```python
while True:
    response = chat_completion_service.get_streaming_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )

    chunks = []
    async for chunk in response:
        chunks.append(chunk)
        print(chunk, end="")

    full_response = sum(chunks[1:], chunks[0])
    chat_history.add_message(full_response)
    chat_history.add_user_message(input("User:> "))
```

## Use embeddings models

Configure your environment similarly to the previous steps, but use the `AzureAIInferenceTextEmbedding` class:

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceTextEmbedding

embedding_generation_service = AzureAIInferenceTextEmbedding(ai_model_id="<deployment-name>")
```

The following code shows how to get embeddings from the service:

```python
embeddings = await embedding_generation_service.generate_embeddings(
    texts=["My favorite color is blue.", "I love to eat pizza."],
)

for embedding in embeddings:
    print(embedding)
```
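
Because the returned embeddings are numeric vectors, you can compare them directly. Here's a small sketch (assuming `numpy` is installed; `cosine_similarity` is a helper defined inline for illustration, not part of Semantic Kernel):

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

score = cosine_similarity(embeddings[0], embeddings[1])
print(f"Similarity between the two sentences: {score:.3f}")
```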

## Related content

- [How to get started with Azure AI SDKs](sdk-overview.md)
- [Reference for Semantic Kernel model integration](/semantic-kernel/concepts/ai-services/)

articles/ai-studio/toc.yml

Lines changed: 16 additions & 12 deletions
```diff
@@ -301,6 +301,8 @@ items:
     - name: Develop with LlamaIndex
       href: how-to/develop/llama-index.md
       displayName: code,sdk
+    - name: Develop with Semantic Kernel
+      href: how-to/develop/semantic-kernel.md
     - name: Trace generative AI apps
       items:
       - name: Tracing overview
@@ -309,8 +311,6 @@ items:
         href: how-to/develop/trace-local-sdk.md
       - name: Visualize your traces
         href: how-to/develop/visualize-traces.md
-    - name: Continuously monitor your applications
-      href: how-to/online-evaluation.md
     - name: Evaluate generative AI apps
       items:
       - name: Evaluations concepts
@@ -341,16 +341,20 @@ items:
         href: concepts/a-b-experimentation.md
     - name: Deploy and monitor generative AI apps
       items:
-      - name: Deploy a flow for real-time inference
-        href: how-to/flow-deploy.md
-        displayName: endpoint
-      - name: Enable tracing and collect feedback for a flow deployment
-        href: how-to/develop/trace-production-sdk.md
-        displayName: code
-      - name: Monitor prompt flow deployments
-        href: how-to/monitor-quality-safety.md
-      - name: Troubleshoot deployments and monitoring
-        href: how-to/troubleshoot-deploy-and-monitor.md
+      - name: Continuously monitor your applications
+        href: how-to/online-evaluation.md
+      - name: Deploy and monitor flows
+        items:
+        - name: Deploy a flow for real-time inference
+          href: how-to/flow-deploy.md
+          displayName: endpoint
+        - name: Enable tracing and collect feedback for a flow deployment
+          href: how-to/develop/trace-production-sdk.md
+          displayName: code
+        - name: Monitor prompt flow deployments
+          href: how-to/monitor-quality-safety.md
+        - name: Troubleshoot deployments and monitoring
+          href: how-to/troubleshoot-deploy-and-monitor.md
     - name: Costs and quotas
       items:
       - name: Plan and manage costs
```
