Commit 4041fce

Merge pull request #1961 from lgayhardt/semantickernel1224
AI Foundry: Semantic kernel
2 parents 7b84f05 + e6d6c2d commit 4041fce

2 files changed: +228 −0 lines changed

Lines changed: 226 additions & 0 deletions
@@ -0,0 +1,226 @@
---
title: Develop applications with Semantic Kernel and Azure AI Foundry
titleSuffix: Azure AI Foundry
description: Develop applications with Semantic Kernel and Azure AI Foundry.
author: lgayhardt
ms.author: lagayhar
ms.reviewer: taochen
ms.date: 12/04/2024
ms.topic: how-to
ms.service: azure-ai-studio
manager: scottpolly
---

# Develop applications with Semantic Kernel and Azure AI Foundry

In this article, you learn how to use [Semantic Kernel](/semantic-kernel/overview/) with models deployed from the Azure AI model catalog in Azure AI Foundry portal.

## Prerequisites

- An [Azure subscription](https://azure.microsoft.com).
- An Azure AI project as explained at [Create a project in Azure AI Foundry portal](../create-projects.md).
- A model supporting the [Azure AI model inference API](../../reference/reference-model-inference-api.md?tabs=python) deployed. In this example, we use a `Mistral-Large` deployment, but you can use any model of your preference. To use embeddings capabilities in Semantic Kernel, you need an embedding model such as `cohere-embed-v3-multilingual`.

  - You can follow the instructions at [Deploy models as serverless APIs](../deploy-models-serverless.md).

- Python **3.10** or later installed, including pip.
- Semantic Kernel installed. You can install it with:

  ```bash
  pip install semantic-kernel
  ```

- In this example, we're working with the Azure AI model inference API, so we install the relevant Azure dependencies. You can install them with:

  ```bash
  pip install semantic-kernel[azure]
  ```
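
If you want to confirm the installation succeeded, a quick sanity check (a convenience step, not part of the original instructions) is to print the installed package version:

```python
from importlib.metadata import version

# Prints the installed semantic-kernel version, for example "1.x.y".
print(version("semantic-kernel"))
```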

## Configure the environment

To use LLMs deployed in Azure AI Foundry portal, you need the endpoint and credentials to connect to it. Follow these steps to get the information you need from the model you want to use:

1. Go to the [Azure AI Foundry portal](https://ai.azure.com/).
1. Open the project where the model is deployed, if it isn't already open.
1. Go to **Models + endpoints** and select the model you deployed as indicated in the prerequisites.
1. Copy the endpoint URL and the key.

    :::image type="content" source="../../media/how-to/inference/serverless-endpoint-url-keys.png" alt-text="Screenshot of the option to copy endpoint URI and keys from an endpoint." lightbox="../../media/how-to/inference/serverless-endpoint-url-keys.png":::

> [!TIP]
> If your model was deployed with Microsoft Entra ID support, you don't need a key.

In this scenario, we placed both the endpoint URL and key in the following environment variables:

```bash
export AZURE_AI_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
export AZURE_AI_INFERENCE_API_KEY="<your-key-goes-here>"
```

Once configured, create a client to connect to the endpoint:

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")
```

> [!TIP]
> The client automatically reads the environment variables `AZURE_AI_INFERENCE_ENDPOINT` and `AZURE_AI_INFERENCE_API_KEY` to connect to the model. However, you can also pass the endpoint and key directly to the client via the `endpoint` and `api_key` parameters on the constructor.
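
For example, a minimal sketch that passes the values explicitly instead of relying on environment variables (the placeholder values are illustrative):

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(
    ai_model_id="<deployment-name>",
    endpoint="<your-model-endpoint-goes-here>",  # illustrative placeholder
    api_key="<your-key-goes-here>",              # illustrative placeholder
)
```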

Alternatively, if your endpoint supports Microsoft Entra ID, you can use the following code to create the client:

```bash
export AZURE_AI_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
```

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")
```

> [!NOTE]
> When using Microsoft Entra ID, make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.

### Azure OpenAI models

If you're using an Azure OpenAI model, you can use the following code to create the client:

```python
from azure.ai.inference.aio import ChatCompletionsClient
from azure.identity.aio import DefaultAzureCredential

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(
    ai_model_id="<deployment-name>",
    client=ChatCompletionsClient(
        endpoint=f"{str(<your-azure-open-ai-endpoint>).strip('/')}/openai/deployments/{<deployment_name>}",
        credential=DefaultAzureCredential(),
        credential_scopes=["https://cognitiveservices.azure.com/.default"],
    ),
)
```

## Inference parameters

You can configure how inference is performed by using the `AzureAIInferenceChatPromptExecutionSettings` class:

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatPromptExecutionSettings

execution_settings = AzureAIInferenceChatPromptExecutionSettings(
    max_tokens=100,
    temperature=0.5,
    top_p=0.9,
    # extra_parameters={...},  # model-specific parameters
)
```
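
As a sketch of `extra_parameters`, assuming a Mistral deployment like the one from the prerequisites, you could pass Mistral's model-specific `safe_prompt` flag; check your model's documentation for the parameters it actually supports:

```python
execution_settings = AzureAIInferenceChatPromptExecutionSettings(
    max_tokens=100,
    # Assumption: safe_prompt is a Mistral-specific parameter. Other models
    # accept different extra parameters, or none at all.
    extra_parameters={"safe_prompt": True},
)
```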

## Calling the service

Let's first call the chat completion service with a simple chat history:

> [!TIP]
> Semantic Kernel is an asynchronous library, so you need to use the asyncio library to run the code.
>
> ```python
> import asyncio
>
> async def main():
>     ...
>
> if __name__ == "__main__":
>     asyncio.run(main())
> ```

```python
from semantic_kernel.contents.chat_history import ChatHistory

chat_history = ChatHistory()
chat_history.add_user_message("Hello, how are you?")

response = await chat_completion_service.get_chat_message_content(
    chat_history=chat_history,
    settings=execution_settings,
)
print(response)
```
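
If you're running the code outside an async context, a minimal end-to-end script might look like this (a sketch that combines the client, settings, and chat snippets shown above):

```python
import asyncio

from semantic_kernel.connectors.ai.azure_ai_inference import (
    AzureAIInferenceChatCompletion,
    AzureAIInferenceChatPromptExecutionSettings,
)
from semantic_kernel.contents.chat_history import ChatHistory


async def main():
    # Reads AZURE_AI_INFERENCE_ENDPOINT and AZURE_AI_INFERENCE_API_KEY
    # from the environment, as described earlier.
    chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")
    execution_settings = AzureAIInferenceChatPromptExecutionSettings(max_tokens=100)

    chat_history = ChatHistory()
    chat_history.add_user_message("Hello, how are you?")

    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(main())
```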

Alternatively, you can stream the response from the service:

```python
chat_history = ChatHistory()
chat_history.add_user_message("Hello, how are you?")

response = chat_completion_service.get_streaming_chat_message_content(
    chat_history=chat_history,
    settings=execution_settings,
)

chunks = []
async for chunk in response:
    chunks.append(chunk)
    print(chunk, end="")

full_response = sum(chunks[1:], chunks[0])
```

Streaming chunks support the `+` operator, so summing them reconstructs the full response as a single message.

### Create a long-running conversation

You can create a long-running conversation by using a loop:

```python
while True:
    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)
    chat_history.add_message(response)
    chat_history.add_user_message(input("User:> "))
```
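
The loop above runs until the process is interrupted. As a sketch, you could add a simple exit condition; the `exit` keyword check here is an assumption, not part of the original sample:

```python
while True:
    user_input = input("User:> ")
    if user_input.strip().lower() == "exit":  # assumption: typing "exit" ends the chat
        break
    chat_history.add_user_message(user_input)
    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)
    chat_history.add_message(response)
```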

If you're streaming the response, you can use the following code:

```python
while True:
    response = chat_completion_service.get_streaming_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )

    chunks = []
    async for chunk in response:
        chunks.append(chunk)
        print(chunk, end="")

    full_response = sum(chunks[1:], chunks[0])
    chat_history.add_message(full_response)
    chat_history.add_user_message(input("User:> "))
```

## Use embeddings models

Configure your environment similarly to the previous steps, but use the `AzureAIInferenceTextEmbedding` class:

```python
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceTextEmbedding

embedding_generation_service = AzureAIInferenceTextEmbedding(ai_model_id="<deployment-name>")
```

The following code shows how to get embeddings from the service:

```python
embeddings = await embedding_generation_service.generate_embeddings(
    texts=["My favorite color is blue.", "I love to eat pizza."],
)

for embedding in embeddings:
    print(embedding)
```
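
As a usage sketch, you could compare the two embeddings above with cosine similarity. This assumes the returned embeddings are numeric vectors and uses numpy, which isn't part of the original sample:

```python
import numpy as np

# Assumption: each item in `embeddings` is a numeric vector.
a, b = (np.asarray(e, dtype=float) for e in embeddings)

# Cosine similarity: 1.0 means identical direction, 0.0 means orthogonal.
similarity = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"Cosine similarity: {similarity:.4f}")
```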

## Related content

- [How to get started with Azure AI SDKs](sdk-overview.md)
- [Reference for Semantic Kernel model integration](/semantic-kernel/concepts/ai-services/)

articles/ai-studio/toc.yml

Lines changed: 2 additions & 0 deletions

@@ -303,6 +303,8 @@ items:
   - name: Develop with LlamaIndex
     href: how-to/develop/llama-index.md
     displayName: code,sdk
+  - name: Develop with Semantic Kernel
+    href: how-to/develop/semantic-kernel.md
   - name: Trace generative AI apps
     items:
       - name: Tracing overview
