Skip to content

Commit dba7302

Browse files
committed
feat: SDKs
1 parent 3009ab7 commit dba7302

File tree

3 files changed

+2282
-0
lines changed

3 files changed

+2282
-0
lines changed
Lines changed: 203 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,203 @@
1+
---
2+
title: Develop application with LlamaIndex and Azure AI studio
3+
titleSuffix: Azure AI Studio
4+
description: This article explains how to use LlamaIndex with models deployed in Azure AI studio to build advance intelligent applications
5+
manager: nitinme
6+
ms.service: azure-ai-studio
7+
ms.topic: how-to
8+
ms.date: 9/14/2024
9+
ms.reviewer: fasantia
10+
ms.author: eur
11+
author: eric-urban
12+
---
13+
14+
# Develop application with LlamaIndex and Azure AI studio
15+
16+
In this article, you learn how to use [`llama-index`](https://github.com/run-llama/llama_index) with models deployed from the Azure AI model catalog deployed to Azure AI studio.
17+
18+
## Prerequisites
19+
20+
To run this tutorial you need:
21+
22+
1. An [Azure subscription](https://azure.microsoft.com).
23+
2. An Azure AI hub resource as explained at [How to create and manage an Azure AI Studio hub](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/create-azure-ai-resource).
24+
3. A model supporting the [Azure AI model inference API](https://aka.ms/azureai/modelinference) deployed. In this example we use a `Mistral-Large` deployment, but use any model of your preference. For using embeddings capabilities in LlamaIndex, you need an embedding model like Cohere Embed V3.
25+
26+
* You can follow the instructions at [Deploy models as serverless APIs](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-serverless).
27+
28+
4. A Python environment.
29+
30+
31+
## Install dependencies
32+
33+
Ensure you have `llama-index` installed:
34+
35+
```bash
36+
pip install llama-index
37+
```
38+
39+
Models deployed to Azure AI studio or Azure Machine Learning can be used with LlamaIndex in two ways:
40+
41+
- **Using the Azure AI model inference API:** All models deployed to Azure AI studio and Azure Machine Learning support the Azure AI model inference API, which offers a common set of functionalities that can be used for most of the models in the catalog. The benefit of this API is that, since it's the same for all the models, changing from one to another is as simple as changing the model deployment being use. No further changes are required in the code. When working with `llama-index`, install the extensions `llama-index-llms-azure-inference` and `llama-index-embeddings-azure-inference`.
42+
43+
- **Using the model's provider specific API:** Some models, like OpenAI, Cohere, or Mistral, offer their own set of APIs and extensions for `llama-index`. Those extensions may include specific functionalities that the model support and hence are suitable if you want to exploit them. When working with `llama-index`, install the extension specific for the model you want to use, like `llama-index-llms-openai` or `llama-index-llms-cohere`.
44+
45+
46+
In this example, we are working with the Azure AI model inference API, hence we install the following packages:
47+
48+
```bash
49+
pip install -U llama-index-llms-azure-inference
50+
pip install -U llama-index-embeddings-azure-inference
51+
```
52+
53+
## Configure the environment
54+
55+
To use LLMs deployed in Azure AI studio you need the endpoint and credentials to connect to it. The parameter `model_name` is not required for endpoints serving a single model, like Managed Online Endpoints. Follow this steps to get the information you need from the model you want to use:
56+
57+
1. Go to the [Azure AI studio](https://ai.azure.com/).
58+
2. Go to deployments and select the model you have deployed as indicated in the prerequisites.
59+
3. Copy the endpoint URL and the key.
60+
61+
> [!TIP]
62+
> If your model was deployed with Microsoft Entra ID support, you don't need a key.
63+
64+
In this scenario, we have placed both the endpoint URL and key in the following environment variables:
65+
66+
```bash
67+
export AZURE_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
68+
export AZURE_INFERENCE_CREDENTIAL="<your-key-goes-here>"
69+
```
70+
71+
Once configured, create a client to connect to the endpoint:
72+
73+
```python
74+
import os
75+
from llama_index.llms.azure_inference import AzureAICompletionsModel
76+
77+
llm = AzureAICompletionsModel(
78+
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
79+
credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
80+
)
81+
```
82+
83+
Alternatively, if you endpoint support Microsoft Entra ID, you can use the following code to create the client:
84+
85+
```python
86+
from azure.identity import DefaultAzureCredential
87+
88+
llm = AzureAICompletionsModel(
89+
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
90+
credential=DefaultAzureCredential(),
91+
)
92+
```
93+
94+
> [!NOTE]
95+
> > Note: When using Microsoft Entra ID, make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.
96+
97+
If you are planning to use asynchronous calling, it's a best practice to use the asynchronous version for the credentials:
98+
99+
```python
100+
from azure.identity.aio import (
101+
DefaultAzureCredential as DefaultAzureCredentialAsync,
102+
)
103+
104+
llm = AzureAICompletionsModel(
105+
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
106+
credential=DefaultAzureCredentialAsync(),
107+
)
108+
```
109+
110+
### Inference parameters
111+
112+
You can configure how inference in performed for all the operations that are using this client by setting extra parameters. This helps avoid indicating them on each call you make to the model.
113+
114+
```python
115+
llm = AzureAICompletionsModel(
116+
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
117+
credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
118+
temperature=0.0,
119+
model_kwargs={"top_p": 1.0},
120+
)
121+
```
122+
123+
For parameters extra parameters that are not supported by the Azure AI model inference API but that are available in the underlying model, you can use the `model_extras` argument. In the following example, the parameter `safe_prompt`, only available for Mistral models, is being passed.
124+
125+
```python
126+
llm = AzureAICompletionsModel(
127+
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
128+
credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
129+
temperature=0.0,
130+
model_kwargs={"model_extras": {"safe_prompt": True}},
131+
)
132+
```
133+
134+
## Use LLMs models
135+
136+
Use the `chat` endpoint for chat instruction models. The `complete` method is still available for model of type `chat-completions`. On those cases, your input text is converted to a message with `role="user"`.
137+
138+
```python
139+
from llama_index.core.llms import ChatMessage
140+
141+
messages = [
142+
ChatMessage(
143+
role="system", content="You are a pirate with colorful personality."
144+
),
145+
ChatMessage(role="user", content="Hello"),
146+
]
147+
148+
response = llm.chat(messages)
149+
print(response)
150+
```
151+
152+
You can stream the outputs also:
153+
154+
```python
155+
response = llm.stream_chat(messages)
156+
for r in response:
157+
print(r.delta, end="")
158+
```
159+
160+
## Use embeddings models
161+
162+
In the same way you create an LLM client, you can connect to an embedding model. In the following example, we are setting again the environment variable to now point to an embeddings model:
163+
164+
```bash
165+
export AZURE_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
166+
export AZURE_INFERENCE_CREDENTIAL="<your-key-goes-here>"
167+
```
168+
169+
Then create the client:
170+
171+
```python
172+
from llama_index.embeddings.azure_inference import AzureAIEmbeddingsModel
173+
174+
embed_model = AzureAIEmbeddingsModel(
175+
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
176+
credential=os.environ['AZURE_INFERENCE_CREDENTIAL'],
177+
)
178+
```
179+
180+
## Configure the models used by your code
181+
182+
You can use the LLM or embeddings model client individually in the code you develop with LlamaIndex or you can configure the entire session using the `Settings` options. Configuring the session has the advantage that then all your code will use the same models for all the operations.
183+
184+
```python
185+
from llama_index.core import Settings
186+
187+
Settings.llm = llm
188+
Settings.embed_model = embed_model
189+
```
190+
191+
However, there are scenarios where you want to use a general model for most of the operations but an specific one for a given task. On those cases, it's useful to set the LLM or embedding model your are using for each LlamaIndex construct. In the following example, we set an specific model:
192+
193+
```python
194+
from llama_index.core.evaluation import RelevancyEvaluator
195+
196+
relevancy_evaluator = RelevancyEvaluator(llm=llm)
197+
```
198+
199+
In general, you will use a combination of both strategies.
200+
201+
## Related content
202+
203+
* [How to get started with Azure AI SDKs](sdk-overview.md)

0 commit comments

Comments
 (0)