Commit ac6fa20

Merge pull request #284 from santiagxf/santiagxf/llamaindex-sdk
LlamaIndex integration and Inference SDK
2 parents e06abf7 + 279f6b0 commit ac6fa20

File tree

4 files changed: +208 −1 lines changed
articles/ai-studio/how-to/develop/llama-index.md

Lines changed: 202 additions & 0 deletions

@@ -0,0 +1,202 @@

---
title: Develop applications with LlamaIndex and Azure AI Studio
titleSuffix: Azure AI Studio
description: This article explains how to use LlamaIndex with models deployed in Azure AI Studio to build advanced intelligent applications.
manager: nitinme
ms.service: azure-ai-studio
ms.topic: how-to
ms.date: 9/14/2024
ms.reviewer: fasantia
ms.author: eur
author: eric-urban
---

# Develop applications with LlamaIndex and Azure AI Studio

In this article, you learn how to use [LlamaIndex](https://github.com/run-llama/llama_index) with models from the Azure AI model catalog deployed to Azure AI Studio.

Models deployed to Azure AI Studio can be used with LlamaIndex in two ways:

- **Using the Azure AI model inference API:** All models deployed to Azure AI Studio support the [Azure AI model inference API](../../reference/reference-model-inference-api.md), which offers a common set of functionalities that can be used for most of the models in the catalog. The benefit of this API is that, since it's the same for all the models, changing from one to another is as simple as changing the model deployment being used. No further changes are required in the code. When working with LlamaIndex, install the extensions `llama-index-llms-azure-inference` and `llama-index-embeddings-azure-inference`.

- **Using the model provider's specific API:** Some models, like OpenAI, Cohere, or Mistral, offer their own set of APIs and extensions for LlamaIndex. Those extensions may include specific functionalities that the model supports, and hence are suitable if you want to exploit them. When working with `llama-index`, install the extension specific to the model you want to use, like `llama-index-llms-openai` or `llama-index-llms-cohere`.

In this example, we work with the **Azure AI model inference API**.

## Prerequisites

To run this tutorial, you need:

1. An [Azure subscription](https://azure.microsoft.com).
2. An Azure AI hub resource, as explained in [How to create and manage an Azure AI Studio hub](../create-azure-ai-resource.md).
3. A model deployment supporting the [Azure AI model inference API](https://aka.ms/azureai/modelinference). In this example, we use a `Mistral-Large` deployment, but you can use any model of your preference. To use embeddings capabilities in LlamaIndex, you need an embedding model like `cohere-embed-v3-multilingual`.

    * You can follow the instructions at [Deploy models as serverless APIs](../deploy-models-serverless.md).

4. Python 3.8 or later installed, including pip.
5. LlamaIndex installed. You can do it with:

    ```bash
    pip install llama-index
    ```

6. Since in this example we work with the Azure AI model inference API, we install the following packages:

    ```bash
    pip install -U llama-index-llms-azure-inference
    pip install -U llama-index-embeddings-azure-inference
    ```

## Configure the environment

To use LLMs deployed in Azure AI Studio, you need the endpoint and credentials to connect to it. The parameter `model_name` isn't required for endpoints serving a single model, like Managed Online Endpoints. Follow these steps to get the information you need from the model you want to use:

1. Go to the [Azure AI Studio](https://ai.azure.com/).
2. Go to deployments and select the model you deployed, as indicated in the prerequisites.
3. Copy the endpoint URL and the key.

:::image type="content" source="../../media/how-to/inference/serverless-endpoint-url-keys.png" alt-text="Screenshot of the option to copy endpoint URI and keys from an endpoint." lightbox="../../media/how-to/inference/serverless-endpoint-url-keys.png":::

> [!TIP]
> If your model was deployed with Microsoft Entra ID support, you don't need a key.

In this scenario, we placed both the endpoint URL and key in the following environment variables:

```bash
export AZURE_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
export AZURE_INFERENCE_CREDENTIAL="<your-key-goes-here>"
```

Once configured, create a client to connect to the endpoint:

```python
import os

from llama_index.llms.azure_inference import AzureAICompletionsModel

llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
)
```
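
The `model_name` parameter mentioned earlier applies when the endpoint serves more than one model. As a minimal sketch, assuming a deployment named `mistral-large` (a hypothetical name; use your own deployment name), you would pass it like this:

```python
# Hypothetical multi-model endpoint: select the deployment by name.
llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model_name="mistral-large",  # hypothetical deployment name
)
```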

Alternatively, if your endpoint supports Microsoft Entra ID, you can use the following code to create the client:

```python
from azure.identity import DefaultAzureCredential

llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=DefaultAzureCredential(),
)
```

> [!NOTE]
> When using Microsoft Entra ID, make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.

If you are planning to use asynchronous calling, it's a best practice to use the asynchronous version for the credentials:

```python
from azure.identity.aio import (
    DefaultAzureCredential as DefaultAzureCredentialAsync,
)

llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=DefaultAzureCredentialAsync(),
)
```
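
With the asynchronous credential in place, you can invoke the model asynchronously. As a minimal sketch, assuming the client above, LlamaIndex exposes `achat` as the asynchronous counterpart of `chat`:

```python
import asyncio

from llama_index.core.llms import ChatMessage


async def main():
    # achat is the asynchronous counterpart of chat
    response = await llm.achat([ChatMessage(role="user", content="Hello")])
    print(response)


asyncio.run(main())
```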

### Inference parameters

You can configure how inference is performed for all the operations that use this client by setting extra parameters. This avoids having to indicate them on each call you make to the model.

```python
llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    temperature=0.0,
    model_kwargs={"top_p": 1.0},
)
```

For parameters that aren't supported in the Azure AI model inference API ([reference](../../reference/reference-model-inference-chat-completions.md)) but are available in the underlying model, you can use the `model_extras` argument. In the following example, the parameter `safe_prompt`, only available for Mistral models, is being passed.

```python
llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    temperature=0.0,
    model_kwargs={"model_extras": {"safe_prompt": True}},
)
```

## Use LLMs

Use the `chat` endpoint for chat instruction models. The `complete` method is still available for models of type `chat-completions`. In those cases, your input text is converted to a message with `role="user"`.

```python
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality."
    ),
    ChatMessage(role="user", content="Hello"),
]

response = llm.chat(messages)
print(response)
```
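
Because the deployment is a `chat-completions` model, you can also call `complete` with plain text; the input is converted to a message with `role="user"` as described above:

```python
response = llm.complete("Hello")
print(response)
```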

You can also stream the outputs:

```python
response = llm.stream_chat(messages)
for r in response:
    print(r.delta, end="")
```

## Use embeddings models

In the same way you create an LLM client, you can connect to an embeddings model. In the following example, we set the environment variables again, now pointing to an embeddings model:

```bash
export AZURE_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
export AZURE_INFERENCE_CREDENTIAL="<your-key-goes-here>"
```

Then create the client:

```python
from llama_index.embeddings.azure_inference import AzureAIEmbeddingsModel

embed_model = AzureAIEmbeddingsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
)
```
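
As a quick sanity check, you can compute an embedding directly. The following minimal sketch uses LlamaIndex's standard embeddings interface; the sample sentence is arbitrary:

```python
# Returns a list of floats; its length is the embedding dimensionality.
embedding = embed_model.get_text_embedding("The quick brown fox jumps over the lazy dog.")
print(len(embedding))
```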

## Configure the models used by your code

You can use the LLM or embeddings model client individually in the code you develop with LlamaIndex, or you can configure the entire session using the `Settings` options. Configuring the session has the advantage that all your code then uses the same models for all the operations.

```python
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model
```

However, there are scenarios where you want to use a general model for most operations but a specific one for a given task. In those cases, it's useful to set the LLM or embeddings model for each LlamaIndex construct. In the following example, we set a specific model:

```python
from llama_index.core.evaluation import RelevancyEvaluator

relevancy_evaluator = RelevancyEvaluator(llm=llm)
```

In general, you use a combination of both strategies.
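
For example, the following hypothetical end-to-end sketch combines both strategies: it sets session-wide defaults with `Settings`, builds a small in-memory index, and queries it. The sample documents are invented for illustration, and the `llm` and `embed_model` clients are the ones created earlier:

```python
from llama_index.core import Document, Settings, VectorStoreIndex

# Session-wide defaults: the index uses embed_model to embed text
# and llm to synthesize answers to queries.
Settings.llm = llm
Settings.embed_model = embed_model

# Toy documents, invented for this example.
documents = [
    Document(text="LlamaIndex is a framework for building LLM applications."),
    Document(text="The Azure AI model inference API offers a common API for catalog models."),
]

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What is LlamaIndex?"))
```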

## Related content

* [How to get started with Azure AI SDKs](sdk-overview.md)

articles/ai-studio/how-to/develop/sdk-overview.md

Lines changed: 4 additions & 1 deletion
@@ -15,7 +15,7 @@ author: eric-urban

# Overview of the Azure AI SDKs

- Microsoft offers a variety of packages that you can use for building generative AI applications in the cloud. In most applications, you need to use a combination of packages to manage and use various Azure services that provide AI functionality. We also offer integrations with open-source libraries like LangChain and mlflow for use with Azure. In this article we'll give an overview of the main services and SDKs you can use with Azure AI Studio.
+ Microsoft offers a variety of packages that you can use for building generative AI applications in the cloud. In most applications, you need to use a combination of packages to manage and use various Azure services that provide AI functionality. We also offer integrations with open-source libraries like LangChain and MLflow for use with Azure. In this article we'll give an overview of the main services and SDKs you can use with Azure AI Studio.

For building generative AI applications, we recommend using the following services and SDKs:
* [Azure Machine Learning](/azure/machine-learning/overview-what-is-azure-machine-learning) for the hub and project infrastructure used in AI Studio to organize your work into projects, manage project artifacts (data, evaluation runs, traces), fine-tune & deploy models, and connect to external services and resources.
@@ -54,6 +54,9 @@ Azure AI services
Prompt flow
* [Prompt flow SDK](https://microsoft.github.io/promptflow/how-to-guides/quick-start.html)

+ Agentic frameworks:
+ * [LlamaIndex](llama-index.md)

## Related content

- [Get started building a chat app using the prompt flow SDK](../../quickstarts/get-started-code.md)
articles/ai-studio/media/how-to/inference/serverless-endpoint-url-keys.png

Binary file added (710 KB): screenshot of the endpoint URL and keys, referenced by the new article.

articles/ai-studio/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -252,6 +252,8 @@ items:
      href: how-to/develop/vscode.md
  - name: Start with an AI template
    href: how-to/develop/ai-template-get-started.md
+ - name: Develop with LlamaIndex and Azure AI Studio
+   href: how-to/develop/llama-index.md
  - name: Trace your application with prompt flow
    href: how-to/develop/trace-local-sdk.md
    displayName: code,sdk
