---
title: Evaluate your Generative AI application remotely on the cloud
titleSuffix: Azure AI project
description: This article provides instructions on how to evaluate a Generative AI application on the cloud.
manager: scottpolly
ms.service: azure-ai-studio
ms.custom:
- build-2024
- references_regions
- ignite-2024
ms.topic: how-to
ms.date: 12/18/2024
ms.reviewer: changliu2
ms.author: lagayhar
author: lgayhardt
---
# Cloud Evaluation: Evaluate your Generative AI application remotely on the cloud

[!INCLUDE [feature-preview](../../includes/feature-preview.md)]

While the Azure AI Evaluation client SDK supports running evaluations locally on your own machine, you may want to delegate the job to the cloud. For example, after you've run local evaluations on small test data to help assess your generative AI application prototypes, you move into pre-deployment testing and need to run evaluations on a large dataset. Cloud evaluation frees you from managing your local compute infrastructure, and enables you to integrate evaluations as tests into your CI/CD pipelines. After deployment, you may want to [continuously evaluate](https://aka.ms/GenAIMonitoringDoc) your applications for post-deployment monitoring.

In this article, you learn how to run a cloud evaluation on a test dataset during pre-deployment testing. Using the Azure AI Projects SDK, your evaluation results are automatically logged into your Azure AI project for better observability. This feature supports all Microsoft-curated [built-in evaluators](./evaluate-sdk.md#built-in-evaluators) and your own [custom evaluators](./evaluate-sdk.md#custom-evaluators), which can be located in the [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) of your project.

### Prerequisites

- An Azure AI project in the same [regions](#region-support) as risk and safety evaluators (preview). If you don't have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one.
- An Azure OpenAI deployment with a GPT model supporting `chat completion`, for example `gpt-4`.
- A `Connection String` for your Azure AI project to easily create an `AIProjectClient` object. You can get the **Project connection string** under **Project details** from the project's **Overview** page.
- Make sure you're first logged into your Azure subscription by running `az login`.

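The code snippets later in this article read your Azure OpenAI deployment settings from environment variables. One way to set them in a shell session before running the snippets (the values shown here are placeholders for your own deployment name and API version):

```shell
# Placeholder values -- substitute your own deployment name and API version
export AZURE_OPENAI_DEPLOYMENT="gpt-4"
export AZURE_OPENAI_API_VERSION="2024-06-01"
```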
### Installation Instructions

1. Create a **virtual Python environment of your choice**. To create one using conda, run the following command:

    ```bash
    conda create -n cloud-evaluation
    conda activate cloud-evaluation
    ```

2. Install the required packages by running the following command:

    ```bash
    pip install azure-identity azure-ai-projects azure-ai-ml
    ```

    Optionally, you can `pip install azure-ai-evaluation` if you want a code-first experience to fetch evaluator IDs for built-in evaluators in code.

Now you can define a client and a deployment, which will be used to run your evaluations in the cloud:

```python
import os, time
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from azure.ai.projects.models import Evaluation, Dataset, EvaluatorConfiguration, ConnectionType
from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, ViolenceEvaluator

# Load your Azure OpenAI config
deployment_name = os.environ.get("AZURE_OPENAI_DEPLOYMENT")
api_version = os.environ.get("AZURE_OPENAI_API_VERSION")

# Create an Azure AI client from a connection string. Available on the Azure AI project Overview page.
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str="<connection_string>"
)
```

### Uploading evaluation data

We provide two ways to register the data required for cloud evaluations in your Azure AI project:

1. **From SDK**: Upload new data from your local directory to your Azure AI project in the SDK, and fetch the dataset ID as a result:

    ```python
    data_id, _ = project_client.upload_file("./evaluate_test_data.jsonl")
    ```

    **From UI**: Alternatively, you can upload new data or update existing data versions by following the UI walkthrough under the **Data** tab of your Azure AI project.

2. Given existing datasets uploaded to your project:

    - **From SDK**: If you already know the dataset name you created, construct the dataset ID in this format: `/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<project-name>/data/<dataset-name>/versions/<version-number>`

    - **From UI**: If you don't know the dataset name, locate it under the **Data** tab of your Azure AI project and construct the dataset ID in the format above.

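The dataset ID format above can be assembled with a simple f-string. The subscription, resource group, project, and dataset names below are placeholders for your own values:

```python
# Placeholder values -- replace with your own
subscription_id = "00000000-0000-0000-0000-000000000000"
resource_group = "my-resource-group"
project_name = "my-ai-project"
dataset_name = "evaluate_test_data"
version = 1

# Assemble the dataset ID in the documented format
data_id = (
    f"/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}"
    f"/providers/Microsoft.MachineLearningServices/workspaces/{project_name}"
    f"/data/{dataset_name}/versions/{version}"
)
```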
### Specifying evaluators from Evaluator library

We provide a list of built-in evaluators registered in the [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under the **Evaluation** tab of your Azure AI project. You can also register custom evaluators and use them for cloud evaluation. We provide two ways to specify registered evaluators:

#### Specifying built-in evaluators

- **From SDK**: Use the built-in evaluator `id` property supported by the `azure-ai-evaluation` SDK:

    ```python
    from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, ViolenceEvaluator
    print("F1 Score evaluator id:", F1ScoreEvaluator.id)
    ```

- **From UI**: Follow these steps to fetch evaluator IDs after they're registered to your project:
    - Select the **Evaluation** tab in your Azure AI project.
    - Select the Evaluator library.
    - Select your evaluators of choice by comparing the descriptions.
    - Copy the evaluator's "Asset ID", which is your evaluator ID, for example, `azureml://registries/azureml/models/Groundedness-Evaluator/versions/1`.

#### Specifying custom evaluators

- For code-based custom evaluators, register them to your Azure AI project and fetch the evaluator IDs as in this example:

    ```python
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Model
    from promptflow.client import PFClient


    # Define ml_client to register the custom evaluator
    ml_client = MLClient(
        subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
        resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
        workspace_name=os.environ["AZURE_PROJECT_NAME"],
        credential=DefaultAzureCredential()
    )


    # Load the evaluator from its module
    from answer_len.answer_length import AnswerLengthEvaluator

    # Then convert it to an evaluation flow and save it locally
    pf_client = PFClient()
    local_path = "answer_len_local"
    pf_client.flows.save(entry=AnswerLengthEvaluator, path=local_path)

    # Specify the evaluator name to appear in the Evaluator library
    evaluator_name = "AnswerLenEvaluator"

    # Finally register the evaluator to the Evaluator library
    custom_evaluator = Model(
        path=local_path,
        name=evaluator_name,
        description="Evaluator calculating answer length.",
    )
    registered_evaluator = ml_client.evaluators.create_or_update(custom_evaluator)
    print("Registered evaluator id:", registered_evaluator.id)
    # Registered evaluators have versioning. You can always reference any available version.
    versioned_evaluator = ml_client.evaluators.get(evaluator_name, version=1)
    print("Versioned evaluator id:", versioned_evaluator.id)
    ```

    After registering your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under the **Evaluation** tab of your Azure AI project.

- For prompt-based custom evaluators, use this snippet to register them. For example, let's register our `FriendlinessEvaluator` built as described in [Prompt-based evaluators](./evaluate-sdk.md#prompt-based-evaluators):

    ```python
    # Import your prompt-based custom evaluator
    from friendliness.friend import FriendlinessEvaluator

    # Define your deployment
    model_config = dict(
        azure_endpoint=os.environ.get("AZURE_ENDPOINT"),
        azure_deployment=os.environ.get("AZURE_DEPLOYMENT_NAME"),
        api_version=os.environ.get("AZURE_API_VERSION"),
        api_key=os.environ.get("AZURE_API_KEY"),
        type="azure_openai"
    )

    # Define ml_client to register the custom evaluator
    ml_client = MLClient(
        subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
        resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
        workspace_name=os.environ["AZURE_PROJECT_NAME"],
        credential=DefaultAzureCredential()
    )

    # Convert the evaluator to an evaluation flow and save it locally
    local_path = "friendliness_local"
    pf_client = PFClient()
    pf_client.flows.save(entry=FriendlinessEvaluator, path=local_path)

    # Specify the evaluator name to appear in the Evaluator library
    evaluator_name = "FriendlinessEvaluator"

    # Register the evaluator to the Evaluator library
    custom_evaluator = Model(
        path=local_path,
        name=evaluator_name,
        description="Prompt-based evaluator measuring response friendliness.",
    )
    registered_evaluator = ml_client.evaluators.create_or_update(custom_evaluator)
    print("Registered evaluator id:", registered_evaluator.id)
    # Registered evaluators have versioning. You can always reference any available version.
    versioned_evaluator = ml_client.evaluators.get(evaluator_name, version=1)
    print("Versioned evaluator id:", versioned_evaluator.id)
    ```

    After registering your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under the **Evaluation** tab of your Azure AI project.

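For illustration, the `AnswerLengthEvaluator` imported in the code-based example above could be as simple as the following callable class. This is a hypothetical sketch of what a module like `answer_len/answer_length.py` might contain; see [custom evaluators](./evaluate-sdk.md#custom-evaluators) for the full pattern:

```python
class AnswerLengthEvaluator:
    """Hypothetical code-based evaluator that reports the length of an answer."""

    def __init__(self):
        pass

    def __call__(self, *, answer: str, **kwargs):
        # Evaluators return a dictionary of metrics keyed by metric name
        return {"answer_length": len(answer)}
```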
### Cloud evaluation (preview) with Azure AI Projects SDK

Given the steps above, you can now submit a cloud evaluation with the Azure AI Projects SDK via a Python API. See the following example, which specifies an NLP evaluator (F1 score), an AI-assisted quality evaluator (Relevance), a safety evaluator (Violence), and a custom evaluator (Friendliness) with their [evaluator IDs](#specifying-evaluators-from-evaluator-library):

```python
import os, time
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from azure.ai.projects.models import Evaluation, Dataset, EvaluatorConfiguration, ConnectionType
from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, ViolenceEvaluator

# Load your Azure OpenAI config
deployment_name = os.environ.get("AZURE_OPENAI_DEPLOYMENT")
api_version = os.environ.get("AZURE_OPENAI_API_VERSION")

# Create an Azure AI client from a connection string. Available on the project Overview page in the Azure AI project UI.
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str="<connection_string>"
)

# Construct the dataset ID per the instructions above
data_id = "<dataset-id>"

default_connection = project_client.connections.get_default(connection_type=ConnectionType.AZURE_OPEN_AI)

# Use the same model_config for your evaluator (or use different ones if needed)
model_config = default_connection.to_evaluator_model_config(deployment_name=deployment_name, api_version=api_version)

# Create an evaluation
evaluation = Evaluation(
    display_name="Cloud evaluation",
    description="Evaluation of dataset",
    data=Dataset(id=data_id),
    evaluators={
        # Note: the evaluator configuration key must follow a naming convention:
        # the string must start with a letter and contain only alphanumeric characters
        # and underscores. Taking "f1_score" as an example, "f1score" or "f1_evaluator"
        # is also acceptable, but "f1-score-eval" or "1score" results in errors.
        "f1_score": EvaluatorConfiguration(
            id=F1ScoreEvaluator.id,
        ),
        "relevance": EvaluatorConfiguration(
            id=RelevanceEvaluator.id,
            init_params={
                "model_config": model_config
            },
        ),
        "violence": EvaluatorConfiguration(
            id=ViolenceEvaluator.id,
            init_params={
                "azure_ai_project": project_client.scope
            },
        ),
        "friendliness": EvaluatorConfiguration(
            id="<custom_evaluator_id>",
            init_params={
                "model_config": model_config
            }
        )
    },
)

# Create the evaluation
evaluation_response = project_client.evaluations.create(
    evaluation=evaluation,
)

# Get the evaluation
get_evaluation_response = project_client.evaluations.get(evaluation_response.id)

print("----------------------------------------------------------------")
print("Created evaluation, evaluation ID: ", get_evaluation_response.id)
print("Evaluation status: ", get_evaluation_response.status)
print("AI project URI: ", get_evaluation_response.properties["AiStudioEvaluationUri"])
print("----------------------------------------------------------------")
```
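Cloud evaluation runs are asynchronous, so the status printed above may still be in progress right after submission. A small polling helper can wait for a terminal state. The function below is a generic sketch (not part of the SDK): it accepts any status-returning callable, such as `lambda: project_client.evaluations.get(evaluation_response.id).status`, and the terminal status names shown are assumptions:

```python
import time

def wait_for_terminal(get_status, terminal=("Completed", "Failed", "Canceled"),
                      interval_s=30.0, timeout_s=3600.0):
    """Poll get_status() until it returns a terminal value or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    status = get_status()
    while status not in terminal:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Evaluation still '{status}' after {timeout_s}s")
        time.sleep(interval_s)
        status = get_status()
    return status
```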
Now you can use the URI to view your evaluation results in your Azure AI project, to better assess the quality and safety performance of your applications.

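As a side note, the evaluator key naming convention described in the code comments earlier (start with a letter; only alphanumeric characters and underscores) can be checked up front with a small helper. The regular expression here is an assumption derived from the rule as stated, not an official SDK check:

```python
import re

# Assumed from the documented rule: a letter first, then letters, digits, or underscores
_KEY_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

def is_valid_evaluator_key(key: str) -> bool:
    """Return True if the evaluator configuration key follows the naming convention."""
    return bool(_KEY_PATTERN.match(key))
```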
## Related content

- [Evaluate your Generative AI applications locally](./evaluate-sdk.md)
- [Evaluate your Generative AI applications online](https://aka.ms/GenAIMonitoringDoc)
- [Learn more about simulating test datasets for evaluation](./simulator-interaction-data.md)
- [View your evaluation results in Azure AI project](../../how-to/evaluate-results.md)
- [Get started building a chat app using the Azure AI Foundry SDK](../../quickstarts/get-started-code.md)
- [Get started with evaluation samples](https://aka.ms/aistudio/eval-samples)
