
Commit ed8d297
Commit message: articles-about-evaluating-apps-with-Azure-AI-Evaluation-SDK
1 parent fc4a5de

File tree: 7 files changed (+390, -379 lines)

articles/ai-foundry/how-to/develop/agent-evaluate-sdk.md

Lines changed: 65 additions & 59 deletions
Large diffs are not rendered by default.

articles/ai-foundry/how-to/develop/cloud-evaluation.md

Lines changed: 32 additions & 33 deletions
@@ -1,7 +1,7 @@
 ---
-title: Cloud evaluation with Azure AI Foundry SDK
+title: Cloud Evaluation with the Azure AI Foundry SDK
 titleSuffix: Azure AI Foundry
-description: This article provides instructions on how to evaluate a Generative AI application on the cloud.
+description: This article provides instructions on how to evaluate a generative AI application in the cloud.
 manager: scottpolly
 ms.service: azure-ai-foundry
 ms.custom:
@@ -13,54 +13,53 @@ ms.reviewer: changliu2
 ms.author: lagayhar
 author: lgayhardt
 ---
-# Run evaluations in the cloud using Azure AI Foundry SDK (preview)
+# Run evaluations in the cloud by using the Azure AI Foundry SDK (preview)

 [!INCLUDE [feature-preview](../../includes/feature-preview.md)]

-While Azure AI Evaluation SDK supports running evaluations locally on your own machine, you might want to delegate the job remotely to the cloud. For example, after you ran local evaluations on small test data to help assess your generative AI application prototypes, now you move into pre-deployment testing and need run evaluations on a large dataset. Cloud evaluation frees you from managing your local compute infrastructure, and enables you to integrate evaluations as tests into your CI/CD pipelines. After deployment, you might want to [continuously evaluate](../online-evaluation.md) your applications for post-deployment monitoring.
+The Azure AI Evaluation SDK supports running evaluations locally on your own machine and also in the cloud. For example, after you ran local evaluations on small test data to help assess your generative AI application prototypes, you can move into pre-deployment testing and run evaluations on a large dataset. Evaluating your applications in the cloud frees you from managing your local compute infrastructure, and enables you to integrate evaluations as tests into your continuous integration and continuous delivery (CI/CD) pipelines. After deployment, you can choose to [continuously evaluate](../online-evaluation.md) your applications for post-deployment monitoring.

-In this article, you learn how to run evaluations in the cloud (preview) in pre-deployment testing on a test dataset. Using the Azure AI Projects SDK, you'll have evaluation results automatically logged into your Azure AI project for better observability. This feature supports all Microsoft curated [built-in evaluators](../../concepts/observability.md#what-are-evaluators) and your own [custom evaluators](../../concepts/evaluation-evaluators/custom-evaluators.md) which can be located in the [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) and have the same project-scope RBAC.
+In this article, you learn how to run evaluations in the cloud (preview) in pre-deployment testing on a test dataset. When you use the Azure AI Projects SDK, evaluation results are automatically logged into your Azure AI project for better observability. This feature supports all Microsoft curated [built-in evaluators](../../concepts/observability.md#what-are-evaluators) and your own [custom evaluators](../../concepts/evaluation-evaluators/custom-evaluators.md). Your evaluators can be located in the [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) and have the same project-scope role-based access control (RBAC).

-## Prerequisite set up steps for Azure AI Foundry Projects
+## Prerequisites

-- Azure AI Foundry project in the same supported [regions](../../concepts/evaluation-evaluators/risk-safety-evaluators.md#azure-ai-foundry-project-configuration-and-region-support) as risk and safety evaluators (preview). If you don't have an existing project, follow the guide [How to create Azure AI Foundry project](../create-projects.md?tabs=ai-studio) to create one.
-
-- Azure OpenAI Deployment with GPT model supporting `chat completion`, for example `gpt-4`.
+- Azure AI Foundry project in the same supported [regions](../../concepts/evaluation-evaluators/risk-safety-evaluators.md#azure-ai-foundry-project-configuration-and-region-support) as risk and safety evaluators (preview). If you don't have an existing project, create one by following the guide [How to create Azure AI Foundry project](../create-projects.md?tabs=ai-studio).
+- Azure OpenAI Deployment with GPT model supporting `chat completion`. For example, `gpt-4`.
 - Make sure you're first logged into your Azure subscription by running `az login`.

-If this is your first time running evaluations and logging it to your Azure AI Foundry project, you might need to do a few additional setup steps.
+If this is your first time running evaluations and logging it to your Azure AI Foundry project, you might need to do a few additional steps.

-1. [Create and connect your storage account](https://github.com/azure-ai-foundry/foundry-samples/blob/main/samples/microsoft/infrastructure-setup/01-connections/connection-storage-account.bicep) to your Azure AI Foundry project at the resource level. This bicep template provisions and connects a storage account to your Foundry project with key authentication.
+1. [Create and connect your storage account](https://github.com/azure-ai-foundry/foundry-samples/blob/main/samples/microsoft/infrastructure-setup/01-connections/connection-storage-account.bicep) to your Azure AI Foundry project at the resource level. The [bicep template](https://github.com/azure-ai-foundry/foundry-samples/blob/main/samples/microsoft/infrastructure-setup/01-connections/connection-storage-account.bicep) provisions and connects a storage account to your Foundry project by using key authentication.
 2. Make sure the connected storage account has access to all projects.
-3. If you connected your storage account with Microsoft Entra ID, make sure to give MSI (Microsoft Identity) permissions for Storage Blob Data Owner to both your account and Foundry project resource in Azure portal.
+3. If you connected your storage account with Microsoft Entra ID, make sure to give MSI (Microsoft Identity) **Storage Blob Data Owner** permissions to both your account and the Foundry project resource in the Azure portal.

-### Getting started
+### Get started

-First, install Azure AI Foundry SDK's project client which runs the evaluations in the cloud
+1. Install the Azure AI Foundry SDK project client that runs the evaluations in the cloud.

 ```python
 uv install azure-ai-projects azure-identity
 ```

 > [!NOTE]
-> For more detailed information, see the [REST API Reference Documentation](/rest/api/aifoundry/aiprojects/evaluations).
-Then, set your environment variables for your Azure AI Foundry resources
+> For more detailed information, see [REST API Reference Documentation](/rest/api/aifoundry/aiprojects/evaluations).
+
+1. Set your environment variables for your Azure AI Foundry resources

 ```python
 import os

 # Required environment variables
 endpoint = os.environ["PROJECT_ENDPOINT"] # https://<account>.services.ai.azure.com/api/projects/<project>
 model_endpoint = os.environ["MODEL_ENDPOINT"] # https://<account>.services.ai.azure.com
-model_api_key = os.environ["MODEL_API_KEY"]
-model_deployment_name = os.environ["MODEL_DEPLOYMENT_NAME"] # e.g. gpt-4o-mini
+model_api_key = os.environ["MODEL_API_KEY"]

 # Optional – reuse an existing dataset
 dataset_name = os.environ.get("DATASET_NAME", "dataset-test")
 dataset_version = os.environ.get("DATASET_VERSION", "1.0")
 ```

-Now you can define a client which is used to run your evaluations in the cloud:
+Now, you can define a client that runs your evaluations in the cloud:

 ```python
 import os
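The environment-variable snippet above is shown only as diff context here. As a self-contained illustration of how those variables fit together, the sketch below runs without a real Azure resource; the endpoint and key values are hypothetical placeholders, and the last-path-segment parsing is an illustrative convention, not part of the article.

```python
import os

# Hypothetical placeholder values so the sketch runs without a real Azure project.
os.environ.setdefault("PROJECT_ENDPOINT", "https://myaccount.services.ai.azure.com/api/projects/myproject")
os.environ.setdefault("MODEL_ENDPOINT", "https://myaccount.services.ai.azure.com")
os.environ.setdefault("MODEL_API_KEY", "placeholder-key")

# Required environment variables, read the same way as in the article's snippet
endpoint = os.environ["PROJECT_ENDPOINT"]
model_endpoint = os.environ["MODEL_ENDPOINT"]
model_api_key = os.environ["MODEL_API_KEY"]

# Optional – reuse an existing dataset; falls back to defaults when unset
dataset_name = os.environ.get("DATASET_NAME", "dataset-test")
dataset_version = os.environ.get("DATASET_VERSION", "1.0")

# The project name is the last path segment of the project endpoint
project_name = endpoint.rsplit("/", 1)[-1]
print(project_name, dataset_name, dataset_version)
```

Because the real values live in your shell environment or CI/CD secrets, a missing required variable fails fast with a `KeyError`, while the optional dataset variables degrade to defaults.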
@@ -74,22 +73,22 @@ project_client = AIProjectClient(
 )
 ```

-## Uploading evaluation data
+## Upload evaluation data

 ```python
-# Upload a local jsonl file (skip if you already have a Dataset registered)
+# Upload a local JSONL file (skip if you already have a Dataset registered)
 data_id = project_client.datasets.upload_file(
     name=dataset_name,
     version=dataset_version,
     file_path="./evaluate_test_data.jsonl",
 ).id
 ```

-To learn more about input data formats for evaluating GenAI applications, see [single-turn data](./evaluate-sdk.md#single-turn-support-for-text), [conversation data](./evaluate-sdk.md#conversation-support-for-text), and [conversation data for images and multi-modalities](./evaluate-sdk.md#conversation-support-for-images-and-multi-modal-text-and-image).
+To learn more about input data formats for evaluating generative AI applications, see [Single-turn data](./evaluate-sdk.md#single-turn-support-for-text), [Conversation data](./evaluate-sdk.md#conversation-support-for-text), and [Conversation data for images and multi-modalities](./evaluate-sdk.md#conversation-support-for-images-and-multi-modal-text-and-image).

-To learn more about input data formats for evaluating agents, see [evaluating Azure AI agents](./agent-evaluate-sdk.md#evaluate-azure-ai-agents) and [evaluating other agents](./agent-evaluate-sdk.md#evaluating-other-agents).
+To learn more about input data formats for evaluating agents, see [Evaluating Azure AI agents](./agent-evaluate-sdk.md#evaluate-azure-ai-agents) and [Evaluating other agents](./agent-evaluate-sdk.md#evaluating-other-agents).

-## Specifying evaluators
+## Specify evaluators

 ```python
 from azure.ai.projects.models import (
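The `evaluate_test_data.jsonl` file referenced in the upload snippet above isn't shown in this diff. As a hedged sketch of the JSON Lines shape it would take, the example below writes two invented single-turn rows; the `query`/`response`/`ground_truth` field names follow the single-turn text format the article links to, and the rows themselves are made up for illustration.

```python
import json
import pathlib
import tempfile

# Two invented single-turn rows; a real dataset would hold your app's actual traffic.
rows = [
    {
        "query": "What is the capital of France?",
        "response": "Paris is the capital of France.",
        "ground_truth": "Paris.",
    },
    {
        "query": "Who wrote Hamlet?",
        "response": "Hamlet was written by William Shakespeare.",
        "ground_truth": "William Shakespeare.",
    },
]

# JSON Lines format: one JSON object per line, UTF-8, no enclosing array.
path = pathlib.Path(tempfile.mkdtemp()) / "evaluate_test_data.jsonl"
with path.open("w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

# Read it back to confirm the format round-trips line by line.
loaded = [json.loads(line) for line in path.read_text(encoding="utf-8").splitlines()]
```

A file in this shape is what `project_client.datasets.upload_file` would register as a dataset version.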
@@ -117,7 +116,7 @@ evaluators = {
 }
 ```

-## Submit evaluation in the cloud
+## Submit an evaluation in the cloud

 Finally, submit the remote evaluation run:

@@ -148,10 +147,10 @@ print("Created evaluation:", evaluation_response.name)
 print("Status:", evaluation_response.status)
 ```

-## Specifying custom evaluators
+## Specify custom evaluators

 > [!NOTE]
-> Azure AI Foundry Projects aren't supported for this feature. Use an Azure AI Hub Project instead.
+> Azure AI Foundry projects aren't supported for this feature. Use an Azure AI Foundry hub project instead.

 ### Code-based custom evaluators

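The evaluator definition under "Code-based custom evaluators" is elided from this diff. In the Azure AI Evaluation SDK, a code-based evaluator is essentially a callable that takes keyword arguments (such as `response`) and returns a dictionary of named metrics. A minimal hypothetical sketch, where the `AnswerLengthEvaluator` name and its word-count metric are invented for illustration:

```python
class AnswerLengthEvaluator:
    """Hypothetical code-based evaluator: reports the response length in words."""

    def __call__(self, *, response: str, **kwargs):
        # Evaluators return a dict of named metrics; the framework
        # aggregates these per-row results across the dataset.
        return {"answer_length": len(response.split())}


evaluator = AnswerLengthEvaluator()
result = evaluator(response="Paris is the capital of France.")
print(result)  # {'answer_length': 6}
```

A callable like this is what you would register with `ml_client.evaluators` (as in the snippets this diff touches) so it becomes versioned and visible in the Evaluator library.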
@@ -194,7 +193,7 @@ versioned_evaluator = ml_client.evaluators.get(evaluator_name, version=1)
 print("Versioned evaluator id:", registered_evaluator.id)
 ```

-After registering your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under **Evaluation** tab in your Azure AI project.
+After you register your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under the **Evaluation** tab in your Azure AI project.

 ### Prompt-based custom evaluators

@@ -242,13 +241,13 @@ versioned_evaluator = ml_client.evaluators.get(evaluator_name, version=1)
 print("Versioned evaluator id:", registered_evaluator.id)
 ```

-After logging your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under **Evaluation** tab of your Azure AI project.
+After you log your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under the **Evaluation** tab of your Azure AI project.

 ## Related content

-- [Evaluate your Generative AI applications locally](./evaluate-sdk.md)
-- [Evaluate your Generative AI applications online](https://aka.ms/GenAIMonitoringDoc)
+- [Evaluate your generative AI applications locally](./evaluate-sdk.md)
+- [Evaluate your generative AI applications online](https://aka.ms/GenAIMonitoringDoc)
 - [Learn more about simulating test datasets for evaluation](./simulator-interaction-data.md)
-- [View your evaluation results in Azure AI project](../../how-to/evaluate-results.md)
-- [Get started building a chat app using the Azure AI Foundry SDK](../../quickstarts/get-started-code.md)
+- [View your evaluation results in an Azure AI project](../../how-to/evaluate-results.md)
+- [Get started building a chat app by using the Azure AI Foundry SDK](../../quickstarts/get-started-code.md)
 - [Get started with evaluation samples](https://aka.ms/aistudio/eval-samples)

0 commit comments