
Commit c344e8c

Merge pull request #1361 from sdgilley/sdg-release-update-code-qs-tutorial
switch to new evaluate code
2 parents: ae0fe11 + 9aeaa95

File tree: 2 files changed, +22 −26 lines

articles/ai-studio/includes/create-env-file-tutorial.md

Lines changed: 1 addition & 0 deletions
````diff
@@ -20,6 +20,7 @@ AISEARCH_INDEX_NAME="example-index"
 EMBEDDINGS_MODEL="text-embedding-ada-002"
 INTENT_MAPPING_MODEL="gpt-4o-mini"
 CHAT_MODEL="gpt-4o-mini"
+EVALUATION_MODEL="gpt-4o-mini"
 ```
 
 Find your connection string in the Azure AI Studio project you created in the [AI Studio playground quickstart](../quickstarts/get-started-playground.md). Open the project, then find the connection string on the **Overview** page. Copy the connection string and paste it into the `.env` file.
````
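For readers wiring this up: the tutorial's scripts read these values from the `.env` file, presumably with something like `python-dotenv`. A minimal sketch of that pattern, assuming the `python-dotenv` package and the key names shown in the diff above:

```python
# Minimal sketch: load the .env values configured above.
# Assumes the python-dotenv package; key names come from the tutorial's .env file.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

chat_model = os.environ["CHAT_MODEL"]              # e.g. "gpt-4o-mini"
evaluation_model = os.environ["EVALUATION_MODEL"]  # the key this commit adds

print(f"chat model: {chat_model}; evaluation model: {evaluation_model}")
```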

articles/ai-studio/tutorials/copilot-sdk-evaluate-deploy.md

Lines changed: 21 additions & 26 deletions
````diff
@@ -5,7 +5,7 @@ description: Evaluate and deploy a custom chat app with the prompt flow SDK. Thi
 manager: scottpolly
 ms.service: azure-ai-studio
 ms.topic: tutorial
-ms.date: 10/31/2024
+ms.date: 11/06/2024
 ms.reviewer: lebaro
 ms.author: sgilley
 author: sdgilley
````
````diff
@@ -14,7 +14,7 @@ author: sdgilley
 
 # Tutorial: Part 3 - Evaluate and deploy a custom chat application with the prompt flow SDK
 
-In this tutorial, you use the Azure AI SDK (and other libraries) to evaluate and deploy the chat app you built in [Part 1 of the tutorial series](copilot-sdk-build-rag.md). In this part three, you learn how to:
+In this tutorial, you use the Azure AI SDK (and other libraries) to evaluate and deploy the chat app you built in [Part 2 of the tutorial series](copilot-sdk-build-rag.md). In this part three, you learn how to:
 
 > [!div class="checklist"]
 > - Evaluate the quality of chat app responses
````
````diff
@@ -27,42 +27,31 @@ This tutorial is part three of a three-part tutorial.
 
 - Complete [part 2 of the tutorial series](copilot-sdk-build-rag.md) to build the chat application.
 
-- You must have the necessary permissions to add role assignments in your Azure subscription. Granting permissions by role assignment is only allowed by the **Owner** of the specific Azure resources. You might need to ask your Azure subscription owner (who might be your IT admin) for help with endpoint access later in the tutorial.
 
 ## <a name="evaluate"></a> Evaluate the quality of the chat app responses
 
 Now that you know your chat app responds well to your queries, including with chat history, it's time to evaluate how it does across a few different metrics and more data.
 
-You use the prompt flow evaluator with an evaluation dataset and the `get_chat_response()` target function, then assess the evaluation results.
+You use an evaluator with an evaluation dataset and the `get_chat_response()` target function, then assess the evaluation results.
 
 Once you run an evaluation, you can then make improvements to your logic, like improving your system prompt, and observing how the chat app responses change and improve.
 
-### Set your evaluation model
-
-Choose the evaluation model you want to use. It can be the same as a chat model you used to build the app. If you want a different model for evaluation, you need to deploy it, or specify it if it already exists. For example, you might be using `gpt-35-turbo` for your chat completions, but want to use `gpt-4` for evaluation since it might perform better.
-
-Add your evaluation model name in your **.env** file:
-
-```env
-AZURE_OPENAI_EVALUATION_DEPLOYMENT=<your evaluation model deployment name>
-```
-
 ### Create evaluation dataset
 
-Use the following evaluation dataset, which contains example questions and expected answers (truth).
+Use the following evaluation dataset, which contains example questions and expected answers (truth).
 
-1. Create a file called **eval_dataset.jsonl** in your **rag-tutorial** folder. See the [application code structure](copilot-sdk-build-rag.md) for reference.
+1. Create a file called **chat_eval_data.jsonl** in your **assets** folder.
 1. Paste this dataset into the file:
 
-    :::code language="jsonl" source="~/rag-data-openai-python-promptflow-main/tutorial/eval_dataset.jsonl":::
+    :::code language="jsonl" source="~/azureai-samples-nov2024/scenarios/rag/custom-rag-app/assets/chat_eval_data.jsonl":::
 
-### Evaluate with prompt flow evaluators
+### Evaluate with Azure AI evaluators
 
 Now define an evaluation script that will:
 
-- Import the `evaluate` function and evaluators from the Prompt flow `evals` package.
-- Load the sample `.jsonl` dataset.
+
 - Generate a target function wrapper around our chat app logic.
+- Load the sample `.jsonl` dataset.
 - Run the evaluation, which takes the target function, and merges the evaluation dataset with the responses from the chat app.
 - Generate a set of GPT-assisted metrics (relevance, groundedness, and coherence) to evaluate the quality of the chat app responses.
 - Output the results locally, and logs the results to the cloud project.
````
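The dataset rows themselves live behind the `:::code` include above and aren't reproduced in this diff. For a sense of the JSONL shape, here's a hypothetical sketch that writes two question/truth pairs; the `query`/`truth` field names and the example products are assumptions, not the sample's actual rows:

```python
# Hypothetical sketch: write question/truth pairs as JSONL (one object per line).
# Field names are assumptions; the sample's real rows are in chat_eval_data.jsonl.
import json

rows = [
    {"query": "Which tent is the most waterproof?",
     "truth": "The Alpine Explorer Tent has the highest waterproof rating."},
    {"query": "What is the weight of the Adventurer Pro backpack?",
     "truth": "The Adventurer Pro backpack weighs 2.5 pounds."},
]

# Assumes the assets folder from the tutorial already exists.
with open("assets/chat_eval_data.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```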
````diff
@@ -74,10 +63,16 @@ The script also logs the evaluation results to the cloud project so that you can
 1. Create a file called **evaluate.py** in your **rag-tutorial** folder.
 1. Add the following code. Update the `dataset_path` and `evaluation_name` to fit your use case.
 
-    :::code language="python" source="~/rag-data-openai-python-promptflow-main/tutorial/evaluate.py":::
+    :::code language="python" source="~/azureai-samples-nov2024/scenarios/rag/custom-rag-app/evaluate.py":::
 
 The main function at the end allows you to view the evaluation result locally, and gives you a link to the evaluation results in AI Studio.
 
+### Create helper script
+
+The evaluation script uses a helper script to define the target function and run the evaluation. Create a file called **config.py** in your main folder. Add the following code:
+
+:::code language="python" source="~/azureai-samples-nov2024/scenarios/rag/custom-rag-app/config.py":::
+
 ### Run the evaluation script
 
 1. From your console, sign in to your Azure account with the Azure CLI:
````
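Neither **evaluate.py** nor **config.py** is shown inline; both are pulled in by `:::code` includes. As a rough sketch of the shape an `azure-ai-evaluation`-based script like this can take (the target function body and the endpoint/key variable names below are assumptions, not the sample's code):

```python
# Hedged sketch of an evaluation script; the real code lives at the :::code
# sources referenced in the diff. Names below are illustrative assumptions.
import os

from azure.ai.evaluation import (
    CoherenceEvaluator,
    GroundednessEvaluator,
    RelevanceEvaluator,
    evaluate,
)


def get_chat_response(query: str) -> dict:
    """Stand-in for the chat app's target function built in part 2."""
    return {"response": f"(placeholder answer for: {query})", "context": ""}


# GPT-assisted evaluators need a judge model; this reuses the EVALUATION_MODEL
# deployment from the .env file. Endpoint/key variable names are assumptions.
model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ["EVALUATION_MODEL"],
}

result = evaluate(
    data="assets/chat_eval_data.jsonl",  # the dataset created above
    target=get_chat_response,            # wrapper around the chat app logic
    evaluators={
        "relevance": RelevanceEvaluator(model_config),
        "groundedness": GroundednessEvaluator(model_config),
        "coherence": CoherenceEvaluator(model_config),
    },
    output_path="./eval_result.json",    # also keeps a local copy
)
print(result["metrics"])
```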
````diff
@@ -89,8 +84,7 @@ The main function at the end allows you to view the evaluation result locally, a
 1. Install the required packages:
 
    ```bash
-   pip install promptflow-evals
-   pip install promptflow-azure
+   pip install azure-ai-evaluation[remote]
    ```
 
 1. Now run the evaluation script:
````
````diff
@@ -99,8 +93,6 @@ The main function at the end allows you to view the evaluation result locally, a
    python evaluate.py
    ```
 
-For more information about using the prompt flow SDK for evaluation, see [Evaluate with the prompt flow SDK](../how-to/develop/evaluate-sdk.md).
-
 ### Interpret the evaluation output
 
 In the console output, you see for each question an answer and the summarized metrics in this nice table format. (You might see different columns in your output.)
````
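To slice the metrics programmatically instead of reading the console table, you can load whatever the script saved locally; a small sketch, assuming the results were written as JSON with aggregate scores under a `metrics` key:

```python
# Sketch: inspect saved evaluation output. The file name and the "metrics"
# key are assumptions about the output shape, not a documented contract here.
import json

with open("eval_result.json") as f:
    result = json.load(f)

for name, score in result["metrics"].items():
    print(f"{name}: {score}")
```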
````diff
@@ -148,6 +140,9 @@ For more information about evaluation results in AI Studio, see [How to view eva
 
 Now that you verified your chat app behaves as expected, you're ready to deploy your application.
 
+> [!NOTE]
+> The rest of this tutorial is the old version; nothing else has been updated yet. Stop here for now.
+
 ## <a name="deploy"></a>Deploy the chat app to Azure
 
 Now let's go ahead and deploy this chat app to a managed endpoint so that it can be consumed by an external application or website.
````
````diff
@@ -182,7 +177,7 @@ As part of creating the deployment, your **copilot_flow** folder is packaged as
 > [!IMPORTANT]
 > Deploying your application to a managed endpoint in Azure has associated compute cost based on the instance type you choose. Make sure you are aware of the associated cost and have quota for the instance type you specify. Learn more about [online endpoints](/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).
 
-Create the file **deploy.py** in the **rag-tutorial** folder. Add the following code:
+Create the file **deploy.py** in the main folder. Add the following code:
 
 :::code language="python" source="~/rag-data-openai-python-promptflow-main/tutorial/deploy.py" id="deploy":::
 
````
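The **deploy.py** contents are likewise behind a `:::code` include. For orientation only, a minimal managed-online-endpoint sketch with the `azure-ai-ml` package; all resource names are placeholders, and this omits the **copilot_flow** packaging the tutorial describes:

```python
# Hedged sketch: create a managed online endpoint with the azure-ai-ml SDK.
# All names/IDs are placeholders; the tutorial's deploy.py also packages the
# copilot_flow folder as a model, which this sketch does not do.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<project-name>",
)

endpoint = ManagedOnlineEndpoint(
    name="rag-copilot-endpoint",  # placeholder endpoint name
    auth_mode="key",
)

# Long-running operation; .result() blocks until provisioning completes.
ml_client.begin_create_or_update(endpoint).result()
print(ml_client.online_endpoints.get(endpoint.name).provisioning_state)
```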
