Find your connection string in the Azure AI Studio project you created in the [AI Studio playground quickstart](../quickstarts/get-started-playground.md). Open the project, then find the connection string on the **Overview** page. Copy the connection string and paste it into the `.env` file.
articles/ai-studio/tutorials/copilot-sdk-evaluate-deploy.md
Lines changed: 21 additions & 26 deletions
@@ -5,7 +5,7 @@ description: Evaluate and deploy a custom chat app with the prompt flow SDK. Thi
5
5
manager: scottpolly
6
6
ms.service: azure-ai-studio
7
7
ms.topic: tutorial
8
-
ms.date: 10/31/2024
8
+
ms.date: 11/06/2024
9
9
ms.reviewer: lebaro
10
10
ms.author: sgilley
11
11
author: sdgilley
@@ -14,7 +14,7 @@ author: sdgilley
14
14
15
15
# Tutorial: Part 3 - Evaluate and deploy a custom chat application with the prompt flow SDK
16
16
17
-
In this tutorial, you use the Azure AI SDK (and other libraries) to evaluate and deploy the chat app you built in [Part 1 of the tutorial series](copilot-sdk-build-rag.md). In this part three, you learn how to:
17
+
In this tutorial, you use the Azure AI SDK (and other libraries) to evaluate and deploy the chat app you built in [Part 2 of the tutorial series](copilot-sdk-build-rag.md). In this part three, you learn how to:
18
18
19
19
> [!div class="checklist"]
20
20
> - Evaluate the quality of chat app responses
@@ -27,42 +27,31 @@ This tutorial is part three of a three-part tutorial.
27
27
28
28
- Complete [part 2 of the tutorial series](copilot-sdk-build-rag.md) to build the chat application.
29
29
30
-
- You must have the necessary permissions to add role assignments in your Azure subscription. Granting permissions by role assignment is only allowed by the **Owner** of the specific Azure resources. You might need to ask your Azure subscription owner (who might be your IT admin) for help with endpoint access later in the tutorial.
31
30
32
31
## <a name="evaluate"></a> Evaluate the quality of the chat app responses
33
32
34
33
Now that you know your chat app responds well to your queries, including with chat history, it's time to evaluate how it does across a few different metrics and more data.
35
34
36
-
You use the prompt flow evaluator with an evaluation dataset and the `get_chat_response()` target function, then assess the evaluation results.
35
+
You use an evaluator with an evaluation dataset and the `get_chat_response()` target function, then assess the evaluation results.
37
36
38
37
After you run an evaluation, you can improve your logic, for example by refining your system prompt, and observe how the chat app responses change.
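The pattern described here (a dataset, a target function, and an evaluator) can be sketched in plain Python. This is only an illustration of the general shape, not the SDK's API: `get_chat_response_stub`, `exact_match`, `run_evaluation`, the `query`/`truth` field names, and the sample rows are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List


def get_chat_response_stub(query: str) -> str:
    """Hypothetical stand-in for the app's real target function."""
    canned = {"What is 2 + 2?": "4"}
    return canned.get(query, "I don't know.")


def exact_match(response: str, truth: str) -> float:
    """Toy evaluator (an assumption): 1.0 if the answer matches the truth, else 0.0."""
    return 1.0 if response.strip().lower() == truth.strip().lower() else 0.0


def run_evaluation(
    dataset: List[Dict[str, str]],
    target: Callable[[str], str],
    evaluator: Callable[[str, str], float],
) -> List[Dict[str, object]]:
    """Call the target on every row and score its response against the truth."""
    results = []
    for row in dataset:
        response = target(row["query"])
        results.append(
            {
                "query": row["query"],
                "response": response,
                "truth": row["truth"],
                "score": evaluator(response, row["truth"]),
            }
        )
    return results


# Made-up rows, for illustration only.
dataset = [
    {"query": "What is 2 + 2?", "truth": "4"},
    {"query": "What is the capital of France?", "truth": "Paris"},
]

results = run_evaluation(dataset, get_chat_response_stub, exact_match)
for r in results:
    print(r["query"], "->", r["response"], "| score:", r["score"])
```

Rerunning a loop like this after changing the system prompt gives a like-for-like comparison of scores across the same questions.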
39
38
40
-
### Set your evaluation model
41
-
42
-
Choose the evaluation model you want to use. It can be the same as a chat model you used to build the app. If you want a different model for evaluation, you need to deploy it, or specify it if it already exists. For example, you might be using `gpt-35-turbo` for your chat completions, but want to use `gpt-4` for evaluation since it might perform better.
43
-
44
-
Add your evaluation model name in your **.env** file:
45
-
46
-
```env
47
-
AZURE_OPENAI_EVALUATION_DEPLOYMENT=<your evaluation model deployment name>
48
-
```
49
-
50
39
### Create evaluation dataset
51
40
52
-
Use the following evaluation dataset, which contains example questions and expected answers (truth).
41
+
Use the following evaluation dataset, which contains example questions and expected answers (truth).
53
42
54
-
1. Create a file called **eval_dataset.jsonl** in your **rag-tutorial** folder. See the [application code structure](copilot-sdk-build-rag.md) for reference.
43
+
1. Create a file called **chat_eval_data.jsonl** in your **assets** folder.
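As a rough sketch of the JSONL layout, assuming one JSON object per line: the helper names, the `query` and `truth` field names, the placeholder rows, and the temporary directory below are illustrative assumptions, not the tutorial's actual dataset.

```python
import json
import tempfile
from pathlib import Path

# Placeholder rows; the "query"/"truth" field names are assumptions.
rows = [
    {"query": "example question 1", "truth": "expected answer 1"},
    {"query": "example question 2", "truth": "expected answer 2"},
]


def write_jsonl(path, rows):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# A temporary directory stands in for the tutorial's assets folder.
dataset_path = Path(tempfile.mkdtemp()) / "chat_eval_data.jsonl"
write_jsonl(dataset_path, rows)
loaded = read_jsonl(dataset_path)
print(f"Loaded {len(loaded)} rows from {dataset_path.name}")
```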
The main function at the end allows you to view the evaluation result locally, and gives you a link to the evaluation results in AI Studio.
80
69
70
+
### Create helper script
71
+
72
+
The evaluation script uses a helper script to define the target function and run the evaluation. Create a file called **config.py** in your main folder. Add the following code:
1. From your console, sign in to your Azure account with the Azure CLI:
@@ -89,8 +84,7 @@ The main function at the end allows you to view the evaluation result locally, a
89
84
1. Install the required packages:
90
85
91
86
```bash
92
-
pip install promptflow-evals
93
-
pip install promptflow-azure
87
+
pip install azure_ai-evaluation[remote]
94
88
```
95
89
96
90
1. Now run the evaluation script:
@@ -99,8 +93,6 @@ The main function at the end allows you to view the evaluation result locally, a
99
93
python evaluate.py
100
94
```
101
95
102
-
For more information about using the prompt flow SDK for evaluation, see [Evaluate with the prompt flow SDK](../how-to/develop/evaluate-sdk.md).
103
-
104
96
### Interpret the evaluation output
105
97
106
98
In the console output, you see an answer for each question, along with the summarized metrics in this table format. (You might see different columns in your output.)
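To illustrate how per-question scores roll up into summarized metrics, here is a minimal sketch. The metric names (`relevance`, `groundedness`), the score values, and the `summarize` helper are made up for illustration and are not actual evaluation output.

```python
from statistics import mean

# Made-up per-question scores; metric names are assumptions.
rows = [
    {"question": "q1", "relevance": 5, "groundedness": 4},
    {"question": "q2", "relevance": 3, "groundedness": 4},
]


def summarize(rows, metric_names):
    """Average each metric across all rows."""
    return {name: mean(row[name] for row in rows) for name in metric_names}


summary = summarize(rows, ["relevance", "groundedness"])
for name, value in summary.items():
    print(f"{name:<14}{value:.2f}")
```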
@@ -148,6 +140,9 @@ For more information about evaluation results in AI Studio, see [How to view eva
148
140
149
141
Now that you verified your chat app behaves as expected, you're ready to deploy your application.
150
142
143
+
> [!NOTE]
144
+
> The rest of this tutorial is the old version; nothing else has been updated yet. Stop here for now.
145
+
151
146
## <a name="deploy"></a>Deploy the chat app to Azure
152
147
153
148
Now let's go ahead and deploy this chat app to a managed endpoint so that it can be consumed by an external application or website.
@@ -182,7 +177,7 @@ As part of creating the deployment, your **copilot_flow** folder is packaged as
182
177
> [!IMPORTANT]
183
178
> Deploying your application to a managed endpoint in Azure has associated compute cost based on the instance type you choose. Make sure you are aware of the associated cost and have quota for the instance type you specify. Learn more about [online endpoints](/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).
184
179
185
-
Create the file **deploy.py** in the **rag-tutorial** folder. Add the following code:
180
+
Create the file **deploy.py** in the main folder. Add the following code: