Commit 2f4f15f

committed: address PR feedback
1 parent 3433ad0 commit 2f4f15f


articles/ai-services/openai/how-to/evaluations.md

Lines changed: 7 additions & 6 deletions
@@ -5,6 +5,7 @@ description: Learn how to use evaluations with Azure OpenAI
manager: nitinme
ms.service: azure-ai-openai
ms.topic: how-to
+ms.custom: references_regions
ms.date: 11/10/2024
author: mrbullwinkle
ms.author: mbullwin
@@ -124,13 +125,13 @@ Testing criteria is used to assess the effectiveness of each output generated by

You'll see the first three lines of the file as a preview:

-:::image type="content" source="../media/how-to/evaluations/preview.png" alt-text="Screenshot that shows a preview of an uploaded evaluation file" lightbox="../media/how-to/evaluations/preview.png":::
+:::image type="content" source="../media/how-to/evaluations/preview.png" alt-text="Screenshot that shows a preview of an uploaded evaluation file." lightbox="../media/how-to/evaluations/preview.png":::

5. Select the toggle for **Generate responses**. Select `{{item.input}}` from the dropdown. This will inject the input fields from our evaluation file into individual prompts for a new model run that we want to be able to compare against our evaluation dataset. The model will take that input and generate its own unique outputs, which in this case will be stored in a variable called `{{sample.output_text}}`. We'll then use that sample output text later as part of our testing criteria. Alternatively, you could provide your own custom system message and individual message examples manually. (A sketch of this flow follows this hunk.)

6. Select which model you want to use to generate responses based on your evaluation. If you don't have a model, you can create one. For the purposes of this example, we're using a standard deployment of `gpt-4o-mini`.

-:::image type="content" source="../media/how-to/evaluations/item-input.png" alt-text="Screenshot of the generate responses UX with a model selected." lightbox="../media/how-to/evaluations/item-input.png":::
+:::image type="content" source="../media/how-to/evaluations/item-input.png" alt-text="Screenshot of the UX for generating model responses with a model selected." lightbox="../media/how-to/evaluations/item-input.png":::

The settings/sprocket symbol controls the basic parameters that are passed to the model. Only the following parameters are supported at this time:

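To make steps 5 and 6 concrete, here's a minimal, hypothetical sketch of what they amount to: an evaluation `.jsonl` file whose per-item `input`/`output` fields back the `{{item.input}}`/`{{item.output}}` placeholders, and a model call whose reply plays the role of `{{sample.output_text}}`. The endpoint, environment variables, file name, and deployment name are placeholder assumptions; this is not how the evaluations service itself runs the job.

```python
# A minimal sketch (not the portal's implementation) of the evaluation data
# and the "Generate responses" step. Names and endpoints are placeholders.
import json
import os

from openai import AzureOpenAI  # pip install openai

# Each line of the evaluation file is one item; {{item.input}} and
# {{item.output}} refer to these per-item fields.
items = [
    {"input": "What is the capital of France?", "output": "Paris"},
    {"input": "What is 2 + 2?", "output": "4"},
]
with open("eval.jsonl", "w", encoding="utf-8") as f:
    for item in items:
        f.write(json.dumps(item) + "\n")

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# Roughly what "Generate responses" does per item: {{item.input}} becomes the
# prompt, and the reply plays the role of {{sample.output_text}}.
for item in items:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # your deployment name may differ
        messages=[{"role": "user", "content": item["input"]}],
    )
    sample_output_text = response.choices[0].message.content
    print(sample_output_text)
```
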
@@ -144,7 +145,7 @@ Testing criteria is used to assess the effectiveness of each output generated by

8. Select **Semantic Similarity**. Under **Compare**, add `{{item.output}}`; under **With**, add `{{sample.output_text}}`. This will take the original reference output from your evaluation `.jsonl` file and compare it against the output that will be generated by giving the model prompts based on your `{{item.input}}`. (A conceptual sketch of this comparison follows this hunk.)

-:::image type="content" source="../media/how-to/evaluations/semantic-similarity-config.png" alt-text="Screenshot of the semantic similarity UX config" lightbox="../media/how-to/evaluations/semantic-similarity-config.png":::
+:::image type="content" source="../media/how-to/evaluations/semantic-similarity-config.png" alt-text="Screenshot of the semantic similarity UX config." lightbox="../media/how-to/evaluations/semantic-similarity-config.png":::

9. Select **Add**. At this point you can either add additional testing criteria or select **Create** to initiate the evaluation job run.

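The article doesn't publish how the managed **Semantic Similarity** grader scores responses, so purely as a conceptual illustration of step 8's comparison, the sketch below embeds the reference output (`{{item.output}}`) and the generated output (`{{sample.output_text}}`) and measures cosine similarity. The embedding deployment name and the idea of a pass/fail threshold are assumptions, not the service's behavior.

```python
# Conceptual only: this does not reproduce the managed grader's scoring.
import math
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

reference = "Paris"                 # {{item.output}} from the .jsonl file
generated = "The capital is Paris"  # {{sample.output_text}} from the model run

embeddings = client.embeddings.create(
    model="text-embedding-3-small",  # placeholder embedding deployment name
    input=[reference, generated],
)
score = cosine_similarity(embeddings.data[0].embedding, embeddings.data[1].embedding)
print(f"similarity: {score:.3f}")  # a threshold would decide pass/fail
```
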
@@ -156,7 +157,7 @@ Testing criteria is used to assess the effectiveness of each output generated by

:::image type="content" source="../media/how-to/evaluations/test-complete.png" alt-text="Screenshot of a completed semantic similarity test with a mix of passes and failures." lightbox="../media/how-to/evaluations/test-complete.png":::

-12. For semantic similarity **View output details** contains a JSON representation that you can copy/paste of the your passing tests.
+12. For semantic similarity, **View output details** contains a JSON representation of your passing tests that you can copy and paste.

:::image type="content" source="../media/how-to/evaluations/output-details.png" alt-text="Screenshot of the evaluation status UX with output details." lightbox="../media/how-to/evaluations/output-details.png":::

@@ -253,13 +254,13 @@ Verifies if the output is valid JSON or XML.

Ensures the output follows the specified structure.

-:::image type="content" source="../media/how-to/evaluations/matches-schema.png" alt-text="Screenshot of the matches schema testing criteria" lightbox="../media/how-to/evaluations/matches-schema.png":::
+:::image type="content" source="../media/how-to/evaluations/matches-schema.png" alt-text="Screenshot of the matches schema testing criteria." lightbox="../media/how-to/evaluations/matches-schema.png":::

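As a rough functional analogue (not the service's implementation) of the valid-JSON and matches-schema checks, one could parse the model output and validate it with the `jsonschema` package; the schema below is invented for illustration.

```python
# Illustrative analogue of the "Valid JSON" and "Matches schema" criteria.
import json

from jsonschema import ValidationError, validate  # pip install jsonschema

# A made-up example schema; the real schema is whatever you specify.
schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer"],
}

model_output = '{"answer": "Paris", "confidence": 0.97}'

try:
    parsed = json.loads(model_output)         # valid-JSON check
    validate(instance=parsed, schema=schema)  # matches-schema check
    print("pass")
except (json.JSONDecodeError, ValidationError) as err:
    print(f"fail: {err}")
```
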
### Criteria match

Assesses if the model's response matches your criteria. Grade: Pass or Fail.

-:::image type="content" source="../media/how-to/evaluations/criteria-match.png" alt-text="Screenshot of the matches criteria test" lightbox="../media/how-to/evaluations/criteria-match.png":::
+:::image type="content" source="../media/how-to/evaluations/criteria-match.png" alt-text="Screenshot of the matches criteria test." lightbox="../media/how-to/evaluations/criteria-match.png":::

You can view the prompt text that is used as part of this testing criterion by selecting the dropdown next to the prompt. The current prompt text is:
