You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-services/openai/how-to/evaluations.md
+4-4Lines changed: 4 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -146,11 +146,11 @@ When you click into each testing criteria, you will see different types of grade
146
146
147
147
4. Select your evaluation data which will be in `.jsonl` format. If you already have an existing data, you can select one, or upload a new data.
148
148
149
-
:::image type="content" source="../media/how-to/evaluations/upload-data-1.png" alt-text="Screenshot of data upload" lightbox="../media/how-to/evaluations/upload-data-1.png":::
149
+
:::image type="content" source="../media/how-to/evaluations/upload-data-1.png" alt-text="Screenshot of data upload." lightbox="../media/how-to/evaluations/upload-data-1.png":::
150
150
151
151
When you upload new data, you'll see the first three lines of the file as a preview on the right side:
152
152
153
-
:::image type="content" source="../media/how-to/evaluations/upload-data-2.png" alt-text="Screenshot of data upload" lightbox="../media/how-to/evaluations/upload-data-2.png":::
153
+
:::image type="content" source="../media/how-to/evaluations/upload-data-2.png" alt-text="Screenshot of data upload." lightbox="../media/how-to/evaluations/upload-data-2.png":::
154
154
155
155
If you need a sample test file, you can use this sample `.jsonl` text. This sample contains sentences of various technical content, and we are going to be assessing semantic similarity across these sentences.
156
156
@@ -169,11 +169,11 @@ When you click into each testing criteria, you will see different types of grade
169
169
170
170
5. If you would like to create new responses using inputs from your test data, you can select 'Generate new responses'. This will inject the input fields from our evaluation file into individual prompts for a model of your choice to generate output.
171
171
172
-
:::image type="content" source="../media/how-to/evaluations/eval-generate-1.png" alt-text="Screenshot of the UX for generating model responses" lightbox="../media/how-to/evaluations/eval-generate-1.png":::
172
+
:::image type="content" source="../media/how-to/evaluations/eval-generate-1.png" alt-text="Screenshot of the UX for generating model responses." lightbox="../media/how-to/evaluations/eval-generate-1.png":::
173
173
174
174
You will select the model of your choice. If you do not have a model, you can create a new model deployment. The selected model will take the input data and generate its own unique outputs, which in this case will be stored in a variable called `{{sample.output_text}}`. We'll then use that output later as part of our testing criteria. Alternatively, you could provide your own custom system message and individual message examples manually.
175
175
176
-
:::image type="content" source="../media/how-to/evaluations/eval-generate-2.png" alt-text="Screenshot of the UX for generating model responses" lightbox="../media/how-to/evaluations/eval-generate-2.png":::
176
+
:::image type="content" source="../media/how-to/evaluations/eval-generate-2.png" alt-text="Screenshot of the UX for generating model responses." lightbox="../media/how-to/evaluations/eval-generate-2.png":::
177
177
178
178
6. For creating a test criteria, select **Add**. For the example file we provided, we are going to be assessing semantic similarity. Select **Model Scorer**, which contains test criteria presets for Semantic Similarity.
0 commit comments