
Commit b7a8dac

Merge pull request #275644 from lgayhardt/evalplayground/release-build-azure-ai-studio-follow

Add Screenshots for eval prompts playground

2 parents 134e7fa + ebc8b27, commit b7a8dac

File tree: 4 files changed (+8, −6 lines)


articles/ai-studio/how-to/evaluate-prompts-playground.md

Lines changed: 8 additions & 6 deletions
```diff
@@ -34,9 +34,12 @@ In this article you learn to:
 
 To generate manual evaluation results, you need to have the following ready:
 
-* A test dataset in one of these formats: csv or jsonl. If you don't have a dataset available, we also allow you to input data manually from the UI.
+* A test dataset in one of these formats: csv or jsonl. If you don't have a dataset available, we also allow you to input data manually from the UI.
 
-* A deployment of one of these models: GPT 3.5 models, GPT 4 models, or Davinci models. Learn more about how to create a deployment [here](./deploy-models-openai.md).
+* A deployment of one of these models: GPT 3.5 models, GPT 4 models, or Davinci models. To learn more about how to create a deployment, see [Deploy models](./deploy-models-openai.md).
+
+> [!NOTE]
+> Manual evaluation is only supported for Azure OpenAI models at this time for chat and completion task types.
 
 ## Generate your manual evaluation results
 
```
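The prerequisites above call for a test dataset in csv or jsonl format. As a minimal sketch of what such a jsonl file might look like, the snippet below writes one with Python; the column names (`question`, `expected`) and file name are illustrative, and you would map them to your own fields in the playground's column-mapping step.

```python
# Sketch of a minimal JSONL test dataset for manual evaluation.
# Column names and file name are illustrative, not required by the playground.
import json

rows = [
    {"question": "What is the capital of France?", "expected": "Paris"},
    {"question": "Summarize the return policy.", "expected": "Items may be returned within 30 days."},
]

# JSON Lines format: one JSON object per line.
with open("manual-eval-test-data.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```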

```diff
@@ -46,8 +49,7 @@ This can be done manually using the text boxes in the **Input** column.
 
 You can also **Import Data** to choose one of your previous existing datasets in your project or upload a dataset that is in CSV or JSONL format. After loading your data, you'll be prompted to map the columns appropriately. Once you finish and select **Import**, the data is populated appropriately in the columns below.
 
-:::image type="content" source="../media/evaluations/prompts/generate-manual-eval-results.gif" alt-text="GIF of generating manual evaluation results." lightbox= "../media/evaluations/prompts/generate-manual-eval-results.gif":::
-
+:::image type="content" source="../media/evaluations/prompts/generate-manual-evaluation-results.png" alt-text="Screenshot of generating manual evaluation results." lightbox= "../media/evaluations/prompts/generate-manual-evaluation-results.png":::
 
 > [!NOTE]
 > You can add as many as 50 input rows to your manual evaluation. If your test data has more than 50 input rows, we will upload the first 50 in the input column.
```
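Per the note above, only the first 50 input rows of a larger dataset are uploaded. A sketch of trimming a JSONL file to that limit ahead of time, so the cutoff is explicit rather than silent; the file names and sample rows are illustrative:

```python
# Trim a JSONL dataset to the playground's 50-row input limit.
# File names and row contents are illustrative.
import json

MAX_ROWS = 50

# Create a sample dataset with more rows than the limit.
with open("large-test-data.jsonl", "w", encoding="utf-8") as f:
    for i in range(75):
        f.write(json.dumps({"question": f"Sample question {i}"}) + "\n")

# Keep only the first 50 rows, mirroring the playground's upload behavior.
with open("large-test-data.jsonl", encoding="utf-8") as src:
    first_rows = [line for line in src][:MAX_ROWS]

with open("trimmed-test-data.jsonl", "w", encoding="utf-8") as dst:
    dst.writelines(first_rows)
```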
```diff
@@ -58,7 +60,7 @@ Now that your data is added, you can **Run** to populate the output column with
 
 You can provide a thumb up or down rating to each response to assess the prompt output. Based on the ratings you provided, you can view these response scores in the at-a-glance summaries.
 
-:::image type="content" source="../media/evaluations/prompts/rate-results.gif" alt-text="GIF of response scores in the at-a-glance summaries." lightbox= "../media/evaluations/prompts/rate-results.gif":::
+:::image type="content" source="../media/evaluations/prompts/rate-results.png" alt-text="Screenshot of response scores in the at-a-glance summaries." lightbox= "../media/evaluations/prompts/rate-results.png":::
 
 ## Iterate on your prompt and reevaluate
 
```
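The at-a-glance summary described above is essentially a tally of the thumbs-up and thumbs-down ratings across responses. A small sketch of that computation, with illustrative ratings:

```python
# Sketch of the at-a-glance summary: tally per-response thumb ratings.
# The ratings list is illustrative sample data.
from collections import Counter

ratings = ["up", "up", "down", "up", "down"]
summary = Counter(ratings)

print(f"{summary['up']} thumbs up, {summary['down']} thumbs down")
```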

```diff
@@ -70,7 +72,7 @@ After making your edits, you can choose to rerun all to update the entire table
 
 After populating your results, you can **Save results** to share progress with your team or to continue your manual evaluation from where you left off later.
 
-:::image type="content" source="../media/evaluations/prompts/save-and-compare-results.gif" alt-text="GIF of the save results workflow." lightbox= "../media/evaluations/prompts/save-and-compare-results.gif":::
+:::image type="content" source="../media/evaluations/prompts/save-and-compare-results.png" alt-text="Screenshot of the save results." lightbox= "../media/evaluations/prompts/save-and-compare-results.png":::
 
 You can also compare the thumbs up and down ratings across your different manual evaluations by saving them and viewing them in the Evaluation tab under Manual evaluation.
```

3 binary image files changed: 323 KB, 246 KB, 282 KB

0 commit comments
