articles/machine-learning/prompt-flow/how-to-bulk-test-evaluate-flow.md
9 additions & 0 deletions
@@ -44,6 +44,8 @@ A batch run allows you to run your flow with a large dataset and generate output
To start a batch run with evaluation, select the **"Batch run"** button in the top right corner of your flow page.
+
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-button.png" alt-text="Screenshot of Web Classification with batch run highlighted. " lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-button.png":::
+
To submit a batch run, select a dataset to test your flow with. You can also select an evaluation method to calculate metrics for your flow output. If you don't want to use an evaluation method, you can skip this step and submit the batch run without calculating any metrics. You can also start a new round of evaluation later.
First, you're asked to give your batch run a descriptive and recognizable name. You can also write a description and add tags (key-value pairs) to your batch run. After you finish the configuration, select **"Next"** to continue.
@@ -94,13 +96,16 @@ After submission, you can find the submitted batch run in the run list tab in pr
In the run detail page, you can select **Overview** to check the details of this batch run.
+
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-overview.png" alt-text="Screenshot of batch run detail page where you view detailed information. " lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-overview.png":::
In the overview panel, you can check the metadata of this run. You can also go to the **Outputs** tab in the batch run detail page to check the outputs/responses generated by the flow with the dataset that you provided. You can also select **"Export"** to export and download the outputs in a `.csv` file.
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-detail-output.png" alt-text="Screenshot of batch run detail page on the outputs tab where you check batch run outputs. " lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-detail-output.png":::
You can **select an evaluation run** from the dropdown box, and you'll see columns appended at the end of the table showing the evaluation result for each row of data. You can use the output column "grade" to locate the results that were predicted incorrectly.
+
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-detail-output-evaluation.png" alt-text="Screenshot of batch run detail page on the outputs tab where evaluation results are appended. " lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-detail-output-evaluation.png":::
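If you export the outputs as a `.csv` file, the same check can be done offline. The following is a minimal sketch, not part of prompt flow itself: it assumes the exported file has a `grade` column whose passing value is `Correct`, and the other column names (`line_number`, `url`, `category`) and the sample rows are hypothetical placeholders.

```python
import csv
import io


def find_misgraded_rows(csv_text, grade_column="grade", passing_grade="Correct"):
    """Return the rows whose grade differs from the passing grade.

    The column name and the passing value are assumptions; adjust them
    to match the columns that your evaluation method actually produces.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[grade_column].strip() != passing_grade]


# Hypothetical sample standing in for the contents of an exported outputs file.
sample_export = """line_number,url,category,grade
0,https://example.com/a,App,Correct
1,https://example.com/b,News,Incorrect
2,https://example.com/c,Channel,Correct
"""

misgraded = find_misgraded_rows(sample_export)
for row in misgraded:
    print(row["line_number"], row["category"], row["grade"])
```

To use this with a real export, read the file with `open(path, newline="")` and pass its text to the function in place of `sample_export`.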
+
To view the overall performance, you can select the **Metrics** tab, and you can see various metrics that indicate the quality of each variant.
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-detail-metrics.png" alt-text="Screenshot of batch run detail page on the metrics tab where you check the overall performance. " lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-detail-metrics.png":::
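To make the idea of an aggregate metric concrete, the following sketch computes one plausible metric, accuracy, from per-row grades for each variant. It is an illustration under stated assumptions rather than the calculation that any built-in evaluation method performs: the variant names, the grade values, and the `accuracy` helper are all hypothetical.

```python
from collections import Counter


def accuracy(grades, passing_grade="Correct"):
    """Fraction of rows whose grade equals the passing grade (0.0 if empty)."""
    if not grades:
        return 0.0
    counts = Counter(grades)
    return counts[passing_grade] / len(grades)


# Hypothetical per-row grades for two variants of the same flow.
runs = {
    "variant_0": ["Correct", "Incorrect", "Correct", "Correct"],
    "variant_1": ["Correct", "Correct", "Correct", "Correct"],
}

# One accuracy value per variant, then the variant with the highest value.
summary = {name: accuracy(grades) for name, grades in runs.items()}
best = max(summary, key=summary.get)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print("best:", best)
```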
@@ -117,6 +122,8 @@ If you have already completed a batch run, you can start another round of evalua
You can select **New evaluation** to start another round of evaluation.
+
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-detail-new-evaluation.png" alt-text="Screenshot of batch run detail page showing where to start a new round of evaluation. " lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-detail-new-evaluation.png":::
+
After setting up the configuration, you can select **"Submit"** for this new round of evaluation. After submission, you'll be able to see a new record in the prompt flow run list.
After the evaluation run completes, you can similarly check the evaluation result in the **"Overview->Output"** tab of the batch run detail page. You need to select the new evaluation run to view its result.
@@ -131,6 +138,8 @@ In some scenarios, you'll modify your flow to improve its performance. You can s
To check the batch run history of your flow, you can select the **"View batch run"** button on the top right corner of your flow page. You'll see a list of batch runs that you have submitted for this flow.
+
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-history.png" alt-text="Screenshot of Web Classification with the view bulk runs button selected." lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-history.png":::
+
You can select each batch run to check its details. You can also select multiple batch runs and then select **"Visualize outputs"** to compare the metrics and the outputs of these batch runs.
:::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-history-list.png" alt-text="Screenshot of batch runs showing the history." lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-history-list.png":::