Commit 854c404

Merge branch 'pf/SKintegration' of https://github.com/jiaochenlu/azure-docs-pr into pf/SKintegration
2 parents fbeda11 + 1085581

File tree

1 file changed: +20 -20 lines changed


articles/machine-learning/prompt-flow/how-to-evaluate-semantic-kernel.md

Lines changed: 20 additions & 20 deletions
@@ -47,7 +47,7 @@ Similar to the integration of Langchain with Prompt flow, Semantic Kernel, which
 > [!IMPORTANT]
 > Prior to developing the flow, it's essential to install the [Semantic Kernel package](/semantic-kernel/get-started/quick-start-guide/?toc=%2Fsemantic-kernel%2Ftoc.json&tabs=python) in your runtime environment for the executor.
 
-To learn more, see [Customize environment for runtime](./how-to-customize-environment-runtime.md) for guidance .
+To learn more, see [Customize environment for runtime](./how-to-customize-environment-runtime.md) for guidance.
 
 > [!IMPORTANT]
 > The approach to consume OpenAI or Azure OpenAI in Semantic Kernel is to obtain the keys you have specified in environment variables or stored in a `.env` file.
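Because Semantic Kernel resolves OpenAI or Azure OpenAI keys from environment variables (or a `.env` file), the secrets held in a connection can simply be exported into the environment before the kernel is constructed. Below is a minimal, standard-library-only sketch of that idea; the `connection` dict and the variable names `OPENAI_API_KEY` and `OPENAI_ORG_ID` are illustrative assumptions, not the Prompt flow or Semantic Kernel API:

```python
import os

def export_connection_to_env(connection: dict) -> None:
    """Copy secrets from a connection-like dict into environment
    variables, where .env-style loaders can find them.
    The key names used by the caller are illustrative; match them
    to what your Semantic Kernel setup expects."""
    for env_name, value in connection.items():
        os.environ[env_name] = value

# Example: a stand-in dict playing the role of a custom connection.
connection = {
    "OPENAI_API_KEY": "sk-placeholder",
    "OPENAI_ORG_ID": "org-placeholder",
}
export_connection_to_env(connection)
```

After this runs, any code that reads `os.environ["OPENAI_API_KEY"]` (as `.env`-based configuration typically does) picks up the connection's secret.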
@@ -73,54 +73,54 @@ Once the setup is complete, you can conveniently convert your existing Semantic
 
 For example, we can create a flow with a Semantic Kernel planner that solves math problems. Follow this [documentation](/semantic-kernel/ai-orchestration/planners/evaluate-and-deploy-planners/create-a-prompt-flow-with-semantic-kernel) with the steps necessary to create a simple Prompt flow with Semantic Kernel at its core.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/semantic-kernel-flow.png" alt-text="Create a flow with semantic kernel planner" lightbox = "./media/how-to-evaluate-semantic-kernel/semantic-kernel-flow.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/semantic-kernel-flow.png" alt-text="Screenshot of creating a flow with semantic kernel planner." lightbox = "./media/how-to-evaluate-semantic-kernel/semantic-kernel-flow.png":::
 
 Set up the connection in Python code.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/set-connection-in-python.png" alt-text="Set custom connection in python node" lightbox = "./media/how-to-evaluate-semantic-kernel/set-connection-in-python.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/set-connection-in-python.png" alt-text="Screenshot of setting custom connection in python node." lightbox = "./media/how-to-evaluate-semantic-kernel/set-connection-in-python.png":::
 
 Select the connection object in the node input, and set the model name of OpenAI or the deployment name of Azure OpenAI.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/set-key-model.png" alt-text="Set model and key in node input" lightbox = "./media/how-to-evaluate-semantic-kernel/set-key-model.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/set-key-model.png" alt-text="Screenshot of setting model and key in node input." lightbox = "./media/how-to-evaluate-semantic-kernel/set-key-model.png":::
 
 ### Batch testing your plugins and planners
 
 Instead of manually testing different scenarios one by one, you can now automatically run large batches of tests using Prompt flow and benchmark data.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/using-batch-runs-with-prompt-flow.png" alt-text="Batch runs with prompt flow for Semantic kernel" lightbox = "./media/how-to-evaluate-semantic-kernel/using-batch-runs-with-prompt-flow.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/using-batch-runs-with-prompt-flow.png" alt-text="Screenshot of batch runs with prompt flow for Semantic kernel." lightbox = "./media/how-to-evaluate-semantic-kernel/using-batch-runs-with-prompt-flow.png":::
 
 Once the flow has passed the single test run in the previous step, you can effortlessly create a batch test in Prompt flow by following these steps:
 1. Create benchmark data in a *jsonl* file, which contains a list of JSON objects with the input and the correct ground truth.
 1. Select *Batch run* to create a batch test.
 1. Complete the batch run settings, especially the data part.
-1. Submit run without evaluation (for this specific batch test, the *Evaluation st*ep can be skipped).
+1. Submit the run without evaluation (for this specific batch test, the *Evaluation step* can be skipped).
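For reference, *jsonl* benchmark data is one JSON object per line, each carrying the flow's input alongside the expected ground truth. A small self-contained sketch of writing and reading such a file; the `question`/`answer` field names and the file name are assumptions, so match them to your flow's inputs and your evaluator:

```python
import json

# Hypothetical benchmark rows: flow input plus expected ground truth.
rows = [
    {"question": "What is 37593 * 67?", "answer": "2518731"},
    {"question": "What is 2 + 2?", "answer": "4"},
]

# Write one JSON object per line (the jsonl convention).
with open("bench.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Read it back the same way: parse each line independently.
with open("bench.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

assert loaded == rows
```

The batch run maps each row's fields onto the flow inputs, so the field names in the file must match the input names defined in the flow.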

-In our [updated docs](/semantic-kernel/ai-orchestration/planners/evaluate-and-deploy-planners/running-batches-with-prompt-flow?tabs=gpt-35-turbo), we demonstrate how you can use this functionality to run batch tests on a planner that uses a math plugin. By defining a bunch of word problems, we can quickly test any changes we make to our plugins or planners so we can catch regressions early and often.
+In our [Running batches with Prompt flow](/semantic-kernel/ai-orchestration/planners/evaluate-and-deploy-planners/running-batches-with-prompt-flow?tabs=gpt-35-turbo) documentation, we demonstrate how you can use this functionality to run batch tests on a planner that uses a math plugin. By defining a set of word problems, we can quickly test any changes we make to our plugins or planners and catch regressions early and often.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/semantic-kernel-test-data.png" alt-text="Data of batch runs with prompt flow for Semantic kernel" lightbox = "./media/how-to-evaluate-semantic-kernel/semantic-kernel-test-data.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/semantic-kernel-test-data.png" alt-text="Screenshot of data of batch runs with prompt flow for Semantic kernel." lightbox = "./media/how-to-evaluate-semantic-kernel/semantic-kernel-test-data.png":::
 
-In your workspace, you can go to the **Run list** in Prompt flow, click **Details** button, and then click **Output** tab to view the batch run result.
+In your workspace, you can go to the **Run list** in Prompt flow, select the **Details** button, and then select the **Output** tab to view the batch run result.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/run.png" alt-text="Run list" lightbox = "./media/how-to-evaluate-semantic-kernel/run.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/run.png" alt-text="Screenshot of the run list." lightbox = "./media/how-to-evaluate-semantic-kernel/run.png":::
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/run-detail.png" alt-text="Run detail" lightbox = "./media/how-to-evaluate-semantic-kernel/run-detail.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/run-detail.png" alt-text="Screenshot of the run detail." lightbox = "./media/how-to-evaluate-semantic-kernel/run-detail.png":::
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/run-output.png" alt-text="Run output" lightbox = "./media/how-to-evaluate-semantic-kernel/run-output.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/run-output.png" alt-text="Screenshot of the run output." lightbox = "./media/how-to-evaluate-semantic-kernel/run-output.png":::
 
 ### Evaluating the accuracy
 
 Once a batch run is completed, you then need an easy way to determine the adequacy of the test results. This information can then be used to develop accuracy scores, which can be incrementally improved.
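At its simplest, an evaluator grades each prediction against its ground truth and aggregates the grades into an accuracy score. The following is a minimal, self-contained sketch of that idea, not the built-in math accuracy evaluation flow; the whitespace-trimming normalization is an assumption:

```python
def grade(prediction: str, ground_truth: str) -> bool:
    """Line-level grade: exact match after trimming whitespace."""
    return prediction.strip() == ground_truth.strip()

def accuracy(predictions: list[str], ground_truths: list[str]) -> float:
    """Aggregate metric: fraction of lines graded correct."""
    grades = [grade(p, g) for p, g in zip(predictions, ground_truths)]
    return sum(grades) / len(grades) if grades else 0.0

# Hypothetical batch-run outputs vs. benchmark ground truth.
preds = ["2518731", "5", "42"]
truth = ["2518731", "4", "42"]
print(accuracy(preds, truth))  # two of three predictions are correct
```

Real evaluation flows replace `grade` with something more robust (numeric comparison, an LLM judge, and so on), but the per-line grade plus aggregate structure stays the same.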

-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-batch-run-with-prompt-flow.png" alt-text="Evaluating batch run with prompt flow" lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-batch-run-with-prompt-flow.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-batch-run-with-prompt-flow.png" alt-text="Screenshot of evaluating batch run with prompt flow." lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-batch-run-with-prompt-flow.png":::
 
 Evaluation flows in Prompt flow enable this functionality. Using the sample evaluation flows offered by Prompt flow, you can assess various metrics such as **classification accuracy**, **perceived intelligence**, **groundedness**, and more.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-sample-flows.png" alt-text="Evaluation flow samples" lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-sample-flows.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-sample-flows.png" alt-text="Screenshot showing evaluation flow samples." lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-sample-flows.png":::
 
 There's also the flexibility to develop **your own custom evaluators** if needed.
 :::image type="content" source="./media/how-to-evaluate-semantic-kernel/my-evaluator.png" alt-text="My custom evaluation flow" lightbox = "./media/how-to-evaluate-semantic-kernel/my-evaluator.png":::
 
-In Prompt flow, you can quick create an evaluation run based on an completed batch run by following the steps below:
+In Prompt flow, you can quickly create an evaluation run based on a completed batch run by following these steps:
 1. Prepare the evaluation flow and complete a batch run.
 1. Select the *Run* tab on the home page to go to the run list.
 1. Go into the previously completed batch run.
@@ -129,9 +129,9 @@ In Prompt flow, you can quick create an evaluation run based on an completed bat
 1. Submit the run and wait for the result.
 
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/add-evaluation.png" alt-text="Add new evaluation" lightbox = "./media/how-to-evaluate-semantic-kernel/add-evaluation.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/add-evaluation.png" alt-text="Screenshot showing add new evaluation." lightbox = "./media/how-to-evaluate-semantic-kernel/add-evaluation.png":::
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-setting.png" alt-text="Evaluation settings" lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-setting.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-setting.png" alt-text="Screenshot showing evaluation settings." lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-setting.png":::
 
 
 Follow this [documentation](/semantic-kernel/ai-orchestration/planners/evaluate-and-deploy-planners/evaluating-plugins-and-planners-with-prompt-flow?tabs=gpt-35-turbo) for Semantic Kernel to learn more about how to use the [math accuracy evaluation flow](https://github.com/microsoft/promptflow/tree/main/examples/flows/evaluation/eval-accuracy-maths-to-code) to test our planner and see how well it solves word problems.
@@ -140,11 +140,11 @@ After running the evaluator, you’ll get a summary back of your metrics. Initia
 
 To check the metrics, go back to the batch run detail page, select the **Details** button, select the **Output** tab, and then select the evaluation run name in the dropdown list to view the evaluation result.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-result.png" alt-text="Evaluation result" lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-result.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-result.png" alt-text="Screenshot showing evaluation result." lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-result.png":::
 
 You can check the aggregated metric in the **Metrics** tab.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-metrics.png" alt-text="Evaluation metrics" lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-metrics.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/evaluation-metrics.png" alt-text="Screenshot showing evaluation metrics." lightbox = "./media/how-to-evaluate-semantic-kernel/evaluation-metrics.png":::
 
 
 ### Experiments for quality improvement
@@ -169,7 +169,7 @@ To compare, select the runs you wish to analyze, then select the **Visualize out
 
 This presents you with a detailed, line-by-line comparison table of the results from the selected runs.
 
-:::image type="content" source="./media/how-to-evaluate-semantic-kernel/compare-detail.png" alt-text="Compare runs details" lightbox = "./media/how-to-evaluate-semantic-kernel/compare-detail.png":::
+:::image type="content" source="./media/how-to-evaluate-semantic-kernel/compare-detail.png" alt-text="Screenshot of compare runs details." lightbox = "./media/how-to-evaluate-semantic-kernel/compare-detail.png":::
 
 ## Next steps