Commit aba3ae3

authored
Acrolinx
1 parent 3139007 commit aba3ae3

1 file changed (+2, -2 lines changed)

articles/machine-learning/prompt-flow/how-to-bulk-test-evaluate-flow.md

Lines changed: 2 additions & 2 deletions
@@ -126,7 +126,7 @@ You can select **Evaluate** to start another round of evaluation.
 
 After setting up the configuration, you can select **"Submit"** for this new round of evaluation. After submission, you'll be able to see a new record in the prompt flow run list.
 
-After the evaluation run completed, similarly, you can check the result of evaluation in the **"Outputs"** tab of the batch run detail panel. You need select the new evaluation run to view its result.
+After the evaluation run completed, similarly, you can check the result of evaluation in the **"Outputs"** tab of the batch run detail panel. You need to select the new evaluation run to view its result.
 
 :::image type="content" source="./media/how-to-bulk-test-evaluate-flow/batch-run-detail-output-new-evaluation.png" alt-text="Screenshot of batch run detail page on the output tab with checking the new evaluation output." lightbox = "./media/how-to-bulk-test-evaluate-flow/batch-run-detail-output-new-evaluation.png":::
 
@@ -183,7 +183,7 @@ System message, sometimes referred to as a metaprompt or [system prompt](../../c
 
 ## Further reading: Guidance for creating Golden Datasets used for Copilot quality assurance
 
-The creation of copilot that use Large Language Models (LLMs) typically involves grounding the model in reality using source datasets. However, to ensure that the LLMs provide the most accurate and useful responses to customer queries, a "Golden Dataset" is necessary.
+The creation of a copilot that use Large Language Models (LLMs) typically involves grounding the model in reality using source datasets. However, to ensure that the LLMs provide the most accurate and useful responses to customer queries, a "Golden Dataset" is necessary.
 
 A Golden Dataset is a collection of realistic customer questions and expertly crafted answers. It serves as a Quality Assurance tool for LLMs used by your copilot. Golden Datasets are not used to train an LLM or inject context into an LLM prompt. Instead, they are utilized to assess the quality of the answers generated by the LLM.
 