
Commit 5bfe50d ("edits")
1 parent: e78f3db

5 files changed: +14 additions, -14 deletions


articles/ai-studio/.openpublishing.redirection.ai-studio.json

Lines changed: 1 addition & 1 deletion

```diff
@@ -122,7 +122,7 @@
 },
 {
 "source_path": "articles/ai-studio/how-to/develop/flow-evaluate-sdk.md",
-"redirect_url": "/azure/ai-studio/how-to/develop/evaluate-sdk.md",
+"redirect_url": "/azure/ai-studio/how-to/develop/evaluate-sdk",
 "redirect_document_id": true
 }
 ]
```

articles/ai-studio/how-to/develop/evaluate-sdk.md

Lines changed: 9 additions & 9 deletions

````diff
@@ -38,10 +38,10 @@ For more in-depth information on each evaluator definition and how it's calculat
 
 | Category | Evaluator class |
 |-----------|------------------------------------------------------------------------------------------------------------------------------------|
-| [Performance and quality](###Performance-and-quality-evaluators) (AI-assisted) | `GroundednessEvaluator`, `RelevanceEvaluator`, `CoherenceEvaluator`, `FluencyEvaluator`, `SimilarityEvaluator` |
-| [Performance and quality](###Performance-and-quality-evaluators) (traditional ML) | `F1ScoreEvaluator`, `RougeScoreEvaluator`, `GleuScoreEvaluator`, `BleuScoreEvaluator`, `MeteorScoreEvaluator`|
-| [Risk and safety](###Risk-and-safety-evaluators ) (AI-assisted) | `ViolenceEvaluator`, `SexualEvaluator`, `SelfHarmEvaluator`, `HateUnfairnessEvaluator`, `IndirectAttackEvaluator`, `ProtectedMaterialEvaluator` |
-| [Composite](###Composite-evaluators) | `QAEvaluator`, `ContentSafetyEvaluator` |
+| [Performance and quality](#performance-and-quality-evaluators) (AI-assisted) | `GroundednessEvaluator`, `RelevanceEvaluator`, `CoherenceEvaluator`, `FluencyEvaluator`, `SimilarityEvaluator` |
+| [Performance and quality](#performance-and-quality-evaluators) (traditional ML) | `F1ScoreEvaluator`, `RougeScoreEvaluator`, `GleuScoreEvaluator`, `BleuScoreEvaluator`, `MeteorScoreEvaluator`|
+| [Risk and safety](#risk-and-safety-evaluators ) (AI-assisted) | `ViolenceEvaluator`, `SexualEvaluator`, `SelfHarmEvaluator`, `HateUnfairnessEvaluator`, `IndirectAttackEvaluator`, `ProtectedMaterialEvaluator` |
+| [Composite](#composite-evaluators) | `QAEvaluator`, `ContentSafetyEvaluator` |
 
 Built-in quality and safety metrics take in query and response pairs, along with additional information for specific evaluators.
 
@@ -161,11 +161,11 @@ You can do this with functionality and attack datasets generated with the [direc
 
 ### Composite evaluators
 Composite evaluators are built in evaluators that combine the individual quality or safety metrics to easily provide a wide range of metrics right out of the box for both query response pairs or chat messages.
-| Composite evaluator | Contains | Description |
-|------------------------------|------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `QAEvaluator` | `GroundednessEvaluator`, `RelevanceEvaluator`, `CoherenceEvaluator`, `FluencyEvaluator`, `SimilarityEvaluator`, `F1ScoreEvaluator` | Combines all the quality evaluators for a single output of combined metrics for query and response pairs |
-| `ContentSafetyEvaluator` | `ViolenceEvaluator`, `SexualEvaluator`, `SelfHarmEvaluator`, `HateUnfairnessEvaluator` | Combines all the safety evaluators for a single output of combined metrics for query and response pairs |
 
+| Composite evaluator | Contains | Description |
+|--|--|--|
+| `QAEvaluator` | `GroundednessEvaluator`, `RelevanceEvaluator`, `CoherenceEvaluator`, `FluencyEvaluator`, `SimilarityEvaluator`, `F1ScoreEvaluator` | Combines all the quality evaluators for a single output of combined metrics for query and response pairs |
+| `ContentSafetyEvaluator` | `ViolenceEvaluator`, `SexualEvaluator`, `SelfHarmEvaluator`, `HateUnfairnessEvaluator` | Combines all the safety evaluators for a single output of combined metrics for query and response pairs |
 
 ## Custom evaluators
 
@@ -386,7 +386,7 @@ The `evaluate()` API has a few requirements for the data format that it accepts
 
 #### Data format
 
-The `evaluate()` API only accepts data in the JSONLines format. For all built-in evaluators, `evaluate()` requires data in the following format with required input fields. See the [previous section on required data input for built-in evaluators](###data-requirements-for-built-in-evaluators).
+The `evaluate()` API only accepts data in the JSONLines format. For all built-in evaluators, `evaluate()` requires data in the following format with required input fields. See the [previous section on required data input for built-in evaluators](#data-requirements-for-built-in-evaluators).
 
 ```json
 {
````
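The JSON Lines shape that `evaluate()` accepts can be sketched in plain Python. This is a minimal illustration, not the SDK itself: the `query`, `response`, and `context` field names follow the evaluator input fields described in the article, and the exact required fields vary per built-in evaluator.

```python
import json

# Illustrative rows in JSON Lines form: one JSON object per line.
# Field names ("query", "response", "context") are the evaluator
# input fields described above; check each evaluator for its own
# required fields.
rows = [
    {"query": "What is the capital of France?",
     "response": "Paris is the capital of France.",
     "context": "France's capital city is Paris."},
    {"query": "Who wrote Hamlet?",
     "response": "Hamlet was written by William Shakespeare.",
     "context": "William Shakespeare wrote the play Hamlet."},
]

with open("data.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Each line of a .jsonl file must parse independently as one JSON object.
with open("data.jsonl", encoding="utf-8") as f:
    parsed = [json.loads(line) for line in f]

print(len(parsed))  # 2
```

A file built this way can then be passed to `evaluate()` via its `data` path argument.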

articles/ai-studio/how-to/evaluate-generative-ai-app.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -163,7 +163,7 @@ From the flow page: From the collapsible left menu, select **Prompt flow** > **E
 The evaluator library is a centralized place that allows you to see the details and status of your evaluators. You can view and manage Microsoft curated evaluators.
 
 > [!TIP]
-> You can use custom evaluators via the prompt flow SDK. For more information, see [Evaluate with the prompt flow SDK](../how-to/develop/flow-evaluate-sdk.md#custom-evaluators).
+> You can use custom evaluators via the prompt flow SDK. For more information, see [Evaluate with the prompt flow SDK](../how-to/develop/evaluate-sdk.md#custom-evaluators).
 
 The evaluator library also enables version management. You can compare different versions of your work, restore previous versions if needed, and collaborate with others more easily.
 
@@ -172,7 +172,7 @@ To use the evaluator library in AI Studio, go to your project's **Evaluation** p
 :::image type="content" source="../media/evaluations/evaluate/evaluator-library-list.png" alt-text="Screenshot of the page to select evaluators from the evaluator library." lightbox="../media/evaluations/evaluate/evaluator-library-list.png":::
 
 You can select the evaluator name to see more details. You can see the name, description, and parameters, and check any files associated with the evaluator. Here are some examples of Microsoft curated evaluators:
-- For performance and quality evaluators curated by Microsoft, you can view the annotation prompt on the details page. You can adapt these prompts to your own use case by changing the parameters or criteria according to your data and objectives [with the prompt flow SDK](../how-to/develop/flow-evaluate-sdk.md#custom-evaluators). For example, you can select *Groundedness-Evaluator* and check the Prompty file showing how we calculate the metric.
+- For performance and quality evaluators curated by Microsoft, you can view the annotation prompt on the details page. You can adapt these prompts to your own use case by changing the parameters or criteria according to your data and objectives [with the prompt flow SDK](../how-to/develop/evaluate-sdk.md#custom-evaluators). For example, you can select *Groundedness-Evaluator* and check the Prompty file showing how we calculate the metric.
 - For risk and safety evaluators curated by Microsoft, you can see the definition of the metrics. For example, you can select the *Self-Harm-Related-Content-Evaluator* and learn what it means and how Microsoft determines the various severity levels for this safety metric
 
```

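The tip in the file above notes that custom evaluators can be used via the prompt flow SDK, where a code-based custom evaluator is a plain Python callable that returns a dictionary of metrics. The sketch below is a hypothetical example under that convention; the `AnswerLengthEvaluator` name, the `response` keyword parameter, and the metric name are illustrative assumptions, not part of the library.

```python
# Hypothetical custom evaluator: a plain Python class whose __call__
# returns a dict of named metrics, following the callable convention
# described for code-based custom evaluators.
class AnswerLengthEvaluator:
    """Scores a response by its word count (illustrative metric only)."""

    def __call__(self, *, response: str, **kwargs):
        # The parameter name "response" is an assumption; match it to
        # the field names in your evaluation data.
        return {"answer_length": len(response.split())}

evaluator = AnswerLengthEvaluator()
result = evaluator(response="The capital of Japan is Tokyo.")
print(result)  # {'answer_length': 6}
```

Because the evaluator is just a callable, it can be unit tested locally before being passed to an evaluation run alongside the built-in evaluators.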
articles/ai-studio/quickstarts/get-started-code.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -271,7 +271,7 @@ You should see an output that looks like this:
 
 Looks like we scored 5 for coherence and fluency of the LLM responses on this conversation!
 
-For more information on how to use prompt flow evaluators, including how to make your own custom evaluators and log evaluation results to AI Studio, be sure to check out [Evaluate your app using the prompt flow SDK](../how-to/develop/flow-evaluate-sdk.md).
+For more information on how to use prompt flow evaluators, including how to make your own custom evaluators and log evaluation results to AI Studio, be sure to check out [Evaluate your app using the prompt flow SDK](../how-to/develop/evaluate-sdk.md).
 
 
 ## Next step
```

articles/ai-studio/tutorials/copilot-sdk-evaluate-deploy.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -99,7 +99,7 @@ The main function at the end allows you to view the evaluation result locally, a
 python evaluate.py
 ```
 
-For more information about using the prompt flow SDK for evaluation, see [Evaluate with the prompt flow SDK](../how-to/develop/flow-evaluate-sdk.md).
+For more information about using the prompt flow SDK for evaluation, see [Evaluate with the prompt flow SDK](../how-to/develop/evaluate-sdk.md).
 
 ### Interpret the evaluation output
 
````
