
Commit f908b67

Update evaluate-sdk.md
1 parent 2eabd71 commit f908b67


articles/ai-foundry/how-to/develop/evaluate-sdk.md

Lines changed: 3 additions & 2 deletions
@@ -19,7 +19,7 @@ author: lgayhardt

 [!INCLUDE [feature-preview](../../includes/feature-preview.md)]

 > [!NOTE]
-> Evaluation with the prompt flow SDK has been retired and replaced with Azure AI Evaluation SDK client library for Python. For more information about input data requirements, see the [API Reference Documentation](https://aka.ms/azureaieval-python-ref).
+> For more information about input data requirements, see the [API Reference Documentation](https://aka.ms/azureaieval-python-ref).

 To thoroughly assess the performance of your generative AI application when applied to a substantial dataset, you can evaluate a generative AI application in your development environment with the Azure AI Evaluation SDK. Given either a test dataset or a target, your generative AI application's generations are quantitatively measured with both mathematically based metrics and AI-assisted quality and safety evaluators. Built-in or custom evaluators can provide you with comprehensive insights into the application's capabilities and limitations.
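
For context on the paragraph above, here is a minimal sketch of a local evaluation run over a test dataset, assuming the `evaluate` entry point and `RelevanceEvaluator` from `azure-ai-evaluation`; the dataset path, endpoint, API key, and deployment name are all placeholders, not values from this commit:

```python
from azure.ai.evaluation import evaluate, RelevanceEvaluator

# Model configuration used by AI-assisted evaluators (placeholder values).
model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-deployment>",
}

# Run a quantitative evaluation over a hypothetical JSONL test dataset
# whose rows contain "query" and "response" columns.
result = evaluate(
    data="test_dataset.jsonl",
    evaluators={"relevance": RelevanceEvaluator(model_config)},
)
print(result["metrics"])  # aggregate scores across all rows
```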

@@ -38,6 +38,7 @@ pip install azure-ai-evaluation

 Built-in evaluators support the following application scenarios:

 - **Query and response**: This scenario is designed for applications that involve sending in queries and generating responses, usually single-turn.
+- **Conversation**: This scenario is designed for applications that involve sending in queries and generating responses in a multi-turn exchange.
 - **Retrieval augmented generation**: This scenario is suitable for applications where the model engages in generation using a retrieval-augmented approach to extract information from your provided documents and generate detailed responses, usually multi-turn.

 For more in-depth information on each evaluator definition and how it's calculated, see [Evaluation and monitoring metrics for generative AI](../../concepts/evaluation-metrics-built-in.md).
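
A sketch of how the two input shapes in the scenario list above differ when calling an evaluator directly, assuming `CoherenceEvaluator` and the same placeholder model configuration; the `conversation` keyword and messages layout follow the multi-turn format the SDK documents, but treat the exact schema here as an assumption:

```python
from azure.ai.evaluation import CoherenceEvaluator

model_config = {  # placeholder values, as above
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-deployment>",
}
coherence = CoherenceEvaluator(model_config)

# Query and response: a single-turn pair.
single_turn = coherence(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
)

# Conversation: a multi-turn exchange passed as a list of messages.
multi_turn = coherence(
    conversation={
        "messages": [
            {"role": "user", "content": "What is the capital of France?"},
            {"role": "assistant", "content": "Paris."},
            {"role": "user", "content": "What about Spain?"},
            {"role": "assistant", "content": "Madrid is the capital of Spain."},
        ]
    }
)
```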
@@ -46,7 +47,7 @@ For more in-depth information on each evaluator definition and how it's calculat
 |-----------|------------------------------------------------------------------------------------------------------------------------------------|
 | [Performance and quality](#performance-and-quality-evaluators) (AI-assisted) | `GroundednessEvaluator`, `GroundednessProEvaluator`, `RetrievalEvaluator`, `RelevanceEvaluator`, `CoherenceEvaluator`, `FluencyEvaluator`, `SimilarityEvaluator` |
 | [Performance and quality](#performance-and-quality-evaluators) (NLP) | `F1ScoreEvaluator`, `RougeScoreEvaluator`, `GleuScoreEvaluator`, `BleuScoreEvaluator`, `MeteorScoreEvaluator` |
-| [Risk and safety](#risk-and-safety-evaluators-preview) (AI-assisted) | `ViolenceEvaluator`, `SexualEvaluator`, `SelfHarmEvaluator`, `HateUnfairnessEvaluator`, `IndirectAttackEvaluator`, `ProtectedMaterialEvaluator` |
+| [Risk and safety](#risk-and-safety-evaluators-preview) (AI-assisted) | `ViolenceEvaluator`, `SexualEvaluator`, `SelfHarmEvaluator`, `HateUnfairnessEvaluator`, `IndirectAttackEvaluator`, `ProtectedMaterialEvaluator`, `UngroundedAttributesEvaluator`, `CodeVulnerabilityEvaluator` |
 | [Composite](#composite-evaluators) | `QAEvaluator`, `ContentSafetyEvaluator` |

 Built-in quality and safety metrics take in query and response pairs, along with additional information for specific evaluators.
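A sketch of that "query and response pairs plus additional information" pattern, assuming `GroundednessEvaluator` (an AI-assisted evaluator that additionally needs the source context) and `F1ScoreEvaluator` (an NLP evaluator that needs no model, only a ground-truth answer); all example strings are illustrative:

```python
from azure.ai.evaluation import GroundednessEvaluator, F1ScoreEvaluator

model_config = {  # placeholder values, as above
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-deployment>",
}

# AI-assisted quality evaluator: query and response plus the grounding context.
groundedness = GroundednessEvaluator(model_config)
groundedness_score = groundedness(
    query="Which tent is the most waterproof?",
    context="The Alpine Explorer tent has a waterproof rating of 3000mm.",
    response="The Alpine Explorer tent is the most waterproof.",
)

# NLP evaluator: compares the response against a ground-truth answer.
f1 = F1ScoreEvaluator()
f1_score = f1(
    response="Paris is the capital of France.",
    ground_truth="The capital of France is Paris.",
)
```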
