Commit cc98cb9

minor update
2 parents f8fd579 + 436ab8c


articles/ai-studio/how-to/develop/evaluate-sdk.md

Lines changed: 6 additions & 6 deletions
@@ -133,7 +133,7 @@ You can use our built-in AI-assisted and NLP quality evaluators to assess the pe
 
 #### Set up
 
-1. For AI-assisted quality evaluators except for `GroundednessProEvaluator`, you must specify a GPT model to act as a judge to score the evaluation data. Choose a deployment with either GPT-3.5, GPT-4, GPT-4o or GPT-4-mini model for your calculations and set it as your `model_config`. We support both Azure OpenAI or OpenAI model configuration schema. We recommend using GPT models that do not have the `(preview)` suffix for the best performance and parseable responses with our evaluators.
+1. For AI-assisted quality evaluators except for `GroundednessProEvaluator`, you must specify a GPT model to act as a judge to score the evaluation data. Choose a deployment with either GPT-3.5, GPT-4, GPT-4o or GPT-4-mini model for your calculations and set it as your `model_config`. We support both Azure OpenAI or OpenAI model configuration schema. We recommend using GPT models that don't have the `(preview)` suffix for the best performance and parseable responses with our evaluators.
 
 > [!NOTE]
 > Make sure you have at least the `Cognitive Services OpenAI User` role for the Azure OpenAI resource to make inference calls with API key. For more permissions, learn more about [permissioning for Azure OpenAI resource](../../../ai-services/openai/how-to/role-based-access-control.md#summary).
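
A minimal sketch of that setup, assuming the `azure-ai-evaluation` package, placeholder environment variables for the endpoint, key, and deployment, and `RelevanceEvaluator` standing in for any AI-assisted quality evaluator:

```python
import os

from azure.ai.evaluation import RelevanceEvaluator

# Azure OpenAI model configuration schema; the OpenAI schema takes "api_key" and "model" instead.
model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],      # https://<resource>.openai.azure.com
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],  # e.g. a GPT-4o deployment name
}

# The GPT judge is passed to each AI-assisted evaluator through model_config.
relevance_eval = RelevanceEvaluator(model_config=model_config)
result = relevance_eval(query="What is the capital of France?", response="Paris is the capital of France.")
print(result)
```
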
@@ -217,7 +217,7 @@ For NLP evaluators, only a score is given in the `{metric_name}` key.
 
 Like 6 other AI-assisted evaluators, `GroundednessEvaluator` is a prompt-based evaluator that outputs a score on a 5-point scale (the higher the score, the more grounded the result is). On the other hand, `GroundednessProEvaluator` invokes our backend evaluation service powered by Azure AI Content Safety and outputs `True` if all content is grounded, or `False` if any ungrounded content is detected.
 
-We open-source the prompts of our quality evaluators except for `GroundednessProEvaluator` (powered by Azure AI Content Safety) for transparency. These prompts serve as instructions for a language model to perform their evaluation task, which requires a human-friendly definition of the metric and its associated scoring rubrics (what the 5 levels of quality means for the metric). We highly recommend that users customize the definitions and grading rubrics to their scenario specifics. See details in [Custom Evaluators](#custom-evaluators).
+We open-source the prompts of our quality evaluators except for `GroundednessProEvaluator` (powered by Azure AI Content Safety) for transparency. These prompts serve as instructions for a language model to perform their evaluation task, which requires a human-friendly definition of the metric and its associated scoring rubrics (what the 5 levels of quality mean for the metric). We highly recommend that users customize the definitions and grading rubrics to their scenario specifics. See details in [Custom Evaluators](#custom-evaluators).
 
 For conversation mode, here is an example for `GroundednessEvaluator`:
 
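A rough sketch of what conversation-mode input for `GroundednessEvaluator` can look like, assuming the conversation schema with a `context` field on assistant turns; the sample content and the `model_config` from the sketch above are illustrative assumptions:

```python
from azure.ai.evaluation import GroundednessEvaluator

groundedness_eval = GroundednessEvaluator(model_config=model_config)

# A multi-turn conversation; each assistant turn carries the context it was grounded on.
conversation = {
    "messages": [
        {"role": "user", "content": "Which tent is the most waterproof?"},
        {
            "role": "assistant",
            "content": "The Alpine Explorer Tent is the most waterproof.",
            "context": "Product catalog: the Alpine Explorer Tent has the highest waterproof rating.",
        },
    ]
}

# Conversation mode aggregates per-turn groundedness scores into an overall result.
result = groundedness_eval(conversation=conversation)
print(result)
```
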
@@ -283,7 +283,7 @@ credential = DefaultAzureCredential()
 
 # Initializing Violence Evaluator with project information
 violence_eval = ViolenceEvaluator(credential=credential, azure_ai_project=azure_ai_project)
-# Running Violence Evaluator on a query and respnose pair
+# Running Violence Evaluator on a query and response pair
 violence_score = violence_eval(query="What is the capital of France?", answer="Paris.")
 print(violence_score)
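
The snippet assumes `azure_ai_project` and `credential` are defined earlier; a minimal sketch of that setup with placeholder values:

```python
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ViolenceEvaluator

# Project scoping used by the risk and safety evaluation service (placeholder values).
azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

# Risk and safety evaluators authenticate with Microsoft Entra ID instead of a judge model.
credential = DefaultAzureCredential()
```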

@@ -665,7 +665,7 @@ After local evaluations of your generative AI applications, you may want to trig
 
 
 ### Prerequisites
-- Azure AI project in the same [regions](#region-support) as risk and safety evaluators. If you do not have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one.
+- Azure AI project in the same [regions](#region-support) as risk and safety evaluators. If you don't have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one.
 
 > [!NOTE]
 > Remote evaluations do not support `Groundedness-Pro-Evaluator`, `Retrieval-Evaluator`, `Protected-Material-Evaluator`, `Indirect-Attack-Evaluator`, `ContentSafetyEvaluator`, and `QAEvaluator`.
@@ -685,7 +685,7 @@ After local evaluations of your generative AI applications, you may want to trig
 ```bash
 pip install azure-identity azure-ai-projects azure-ai-ml
 ```
-Optionally, you can `pip install azure-ai-evaluation` if you want a code-first experience to fetch evaluator id for built-in evaluators in code.
+Optionally you can `pip install azure-ai-evaluation` if you want a code-first experience to fetch evaluator id for built-in evaluators in code.
 
 Now you can define a client and a deployment which will be used to run your remote evaluations:
 ```python
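
A minimal sketch of that client and judge deployment definition, assuming the `azure-ai-projects` client and placeholder connection values:

```python
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Client scoped to the Azure AI project that hosts the remote evaluation (placeholder connection string).
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str="<region>.api.azureml.ms;<subscription-id>;<resource-group>;<project-name>",
)

# GPT deployment and API version used as the judge model by AI-assisted evaluators.
deployment_name = "gpt-4o"
api_version = "2024-06-01"
```
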
@@ -731,7 +731,7 @@ from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, ViolenceEv
 print("F1 Score evaluator id:", F1ScoreEvaluator.id)
 ```
 
-- **From UI**: Follow these steps to fetch evaluator ids after they are registered to your project:
+- **From UI**: Follow these steps to fetch evaluator ids after they're registered to your project:
     - Select **Evaluation** tab in your Azure AI project;
     - Select Evaluator library;
     - Select your evaluator(s) of choice by comparing the descriptions;
