Commit 37e2328

formating

1 parent e812913

5 files changed: +13 -9 lines changed

articles/ai-foundry/concepts/evaluation-evaluators/agent-evaluators.md

Lines changed: 3 additions & 3 deletions
@@ -57,9 +57,9 @@ model_config = AzureOpenAIModelConfiguration(
 
 We support AzureOpenAI or OpenAI [reasoning models](../../../ai-services/openai/how-to/reasoning.md) and non-reasoning models for the LLM-judge depending on the evaluators:
 
-| Evaluators | Reasoning Models as Judge (ex: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (ex: gpt-4.1, gpt-4o, etc.) | To enable |
-|------------|-----------------------------------------------------------------------------|-------------------------------------------------------------|-------|
-| `Intent Resolution` / `Task Adherence` / `Tool Call Accuracy` / `Response Completeness`) | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
+| Evaluators | Reasoning Models as Judge (example: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (example: gpt-4.1, gpt-4o, etc.) | To enable |
+|--|--|--|--|
+| `Intent Resolution`, `Task Adherence`, `Tool Call Accuracy`, `Response Completeness` | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
 | Other quality evaluators| Not Supported | Supported | -- |
 
 For complex evaluation that requires refined reasoning, we recommend a strong reasoning model like `o3-mini` and o-series mini models released afterwards with a balance of reasoning performance and cost efficiency.
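The `is_reasoning_model=True` flag described in the table above is passed when initializing an evaluator. A minimal sketch of how a caller might decide whether to pass it, based on the judge deployment name; the prefix list and helper name here are illustrative assumptions, not part of the documented API:

```python
# Illustrative sketch only: deciding whether an evaluator needs the extra
# is_reasoning_model=True parameter described in the table above.
# REASONING_JUDGE_PREFIXES and judge_kwargs are hypothetical helpers.
REASONING_JUDGE_PREFIXES = ("o1", "o3", "o4")  # o-series judges (assumed set)

def judge_kwargs(deployment: str) -> dict:
    """Extra evaluator-init kwargs for a given judge deployment name."""
    if deployment.lower().startswith(REASONING_JUDGE_PREFIXES):
        # Reasoning-model judges require is_reasoning_model=True at init.
        return {"is_reasoning_model": True}
    # Non-reasoning judges such as gpt-4.1 / gpt-4o need nothing extra.
    return {}

print(judge_kwargs("o3-mini"))  # {'is_reasoning_model': True}
print(judge_kwargs("gpt-4o"))   # {}
```

The returned dict can then be unpacked into the evaluator constructor alongside the model configuration.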

articles/ai-foundry/concepts/evaluation-evaluators/general-purpose-evaluators.md

Lines changed: 2 additions & 2 deletions
@@ -45,9 +45,9 @@ model_config = AzureOpenAIModelConfiguration(
 
 We support AzureOpenAI or OpenAI [reasoning models](../../../ai-services/openai/how-to/reasoning.md) and non-reasoning models for the LLM-judge depending on the evaluators:
 
-| Evaluators | Reasoning Models as Judge (ex: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (ex: gpt-4.1, gpt-4o, etc.) | To enable |
+| Evaluators | Reasoning Models as Judge (example: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (example: gpt-4.1, gpt-4o, etc.) | To enable |
 |------------|-----------------------------------------------------------------------------|-------------------------------------------------------------|-------|
-| `Intent Resolution` / `Task Adherence` / `Tool Call Accuracy` / `Response Completeness`) | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
+| `Intent Resolution`, `Task Adherence`, `Tool Call Accuracy`, `Response Completeness` | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
 | Other quality evaluators| Not Supported | Supported | -- |
 
 For complex evaluation that requires refined reasoning, we recommend a strong reasoning model like `o3-mini` and o-series mini models released afterwards with a balance of reasoning performance and cost efficiency.

articles/ai-foundry/concepts/evaluation-evaluators/rag-evaluators.md

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ We support AzureOpenAI or OpenAI [reasoning models](../../../ai-services/openai/
 
 | Evaluators | Reasoning Models as Judge (example: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (example: gpt-4.1, gpt-4o, etc.) | To enable |
 |--|--|--|--|
-| `Intent Resolution` / `Task Adherence` / `Tool Call Accuracy` / `Response Completeness` | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
+| `Intent Resolution`, `Task Adherence`, `Tool Call Accuracy`, `Response Completeness` | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
 | Other quality evaluators| Not Supported | Supported | -- |
 
 For complex evaluation that requires refined reasoning, we recommend a strong reasoning model like `o3-mini` and o-series mini models released afterwards with a balance of reasoning performance and cost efficiency.

articles/ai-foundry/how-to/develop/agent-evaluate-sdk.md

Lines changed: 3 additions & 3 deletions
@@ -170,9 +170,9 @@ converted_data = converter.convert(thread_id, run_id)
 
 And that's it! `converted_data` contains all inputs required for [these evaluators](#evaluators-supported-for-evaluation-data-converter). You don't need to read the input requirements for each evaluator and do any work to parse the inputs. All you need to do is select your evaluator and call the evaluator on this single run. We support AzureOpenAI or OpenAI [reasoning models](../../../ai-services/openai/how-to/reasoning.md) and non-reasoning models for the judge depending on the evaluators:
 
-| Evaluators | Reasoning Models as Judge (ex: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (ex: gpt-4.1, gpt-4o, etc.) | To enable |
-|------------|-----------------------------------------------------------------------------|-------------------------------------------------------------|-------|
-| `Intent Resolution` / `Task Adherence` / `Tool Call Accuracy` / `Response Completeness`) | Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
+| Evaluators | Reasoning Models as Judge (example: o-series models from Azure OpenAI / OpenAI) | Non-reasoning models as Judge (example: gpt-4.1, gpt-4o, etc.) | To enable |
+|--|--|--|--|
+| `Intent Resolution`, `Task Adherence`, `Tool Call Accuracy`, `Response Completeness`| Supported | Supported | Set additional parameter `is_reasoning_model=True` in initializing evaluators |
 | Other quality evaluators| Not Supported | Supported | -- |
 
 For complex tasks that require refined reasoning for the evaluation, we recommend a strong reasoning model like `o3-mini` or the o-series mini models released afterwards with a balance of reasoning performance and cost efficiency.
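The hunk above describes the pattern where `converted_data` carries every input an agent evaluator needs, so a single call evaluates a single run. A toy sketch of that shape; the field names and stand-in evaluator below are illustrative assumptions, not the converter's actual output schema:

```python
# Illustrative sketch only: a converted run bundles every field the evaluator
# needs, so it can be passed through without manual input parsing.
# The keys and toy evaluator are hypothetical, not the real converter schema.
converted_data = {
    "query": "Book a flight to Paris",
    "response": "I booked flight AF123 to Paris for you.",
    "tool_calls": [{"name": "book_flight", "arguments": {"destination": "Paris"}}],
}

def evaluate_run(evaluator, data: dict) -> dict:
    # One call per run: unpack the converted data straight into the evaluator.
    return evaluator(**data)

# Toy stand-in for an agent evaluator such as Tool Call Accuracy:
def toy_tool_call_check(query: str, response: str, tool_calls: list) -> dict:
    return {"made_tool_calls": len(tool_calls) > 0}

print(evaluate_run(toy_tool_call_check, converted_data))  # {'made_tool_calls': True}
```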

articles/ai-foundry/how-to/develop/evaluate-sdk.md

Lines changed: 4 additions & 0 deletions
@@ -52,6 +52,7 @@ Built-in quality and safety metrics take in query and response pairs, along with
 
 Built-in evaluators can accept query and response pairs, a list of conversations in JSON Lines (JSONL) format, or both.
 
+
 **Quality Evaluators:**
 
 | Evaluator | Conversation & single-turn support for text | Conversation & single-turn support for text and image | Single-turn support for text only | Requires `ground_truth` | Supports [agent inputs](./agent-evaluate-sdk.md#agent-messages) |
@@ -69,6 +70,7 @@ Built-in evaluators can accept query and response pairs, a list of conversations
 | `ResponseCompletenessEvaluator` || ||| |
 | `QAEvaluator` | | ||| |
 
+
 **Natural Language Processing (NLP) Evaluators:**
 
 | Evaluator | Conversation & single-turn support for text | Conversation & single-turn support for text and image | Single-turn support for text only | Requires `ground_truth` | Supports [agent inputs](./agent-evaluate-sdk.md#agent-messages) |
@@ -80,6 +82,7 @@ Built-in evaluators can accept query and response pairs, a list of conversations
 | `BleuScoreEvaluator` | | ||| |
 | `MeteorScoreEvaluator` | | ||| |
 
+
 **Safety Evaluators:**
 
 | Evaluator | Conversation & single-turn support for text | Conversation & single-turn support for text and image | Single-turn support for text only | Requires `ground_truth` | Supports [agent inputs](./agent-evaluate-sdk.md#agent-messages) |
@@ -94,6 +97,7 @@ Built-in evaluators can accept query and response pairs, a list of conversations
 | `CodeVulnerabilityEvaluator` | | || ||
 | `IndirectAttackEvaluator` || | | ||
 
+
 **Azure OpenAI Graders:**
 
 | Evaluator | Conversation & single-turn support for text | Conversation & single-turn support for text and image | Single-turn support for text only | Requires `ground_truth` | Supports [agent inputs](./agent-evaluate-sdk.md#agent-messages) |
