Commit dd9bd57

Merge pull request #7173 from lgayhardt/evalcicd0925
Eval Github Action and ADO tool and groundness supported
2 parents 9952f68 + 81e6d89 commit dd9bd57

3 files changed, 5 additions and 5 deletions

articles/ai-foundry/how-to/evaluation-azure-devops.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ titleSuffix: Azure AI Foundry
 description: How to run evaluation in Azure DevOps which enables offline evaluation of AI models within your CI/CD pipelines in Azure DevOps.
 ms.service: azure-ai-foundry
 ms.topic: how-to
-ms.date: 07/25/2025
+ms.date: 09/19/2025
 ms.reviewer: hanch
 ms.author: lagayhar
 author: lgayhardt

articles/ai-foundry/how-to/evaluation-github-action.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ titleSuffix: Azure AI Foundry
 description: How to run evaluation in GitHub Action to streamline the evaluation process, allowing you to assess model performance and make informed decisions before deploying to production.
 ms.service: azure-ai-foundry
 ms.topic: how-to
-ms.date: 08/18/2025
+ms.date: 09/19/2025
 ms.reviewer: hanch
 ms.author: lagayhar
 author: lgayhardt

articles/ai-foundry/includes/evaluation-github-action-azure-devops-features.md

Lines changed: 3 additions & 3 deletions
@@ -4,7 +4,7 @@ description: Include file
 author: lgayhardt
 ms.service: azure-ai-foundry
 ms.topic: include
-ms.date: 5/08/2025
+ms.date: 9/19/2025
 ms.author: lagayhar
 ms.custom: include file
 ---
@@ -28,7 +28,7 @@ ms.custom: include file
 | Textual similarity | [GleuScoreEvaluator](../concepts/evaluation-evaluators/textual-similarity-evaluators.md#gleu-score) | Not Supported | Supported |
 | Textual similarity | [BleuScoreEvaluator](../concepts/evaluation-evaluators/textual-similarity-evaluators.md#bleu-score) | Not Supported | Supported |
 | Textual similarity | [MeteorScoreEvaluator](../concepts/evaluation-evaluators/textual-similarity-evaluators.md#meteor-score) | Not Supported | Supported |
-| Retrieval-augmented Generation (RAG) (AI-assisted) | [GroundednessEvaluator](../concepts/evaluation-evaluators/rag-evaluators.md#groundedness) | Not Supported | Supported |
+| Retrieval-augmented Generation (RAG) (AI-assisted) | [GroundednessEvaluator](../concepts/evaluation-evaluators/rag-evaluators.md#groundedness) | Supported | Supported |
 | Retrieval-augmented Generation (RAG) (AI-assisted) | [GroundednessProEvaluator](../concepts/evaluation-evaluators/rag-evaluators.md#groundedness-pro) | Not Supported | Supported |
 | Retrieval-augmented Generation (RAG) (AI-assisted) | [RetrievalEvaluator](../concepts/evaluation-evaluators/rag-evaluators.md#relevance) | Not Supported | Supported |
 | Retrieval-augmented Generation (RAG) (AI-assisted) | [RelevanceEvaluator](../concepts/evaluation-evaluators/rag-evaluators.md#retrieval) | Supported | Supported |
@@ -45,7 +45,7 @@ ms.custom: include file
 | Risk and safety (AI-assisted) | [ContentSafetyEvaluator](../concepts/evaluation-evaluators/risk-safety-evaluators.md#content-safety-composite-evaluator) | Supported | Supported |
 | Agent (AI-assisted) | [IntentResolutionEvaluator](../concepts/evaluation-evaluators/agent-evaluators.md#intent-resolution) | Supported | Supported |
 | Agent (AI-assisted) | [TaskAdherenceEvaluator](../concepts/evaluation-evaluators/agent-evaluators.md#task-adherence) | Supported | Supported |
-| Agent (AI-assisted) | [ToolCallAccuracyEvaluator](../concepts/evaluation-evaluators/agent-evaluators.md#tool-call-accuracy) | Not Supported | Not Supported |
+| Agent (AI-assisted) | [ToolCallAccuracyEvaluator](../concepts/evaluation-evaluators/agent-evaluators.md#tool-call-accuracy) | Supported | Supported |
 | Composite | `AgentOverallEvaluator` | Not Supported | Not Supported |
 | Operational metrics | Client run duration | Supported | Not Supported |
 | Operational metrics | Server run duration | Supported | Not Supported |
