Commit b2e62a9

Merge pull request #5241 from lgayhardt/githubaction0525update
Github action eval project update
2 parents 1b8d135 + a061379 commit b2e62a9

1 file changed: +57 -4 lines changed

articles/ai-foundry/how-to/evaluation-github-action.md

Lines changed: 57 additions & 4 deletions
@@ -5,7 +5,7 @@ description: How to run evaluation in GitHub Action to streamline the evaluation
 manager: scottpolly
 ms.service: azure-ai-foundry
 ms.topic: how-to
-ms.date: 05/19/2025
+ms.date: 06/1/2025
 ms.reviewer: hanch
 ms.author: lagayhar
 author: lgayhardt
@@ -27,7 +27,7 @@ Offline evaluation involves testing AI models and agents using test datasets to

 ## Prerequisites

-[!INCLUDE [hub-only-prereq](../includes/hub-only-prereq.md)]
+Foundry project or Hubs based project. To learn more, see [Create a project](create-projects.md).

 Two GitHub Actions are available for evaluating AI applications: **ai-agent-evals** and **genai-evals**.

@@ -45,8 +45,16 @@ The input of ai-agent-evals includes:

 **Required:**

-- `azure-aiproject-connection-string`: The connection string for the Azure AI project. This is used to connect to Azure OpenAI to simulate conversations with each agent, and to connect to the Azure AI evaluation SDK to perform the evaluation.
-- `deployment-name`: the deployed model name.
+# [Foundry project](#tab/foundry-project)
+
+- `azure-ai-project-endpoint`: The endpoint of the Azure AI project. This is used to connect to your AI project to simulate conversations with each agent, and to connect to the Azure AI evaluation SDK to perform the evaluation.
+
+# [Hub based project](#tab/hub-project)
+
+- `azure-aiproject-connection-string`: The connection string of the Azure AI project. This is used to connect to your AI project to simulate conversations with each agent, and to connect to the Azure AI evaluation SDK to perform the evaluation.
+
+---
+- `deployment-name`: the deployed model name for evaluation judgement.
 - `data-path`: Path to the input data file containing the conversation starters. Each conversation starter is sent to each agent for a pairwise comparison of evaluation results.
 - `evaluators`: built-in evaluator names.
 - `data`: a set of conversation starters/queries.
@@ -55,6 +63,7 @@ The input of ai-agent-evals includes:
 - When only one `agent-id` is specified, the evaluation results include the absolute values for each metric along with the corresponding confidence intervals.
 - When multiple `agent-ids` are specified, the results include absolute values for each agent and a statistical comparison against the designated baseline agent ID.

+
 **Optional:**

 - `api-version`: the API version of deployed model.
@@ -92,6 +101,48 @@ To use the GitHub Action, add the GitHub Action to your CI/CD workflows and spec

 This example illustrates how Azure Agent AI Evaluation can be run when comparing different agents with agent IDs.

+# [Foundry project](#tab/foundry-project)
+
+```YAML
+name: "AI Agent Evaluation"
+
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - main
+
+permissions:
+  id-token: write
+  contents: read
+
+jobs:
+  run-action:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Azure login using Federated Credentials
+        uses: azure/login@v2
+        with:
+          client-id: ${{ vars.AZURE_CLIENT_ID }}
+          tenant-id: ${{ vars.AZURE_TENANT_ID }}
+          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
+
+      - name: Run Evaluation
+        uses: microsoft/ai-agent-evals@v2-beta
+        with:
+          # Replace placeholders with values for your Azure AI Project
+          azure-ai-project-endpoint: "<your-ai-project-endpoint>"
+          deployment-name: "<your-deployment-name>"
+          agent-ids: "<your-ai-agent-ids>"
+          data-path: ${{ github.workspace }}/path/to/your/data-file
+
+```
+
+# [Hub based project](#tab/hub-project)
+
 ```YAML
 name: "AI Agent Evaluation"

@@ -130,6 +181,8 @@ jobs:

 ```

+---
+
 ### AI agent evaluations output

 Evaluation results are outputted to the summary section for each AI evaluation GitHub Action run under Actions in GitHub.com.
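
For reference, the two `agent-ids` cases described in the inputs list (one agent versus several) can be sketched as variants of the `Run Evaluation` step from the Foundry project workflow added in this commit. This is a minimal sketch and not part of the commit. The placeholder agent IDs are hypothetical, and the comma-separated form for multiple IDs is an assumption that should be checked against the action's own documentation.

```yaml
# Sketch only: two variants of the "Run Evaluation" step, shown as a
# fragment of a job's `steps:` list. Input names are taken from the diff
# above; placeholder values are hypothetical.
steps:
  # Variant 1: a single agent ID. Per the article text, results then
  # contain absolute metric values with confidence intervals.
  - name: Run Evaluation (single agent)
    uses: microsoft/ai-agent-evals@v2-beta
    with:
      azure-ai-project-endpoint: "<your-ai-project-endpoint>"
      deployment-name: "<your-deployment-name>"
      agent-ids: "<agent-id-1>"
      data-path: ${{ github.workspace }}/path/to/your/data-file

  # Variant 2: several agent IDs. Per the article text, results then
  # contain per-agent values plus a statistical comparison against the
  # baseline agent.
  - name: Run Evaluation (multiple agents)
    uses: microsoft/ai-agent-evals@v2-beta
    with:
      azure-ai-project-endpoint: "<your-ai-project-endpoint>"
      deployment-name: "<your-deployment-name>"
      # ASSUMPTION: multiple IDs are passed as one comma-separated string.
      agent-ids: "<agent-id-1>,<agent-id-2>"
      data-path: ${{ github.workspace }}/path/to/your/data-file
```

In a real workflow only one of the two steps would normally be kept, placed after the checkout and Azure login steps shown in the diff.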
