Commit 7be53be

Github action eval project update
1 parent 8df1b1d commit 7be53be

1 file changed: +23 -4 lines changed

articles/ai-foundry/how-to/evaluation-github-action.md

Lines changed: 23 additions & 4 deletions
@@ -27,7 +27,7 @@ Offline evaluation involves testing AI models and agents using test datasets to
 
 ## Prerequisites
 
-[!INCLUDE [hub-only-prereq](../includes/hub-only-prereq.md)]
+Foundry project or Hubs based project. To learn more, see [Create a project](create-projects.md).
 
 Two GitHub Actions are available for evaluating AI applications: **ai-agent-evals** and **genai-evals**.
 

@@ -41,12 +41,20 @@ Two GitHub Actions are available for evaluating AI applications: **ai-agent-eval
 
 ### AI agent evaluations input
 
-The input of ai-agent-evals includes:
+The required inputs of ai-agent-evals include:
 
 **Required:**
 
-- `azure-aiproject-connection-string`: The connection string for the Azure AI project. This is used to connect to Azure OpenAI to simulate conversations with each agent, and to connect to the Azure AI evaluation SDK to perform the evaluation.
-- `deployment-name`: the deployed model name.
+# [Foundry project](#tab/foundry-project)
+
+- `azure-ai-project-endpoint`: The endpoint of the Azure AI project. This is used to connect to Azure OpenAI to simulate conversations with each agent, and to connect to the Azure AI evaluation SDK to perform the evaluation.
+
+# [Hub based project](#tab/hub-project)
+
+- `azure-aiproject-connection-string`: The connection string of the Azure AI project. This is used to connect to Azure OpenAI to simulate conversations with each agent, and to connect to the Azure AI evaluation SDK to perform the evaluation.
+
+---
+- `deployment-name`: the deployed model name for evaluation judgement.
 - `data-path`: Path to the input data file containing the conversation starters. Each conversation starter is sent to each agent for a pairwise comparison of evaluation results.
 - `evaluators`: built-in evaluator names.
 - `data`: a set of conversation starters/queries.
@@ -55,6 +63,7 @@ The input of ai-agent-evals includes:
 - When only one `agent-id` is specified, the evaluation results include the absolute values for each metric along with the corresponding confidence intervals.
 - When multiple `agent-ids` are specified, the results include absolute values for each agent and a statistical comparison against the designated baseline agent ID.
 
+
 **Optional:**
 
 - `api-version`: the API version of deployed model.
@@ -87,6 +96,16 @@ Here's a sample of the dataset:
 
 To use the GitHub Action, add the GitHub Action to your CI/CD workflows and specify the trigger criteria (for example, on commit) and file paths to trigger your automated workflows.
 
+# [Foundry project](#tab/foundry-project)
+
+Specify v2-beta.
+
+# [Hub based project](#tab/hub-project)
+
+Specify v1-beta.
+
+---
+
 > [!TIP]
 > To minimize costs, you should avoid running evaluation on every commit.
 
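
Note on the change: the two tab groups added in this commit pair each project type with a different connection input and a different action version. The Python sketch below only restates that pairing. The helper name `build_action_inputs` and the shape of the returned dictionary are hypothetical and not part of the action. The input names and version tags are copied from the diff, and any required inputs outside the visible hunks are omitted.

```python
from typing import Dict

# Pairings copied from the tabs added in this commit.
# The table and the helper below are illustrative only (hypothetical names).
_PROJECT_SETTINGS = {
    "foundry": {
        "version": "v2-beta",
        "connection_input": "azure-ai-project-endpoint",
    },
    "hub": {
        "version": "v1-beta",
        "connection_input": "azure-aiproject-connection-string",
    },
}


def build_action_inputs(
    project_type: str,
    connection_value: str,
    deployment_name: str,
    data_path: str,
) -> Dict[str, object]:
    """Return the version tag and a subset of inputs for one project type."""
    settings = _PROJECT_SETTINGS.get(project_type)
    if settings is None:
        raise ValueError(f"unknown project type: {project_type!r}")
    return {
        "version": settings["version"],
        "with": {
            # Endpoint for Foundry projects, connection string for hub projects.
            settings["connection_input"]: connection_value,
            "deployment-name": deployment_name,
            "data-path": data_path,
        },
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own project details.
    print(build_action_inputs("foundry", "<project-endpoint>", "<deployment>", "<data-file>"))
```

Passing `"hub"` instead of `"foundry"` switches both the version tag and the connection input name, which is the distinction the new tabs draw.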
