Commit e9c44be

Merge pull request #6170 from MicrosoftDocs/main
Auto Publish – main to live - 2025-07-22 22:09 UTC
2 parents cd3bbe8 + 864d566 commit e9c44be

39 files changed, +583 -450 lines changed

articles/ai-foundry/concepts/evaluation-evaluators/agent-evaluators.md

Lines changed: 33 additions & 15 deletions
@@ -47,7 +47,7 @@ load_dotenv()

 model_config = AzureOpenAIModelConfiguration(
     azure_endpoint=os.environ["AZURE_ENDPOINT"],
-    api_key=os.environ.get["AZURE_API_KEY"],
+    api_key=os.environ.get("AZURE_API_KEY"),
     azure_deployment=os.environ.get("AZURE_DEPLOYMENT_NAME"),
     api_version=os.environ.get("AZURE_API_VERSION"),
 )
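
The bracket-to-parenthesis fix in this hunk matters because `os.environ.get` is a bound method, so subscripting it raises a `TypeError` before any lookup happens. A minimal sketch of the difference follows; the `DEMO_*` variable names are placeholders chosen for this illustration and do not appear in the sample.

```python
import os

# Placeholder variable names, used only for this demonstration.
os.environ["DEMO_AZURE_API_KEY"] = "not-a-real-key"
os.environ.pop("DEMO_VARIABLE_THAT_IS_NOT_SET", None)

# Before the fix: square brackets try to subscript the method object itself.
try:
    os.environ.get["DEMO_AZURE_API_KEY"]
except TypeError as exc:
    subscript_error = type(exc).__name__
else:
    subscript_error = None

# After the fix: parentheses call the method, which returns the value,
# or None when the variable is not set.
api_key = os.environ.get("DEMO_AZURE_API_KEY")
missing = os.environ.get("DEMO_VARIABLE_THAT_IS_NOT_SET")

print(subscript_error, api_key, missing)
```

Note that `os.environ["NAME"]`, as used for `AZURE_ENDPOINT` in the sample, raises `KeyError` for an unset variable, whereas `os.environ.get("NAME")` returns `None`.
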
@@ -83,7 +83,7 @@ intent_resolution(

 ### Intent resolution output

-The numerical score on a Likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason and additional fields can help you understand why the score is high or low.
+The numerical score is on a Likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason and additional fields can help you understand why the score is high or low.
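
The pass/fail rule stated in this paragraph can be sketched as below. `likert_result` is a hypothetical helper written for illustration, not an SDK function; it assumes an integer score from 1 to 5 and the default threshold of 3 mentioned in the text.

```python
def likert_result(score: int, threshold: int = 3) -> str:
    """Return "pass" when score >= threshold, otherwise "fail"."""
    if not 1 <= score <= 5:
        raise ValueError(f"expected a Likert score from 1 to 5, got {score}")
    return "pass" if score >= threshold else "fail"


# Evaluate every possible score against the default threshold.
results = {score: likert_result(score) for score in range(1, 6)}
print(results)
```

Raising the threshold tightens the check: with `threshold=4`, a score of 3 would be reported as "fail".
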

 ```python
 {
@@ -104,14 +104,17 @@ The numerical score on a Likert scale (integer 1 to 5) and a higher score is bet

 ```

-If you're building agents outside of Azure AI Agent Serice, this evaluator accepts a schema typical for agent messages. To learn more, see our sample notebook for [Intent Resolution](https://aka.ms/intentresolution-sample).
+If you're building agents outside of Azure AI Agent Service, this evaluator accepts a schema typical for agent messages. To learn more, see our sample notebook for [Intent Resolution](https://aka.ms/intentresolution-sample).

 ## Tool call accuracy

-`ToolCallAccuracyEvaluator` measures an agent's ability to select appropriate tools, extract, and process correct parameters from previous steps of the agentic workflow. It detects whether each tool call made is accurate (binary) and reports back the average scores, which can be interpreted as a passing rate across tool calls made.
+`ToolCallAccuracyEvaluator` measures the accuracy and efficiency of tool calls made by an agent in a run. It provides a 1-5 score based on:
+- the relevance and helpfulness of the tool invoked;
+- the correctness of parameters used in tool calls;
+- the counts of missing or excessive calls.

 > [!NOTE]
-> `ToolCallAccuracyEvaluator` only supports Azure AI Agent's Function Tool evaluation, but doesn't support Built-in Tool evaluation. The agent messages must have at least one Function Tool actually called to be evaluated.
+> `ToolCallAccuracyEvaluator` only supports Azure AI Agent's Function Tool evaluation, but doesn't support Built-in Tool evaluation. The agent run must have at least one Function Tool call and no Built-in Tool calls made to be evaluated.

 ### Tool call accuracy example

@@ -150,20 +153,35 @@ tool_call_accuracy(

 ### Tool call accuracy output

-The numerical score (passing rate of correct tool calls) is 0-1 and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason and tool call detail fields can help you understand why the score is high or low.
+The numerical score is on a Likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason and tool call detail fields can help you understand why the score is high or low.

 ```python
 {
-    "tool_call_accuracy": 1.0,
+    "tool_call_accuracy": 5,
     "tool_call_accuracy_result": "pass",
-    "tool_call_accuracy_threshold": 0.8,
-    "per_tool_call_details": [
-        {
-            "tool_call_accurate": True,
-            "tool_call_accurate_reason": "The input Data should get a Score of 1 because the TOOL CALL is directly relevant to the user's question about the weather in Seattle, includes appropriate parameters that match the TOOL DEFINITION, and the parameter values are correct and relevant to the user's query.",
-            "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ"
+    "tool_call_accuracy_threshold": 3,
+    "details": {
+        "tool_calls_made_by_agent": 1,
+        "correct_tool_calls_made_by_agent": 1,
+        "per_tool_call_details": [
+            {
+                "tool_name": "fetch_weather",
+                "total_calls_required": 1,
+                "correct_calls_made_by_agent": 1,
+                "correct_tool_percentage": 1.0,
+                "tool_call_errors": 0,
+                "tool_success_result": "pass"
+            }
+        ],
+        "excess_tool_calls": {
+            "total": 0,
+            "details": []
+        },
+        "missing_tool_calls": {
+            "total": 0,
+            "details": []
         }
-    ]
+    }
 }
 ```
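
To show how the restructured output might be consumed, here is a minimal sketch that summarizes a result shaped like the new sample above. The dictionary is copied from that sample; `summarize_tool_calls` is a hypothetical helper, not an SDK function, and it assumes every field shown in the sample is present.

```python
# Result copied from the new sample output in the hunk above.
result = {
    "tool_call_accuracy": 5,
    "tool_call_accuracy_result": "pass",
    "tool_call_accuracy_threshold": 3,
    "details": {
        "tool_calls_made_by_agent": 1,
        "correct_tool_calls_made_by_agent": 1,
        "per_tool_call_details": [
            {
                "tool_name": "fetch_weather",
                "total_calls_required": 1,
                "correct_calls_made_by_agent": 1,
                "correct_tool_percentage": 1.0,
                "tool_call_errors": 0,
                "tool_success_result": "pass",
            }
        ],
        "excess_tool_calls": {"total": 0, "details": []},
        "missing_tool_calls": {"total": 0, "details": []},
    },
}


def summarize_tool_calls(result: dict) -> dict:
    """Condense the nested details into a flat summary (hypothetical helper)."""
    details = result["details"]
    made = details["tool_calls_made_by_agent"]
    correct = details["correct_tool_calls_made_by_agent"]
    # Collect the names of tools whose per-tool verdict is not "pass".
    failing_tools = [
        call["tool_name"]
        for call in details["per_tool_call_details"]
        if call["tool_success_result"] != "pass"
    ]
    return {
        "passed": result["tool_call_accuracy_result"] == "pass",
        "correct_ratio": correct / made if made else 0.0,
        "failing_tools": failing_tools,
        "excess_calls": details["excess_tool_calls"]["total"],
        "missing_calls": details["missing_tool_calls"]["total"],
    }


summary = summarize_tool_calls(result)
print(summary)
```

Code written against the old output shape, which read a top-level `per_tool_call_details` list with `tool_call_accurate` flags, would need to be updated to read from `details` instead.
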

@@ -187,7 +205,7 @@ task_adherence(

 ### Task adherence output

-The numerical score on a Likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score is on a Likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {

articles/ai-foundry/concepts/resource-types.md

Lines changed: 4 additions & 4 deletions
@@ -7,7 +7,7 @@ ms.reviewer: deeikele
 manager: scottpolly
 author: sgilley
 ms.author: sgilley
-ms.date: 05/18/2025
+ms.date: 07/22/2025
 ms.service: azure-ai-foundry
 ms.topic: concept-article
 ms.custom:
@@ -19,15 +19,15 @@ ms.custom:

 An Azure resource is required to use and manage services in Azure. It defines the scope for configuring, securing, and monitoring the tools or capabilities you want to use—like AI models, agents, or storage.

-AI Foundry Portal and SDK clients support multiple distinct Azure resource types, each designed to serve different development and operational needs. This article explains which use case requires which type.
+Azure AI Foundry portal and SDK clients support multiple distinct Azure resource types, each designed to serve different development and operational needs. This article explains which use case requires which type.

 ## Resource Types supported with AI Foundry

-* **Azure AI Foundry** – The primary resource type for designing, deploying, and managing generative AI applications and agents. It provides access to agent service, models that are hosted using a serverless hosting model, evaluations, and Azure OpenAI service. This is the recommended resource type for most applications built in Azure AI Foundry.
+* **Azure AI Foundry** – The primary resource type for designing, deploying, and managing generative AI applications and agents. It provides access to agent service, models that are hosted using a serverless hosting model, evaluations, and Azure OpenAI service. Azure AI Foundry is the recommended resource type for most applications built in Azure AI Foundry.

 Get started by [creating a first AI Foundry resource](../../ai-services/multi-service-resource.md?context=/azure/ai-foundry/context/context).

-* **Azure AI Hub** – Use this resource type in combination with Azure AI Foundry to additionally access open-source model hosting and fine-tuning capabilities, as well as Azure Machine Learning capabilities. When you create an AI Hub, an Azure AI Foundry resource is automatically provisioned. Hub resources can be used in both AI Foundry Portal and Machine Learning Studio.
+* **Azure AI Hub** – Use this resource type in combination with Azure AI Foundry to additionally access open-source model hosting and fine-tuning capabilities, as well as Azure Machine Learning capabilities. When you create an AI Hub, an Azure AI Foundry resource is automatically provisioned. Hub resources can be used in both Azure AI Foundry portal and Machine Learning Studio.

 * **Azure AI Search** – A resource used to index and retrieve data for grounding AI applications. It can be [connected](../how-to/connections-add.md) to Azure AI Foundry agents to enable retrieval-augmented generation (RAG) and semantic search experiences.

articles/ai-foundry/how-to/create-resource-template.md

Lines changed: 3 additions & 5 deletions
@@ -6,7 +6,7 @@ ms.author: sgilley
 author: sdgilley
 manager: scottpolly
 reviewer: deeikele
-ms.date: 05/18/2025
+ms.date: 07/22/2025
 ms.service: azure-ai-foundry
 ms.topic: quickstart-bicep
 ms.custom:
@@ -18,8 +18,6 @@ ms.custom:

 # Quickstart: Create an Azure AI Foundry resource using a Bicep file

-[!INCLUDE [hub-only-alt](../includes/uses-hub-only-alt.md)]
-
 Use a [Microsoft Bicep](/azure/azure-resource-manager/bicep/overview) file (template) to create an [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs) resource. A template makes it easy to create resources as a single, coordinated operation. A Bicep file is a text document that defines the resources that are needed for a deployment. It might also specify deployment parameters. Parameters are used to provide input values when using the file to deploy resources.

 ## Prerequisites
@@ -64,14 +62,14 @@ Deploy the Bicep file using either the Azure CLI or Azure PowerShell.

 ```azurecli
 az group create --name exampleRG --location eastus
-az deployment group create --resource-group exampleRG --template-file main.bicep --parameters aiServicesName=myai aiProjectName=myai-proj
+az deployment group create --resource-group exampleRG --template-file main.bicep --parameters aiFoundryName=myai aiProjectName=myai-proj
 ```

 # [Azure PowerShell](#tab/powershell)

 ```azurepowershell
 New-AzResourceGroup -Name exampleRG -Location eastus
-New-AzResourceGroupDeployment -ResourceGroupName exampleRG -TemplateFile main.bicep -aiHubName myai -aiProjectName myai-proj
+New-AzResourceGroupDeployment -ResourceGroupName exampleRG -TemplateFile main.bicep -aiFoundryName myai -aiProjectName myai-proj
 ```

 ---
