
Commit 75f819b

Merge branch 'release-build-ai-foundry' into agent-metric-updates

2 parents: 4c2f86e + bbf8fc6

File tree

135 files changed: +1813 −1675 lines


articles/ai-foundry/.openpublishing.redirection.ai-studio.json

Lines changed: 15 additions & 0 deletions
```diff
@@ -1152,6 +1152,21 @@
       "source_path_from_root": "/articles/ai-foundry/concepts/evaluation-metrics-built-in.md",
       "redirect_url": "/azure/ai-foundry/concepts/observability",
       "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/ai-foundry/concepts/trace.md",
+      "redirect_url": "/azure/ai-foundry/how-to/develop/trace-application",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/ai-foundry/how-to/develop/trace-local-sdk.md",
+      "redirect_url": "/azure/ai-foundry/how-to/develop/trace-application",
+      "redirect_document_id": true
+    },
+    {
+      "source_path_from_root": "/articles/ai-foundry/how-to/develop/visualize-traces.md",
+      "redirect_url": "/azure/ai-foundry/how-to/develop/trace-application#visualize-your-traces",
+      "redirect_document_id": false
     }
   ]
 }
```

articles/ai-foundry/at-foundry/ask-at-foundry.md

Lines changed: 0 additions & 66 deletions
This file was deleted.

articles/ai-foundry/concepts/content-filtering.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -20,7 +20,7 @@ author: PatrickFarley
 [Azure AI Foundry](https://ai.azure.com) includes a content filtering system that works alongside core models and image generation models.
 
 > [!IMPORTANT]
-> The content filtering system isn't applied to prompts and completions processed by the Whisper model in Azure OpenAI Service. Learn more about the [Whisper model in Azure OpenAI](../../ai-services/openai/concepts/models.md).
+> The content filtering system isn't applied to prompts and completions processed by the Whisper model in Azure OpenAI in Azure AI Foundry Models. Learn more about the [Whisper model in Azure OpenAI](../../ai-services/openai/concepts/models.md).
 
 ## How it works
 
```

articles/ai-foundry/concepts/deployments-overview.md

Lines changed: 8 additions & 8 deletions
```diff
@@ -19,15 +19,15 @@ The model catalog in Azure AI Foundry portal is the hub to discover and use a wi
 
 Deployment options vary depending on the model offering:
 
-* **Azure OpenAI models:** The latest OpenAI models that have enterprise features from Azure with flexible billing options.
-* **Models-as-a-Service models:** These models don't require compute quota from your subscription and are billed per token in a pay-as-you-go fashion.
+* **Azure OpenAI in Azure AI Foundry Models:** The latest OpenAI models that have enterprise features from Azure with flexible billing options.
+* **Standard deployment:** These models don't require compute quota from your subscription and are billed per token in a pay-as-you-go fashion.
 * **Open and custom models:** The model catalog offers access to a large variety of models across modalities, including models of open access. You can host open models in your own subscription with a managed infrastructure, virtual machines, and the number of instances for capacity management.
 
 Azure AI Foundry offers four different deployment options:
 
-|Name | Azure OpenAI service | Azure AI model inference | Serverless API | Managed compute |
+|Name | Azure OpenAI | Azure AI model inference | Standard deployment | Managed compute |
 |-------------------------------|----------------------|-------------------|----------------|-----------------|
-| Which models can be deployed? | [Azure OpenAI models](../../ai-services/openai/concepts/models.md) | [Azure OpenAI models and Models-as-a-Service](../../ai-foundry/model-inference/concepts/models.md) | [Models-as-a-Service](../how-to/model-catalog-overview.md#content-safety-for-models-deployed-via-serverless-apis) | [Open and custom models](../how-to/model-catalog-overview.md#availability-of-models-for-deployment-as-managed-compute) |
+| Which models can be deployed? | [Azure OpenAI models](../../ai-services/openai/concepts/models.md) | [Azure OpenAI models and Standard deployment](../../ai-foundry/model-inference/concepts/models.md) | [Standard deployment](../how-to/model-catalog-overview.md#content-safety-for-models-deployed-via-serverless-apis) | [Open and custom models](../how-to/model-catalog-overview.md#availability-of-models-for-deployment-as-managed-compute) |
 | Deployment resource | Azure OpenAI resource | Azure AI services resource | AI project resource | AI project resource |
 | Requires Hubs/Projects | No | No | Yes | Yes |
 | Data processing options | Regional <br /> Data-zone <br /> Global | Global | Regional | Regional |
@@ -37,7 +37,7 @@ Azure AI Foundry offers four different deployment options:
 | Key-less authentication | Yes | Yes | No | No |
 | Best suited when | You're planning to use only OpenAI models | You're planning to take advantage of the flagship models in Azure AI catalog, including OpenAI. | You're planning to use a single model from a specific provider (excluding OpenAI). | If you plan to use open models and you have enough compute quota available in your subscription. |
 | Billing bases | Token usage & [provisioned throughput units](../../ai-services/openai/concepts/provisioned-throughput.md) | Token usage | Token usage<sup>1</sup> | Compute core hours<sup>2</sup> |
-| Deployment instructions | [Deploy to Azure OpenAI Service](../how-to/deploy-models-openai.md) | [Deploy to Azure AI model inference](../model-inference/how-to/create-model-deployments.md) | [Deploy to Serverless API](../how-to/deploy-models-serverless.md) | [Deploy to Managed compute](../how-to/deploy-models-managed.md) |
+| Deployment instructions | [Deploy to Azure OpenAI](../how-to/deploy-models-openai.md) | [Deploy to Azure AI model inference](../model-inference/how-to/create-model-deployments.md) | [Deploy to Standard deployment](../how-to/deploy-models-serverless.md) | [Deploy to Managed compute](../how-to/deploy-models-managed.md) |
 
 <sup>1</sup> A minimal endpoint infrastructure is billed per minute. You aren't billed for the infrastructure that hosts the model in pay-as-you-go. After you delete the endpoint, no further charges accrue.
 
@@ -54,11 +54,11 @@ Azure AI Foundry encourages you to explore various deployment options and choose
 
 * When you're looking to use a specific model:
 
-  * If you're interested in Azure OpenAI models, use the Azure OpenAI Service. This option is designed for Azure OpenAI models and offers a wide range of capabilities for them.
+  * If you're interested in Azure OpenAI models, use Azure OpenAI in Foundry Models. This option is designed for Azure OpenAI models and offers a wide range of capabilities for them.
 
-  * If you're interested in a particular model from Models-as-a-Service, and you don't expect to use any other type of model, use [Serverless API endpoints](../how-to/deploy-models-serverless.md). Serverless endpoints allow deployment of a single model under a unique set of endpoint URL and keys.
+  * If you're interested in a particular model from the serverless pay-per-token offer, and you don't expect to use any other type of model, use [Standard deployment](../how-to/deploy-models-serverless.md). Standard deployments allow deployment of a single model under a unique endpoint URL and set of keys.
 
-  * When your model isn't available in Models-as-a-Service and you have compute quota available in your subscription, use [Managed Compute](../how-to/deploy-models-managed.md), which supports deployment of open and custom models. It also allows a high level of customization of the deployment inference server, protocols, and detailed configuration.
+  * When your model isn't available in standard deployment and you have compute quota available in your subscription, use [Managed Compute](../how-to/deploy-models-managed.md), which supports deployment of open and custom models. It also allows a high level of customization of the deployment inference server, protocols, and detailed configuration.
 
 
 ## Related content
```
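As illustrative context for the options above (not part of this commit): a standard deployment exposes a single model behind a unique endpoint URL and key, which can be called with the `azure-ai-inference` package. A minimal sketch, with placeholder endpoint and key:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key for a standard (serverless, pay-per-token) deployment.
client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.<region>.models.ai.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

# Send a chat completion request to the deployed model.
response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="How many feet are in a mile?"),
    ]
)
print(response.choices[0].message.content)
```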

articles/ai-foundry/concepts/encryption-keys-portal.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -127,7 +127,7 @@ Customer-managed key encryption is configured via Azure portal in a similar way
 
 * The customer-managed key for encryption can only be updated to keys in the same Azure Key Vault instance.
 * After deployment, your [!INCLUDE [fdp](../includes/fdp-project-name.md)] can't switch from Microsoft-managed keys to customer-managed keys or vice versa.
-* Azure charges will continue to accrue during the soft delete retention period.
+* Azure charges for the AI Foundry resource continue to accrue during the soft delete retention period; charges for projects don't.
 
 ::: zone-end
 
```

articles/ai-foundry/concepts/evaluation-evaluators/azure-openai-graders.md

Lines changed: 7 additions & 5 deletions
````diff
@@ -11,7 +11,9 @@ ms.author: lagayhar
 author: lgayhardt
 ---
 
-# Azure OpenAI Graders
+# Azure OpenAI Graders (preview)
+
+[!INCLUDE [feature-preview](../../includes/feature-preview.md)]
 
 The Azure OpenAI Graders are a new set of evaluation graders available in the Azure AI Foundry SDK, aimed at evaluating the performance of AI models and their outputs. These graders, including [Label grader](#label-grader), [String checker](#string-checker), [Text similarity](#text-similarity), and [General grader](#general-grader), can be run locally or remotely. Each grader serves a specific purpose in assessing different aspects of AI models or their outputs.
 
@@ -209,17 +211,17 @@ The grader also returns a metric indicating the overall dataset pass rate.
 
 ## General grader
 
-Advanced users have the capability to import or define a custom grader and integrate it into the Azure OpenAI general grader. This allows for evaluations to be performed based on specific areas of interest aside from the existing Azure OpenAI graders. Following is an example to import the OpenAI `EvalStringCheckGrader` and construct it to be ran as an Azure OpenAI general grader on Foundry SDK.
+Advanced users can import or define a custom grader and integrate it into the AOAI general grader. This allows evaluations to target specific areas of interest beyond the existing AOAI graders. The following example imports the OpenAI `StringCheckGrader` and constructs it to run as an AOAI general grader in the Foundry SDK.
 
 ### Example
 
 ```python
-from openai.types.eval_string_check_grader import EvalStringCheckGrader
+from openai.types.graders import StringCheckGrader
 from azure.ai.evaluation import AzureOpenAIGrader
 
 # Define a string check grader config directly using the OAI SDK
 # Evaluation criteria: Pass if query column contains "Northwind"
-oai_string_check_grader = EvalStringCheckGrader(
+oai_string_check_grader = StringCheckGrader(
     input="{{item.query}}",
     name="contains hello",
     operation="like",
````
Lines changed: 150 additions & 0 deletions (new file)

@@ -0,0 +1,150 @@
---
title: Custom evaluators
titleSuffix: Azure AI Foundry
description: Learn how to create custom evaluators for your AI applications using code-based or prompt-based approaches.
manager: scottpolly
ms.service: azure-ai-foundry
ms.topic: reference
ms.date: 05/19/2025
ms.reviewer: mithigpe
ms.author: lagayhar
author: lgayhardt
---

# Custom evaluators

Built-in evaluators are great out of the box to start evaluating your application's generations. However, you might want to build your own code-based or prompt-based evaluator to cater to your specific evaluation needs.
## Code-based evaluators

Sometimes a large language model isn't needed for certain evaluation metrics. Code-based evaluators give you the flexibility to define metrics based on functions or a callable class. For example, you can build your own code-based evaluator by creating a simple Python class that calculates the length of an answer in `answer_length.py` under the directory `answer_len/`:

### Code-based evaluator example: Answer length

```python
class AnswerLengthEvaluator:
    def __init__(self):
        pass

    # A class is made callable by implementing the special method __call__
    def __call__(self, *, answer: str, **kwargs):
        return {"answer_length": len(answer)}
```
Then run the evaluator on a row of data by importing the callable class:

```python
from answer_len.answer_length import AnswerLengthEvaluator

answer_length_evaluator = AnswerLengthEvaluator()
answer_length = answer_length_evaluator(answer="What is the speed of light?")
```

### Code-based evaluator output: Answer length

```python
{"answer_length": 27}
```
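A custom code-based evaluator like this can also be plugged into batch evaluation with `evaluate` from the same SDK. A minimal sketch, assuming a local `data.jsonl` file (illustrative name) whose rows contain an `answer` column:

```python
from azure.ai.evaluation import evaluate
from answer_len.answer_length import AnswerLengthEvaluator

# data.jsonl is an assumed file: each line is a JSON object with an "answer" field.
result = evaluate(
    data="data.jsonl",
    evaluators={"answer_length": AnswerLengthEvaluator()},
)
print(result["metrics"])  # aggregate metrics across all rows
```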
## Prompt-based evaluators

To build your own prompt-based large language model evaluator or AI-assisted annotator, you can create a custom evaluator based on a **Prompty** file. Prompty is a file with the `.prompty` extension for developing prompt templates. The Prompty asset is a markdown file with a modified front matter. The front matter, in YAML format, contains many metadata fields that define the model configuration and expected inputs of the Prompty. Let's create a custom evaluator `FriendlinessEvaluator` to measure the friendliness of a response.
### Prompt-based evaluator example: Friendliness evaluator

First, create a `friendliness.prompty` file that describes the definition of the friendliness metric and its grading rubric:

```markdown
---
name: Friendliness Evaluator
description: Friendliness Evaluator to measure warmth and approachability of answers.
model:
  api: chat
  parameters:
    temperature: 0.1
    response_format: { "type": "json" }
inputs:
  response:
    type: string
outputs:
  score:
    type: int
  reason:
    type: string
---

system:
Friendliness assesses the warmth and approachability of the answer. Rate the friendliness of the response between one and five stars using the following scale:

One star: the answer is unfriendly or hostile

Two stars: the answer is mostly unfriendly

Three stars: the answer is neutral

Four stars: the answer is mostly friendly

Five stars: the answer is very friendly

Please assign a rating between 1 and 5 based on the tone and demeanor of the response.

**Example 1**
generated_query: I just don't feel like helping you! Your questions are getting very annoying.
output:
{"score": 1, "reason": "The response is not warm and is resistant to providing helpful information."}
**Example 2**
generated_query: I'm sorry this watch is not working for you. Very happy to assist you with a replacement.
output:
{"score": 5, "reason": "The response is warm and empathetic, offering a resolution with care."}

**Here is the actual conversation to be scored:**
generated_query: {{response}}
output:
```
Then create a class `FriendlinessEvaluator` to load the Prompty file and process the output in JSON format:

```python
import os
import json

from promptflow.client import load_flow


class FriendlinessEvaluator:
    def __init__(self, model_config):
        current_dir = os.path.dirname(__file__)
        prompty_path = os.path.join(current_dir, "friendliness.prompty")
        self._flow = load_flow(source=prompty_path, model={"configuration": model_config})

    def __call__(self, *, response: str, **kwargs):
        llm_response = self._flow(response=response)
        try:
            # The Prompty requests JSON output; fall back to the raw string if parsing fails.
            response = json.loads(llm_response)
        except json.JSONDecodeError:
            response = llm_response
        return response
```
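The usage below passes a `model_config` that isn't defined in this file. A minimal sketch of one way to construct it, assuming an Azure OpenAI deployment (the endpoint, key, deployment, and API version are placeholders):

```python
from azure.ai.evaluation import AzureOpenAIModelConfiguration

# Placeholder values; substitute your own Azure OpenAI resource details.
model_config = AzureOpenAIModelConfiguration(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    azure_deployment="<your-deployment>",
    api_version="2024-06-01",  # assumed API version
)
```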
Now, you can create your own Prompty-based evaluator and run it on a row of data:

```python
from friendliness.friend import FriendlinessEvaluator

friendliness_eval = FriendlinessEvaluator(model_config)

friendliness_score = friendliness_eval(response="I will not apologize for my behavior!")
```

### Prompt-based evaluator output: Friendliness evaluator

```python
{
    'score': 1,
    'reason': 'The response is hostile and unapologetic, lacking warmth or approachability.'
}
```
## Related content

- Learn [how to run batch evaluation on a dataset](../../how-to/develop/evaluate-sdk.md#local-evaluation-on-datasets) and [how to run batch evaluation on a target](../../how-to/develop/evaluate-sdk.md#local-evaluation-on-a-target).
