Commit e4f19cb

committed
2 parents 05d180b + ae208b3 commit e4f19cb

86 files changed

+5128
-2557
lines changed

articles/ai-foundry/agents/how-to/tools/bing-code-samples.md

Lines changed: 25 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -129,8 +129,32 @@ if run.status == "failed":
129129
messages = project_client.agents.messages.list(thread_id=thread.id)
130130
for message in messages:
131131
    print(f"Role: {message.role}, Content: {message.content}")
132+
```
133+
134+
## Optionally output the run steps used by the agent
135+
```python
136+
run_steps = project_client.agents.run_steps.list(thread_id=thread.id, run_id=run.id)
137+
for step in run_steps:
138+
    print(f"Step {step['id']} status: {step['status']}")
139+
140+
    # Check if there are tool calls in the step details
141+
    step_details = step.get("step_details", {})
142+
    tool_calls = step_details.get("tool_calls", [])
143+
144+
    if tool_calls:
145+
        print(" Tool calls:")
146+
        for call in tool_calls:
147+
            print(f" Tool Call ID: {call.get('id')}")
148+
            print(f" Type: {call.get('type')}")
149+
150+
            function_details = call.get("function", {})
151+
            if function_details:
152+
                print(f" Function name: {function_details.get('name')}")
153+
    print() # add an extra newline between steps
154+
```
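The run-step loop above can also be condensed into a summary. The following is a minimal sketch, assuming each step behaves like a dictionary with an optional `step_details.tool_calls` list, as the snippet above does. The helper name `count_tool_call_types`, the sample data, and the `bing_grounding` type string are hypothetical and used only for illustration.

```python
from collections import Counter


def count_tool_call_types(run_steps):
    """Count tool calls by their 'type' field across all run steps."""
    counts = Counter()
    for step in run_steps:
        # Steps without tool calls contribute nothing to the counts.
        for call in step.get("step_details", {}).get("tool_calls", []):
            counts[call.get("type", "unknown")] += 1
    return counts


# Hypothetical sample data standing in for the steps returned by the service.
sample_steps = [
    {
        "id": "step_1",
        "status": "completed",
        "step_details": {"tool_calls": [{"id": "call_1", "type": "bing_grounding"}]},
    },
    {"id": "step_2", "status": "completed", "step_details": {}},
]

print(dict(count_tool_call_types(sample_steps)))
```

A summary like this can help confirm at a glance whether the agent invoked the grounding tool during a run.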
132155

133-
# Delete the agent when done
156+
## Delete the agent when done
157+
```python
134158
project_client.agents.delete_agent(agent.id)
135159
print("Deleted agent")
136160
```

articles/ai-foundry/concepts/fine-tuning-overview.md

Lines changed: 57 additions & 127 deletions
Large diffs are not rendered by default.

articles/ai-foundry/concepts/management-center.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,8 +2,8 @@
22
title: Management center overview
33
titleSuffix: Azure AI Foundry
44
description: "The management center in Azure AI Foundry portal provides a centralized hub for governance and management activities."
5-
author: Blackmist
6-
ms.author: larryfr
5+
author: sdgilley
6+
ms.author: sgilley
77
ms.service: azure-ai-foundry
88
ms.custom:
99
- ignite-2024

articles/ai-foundry/concepts/model-benchmarks.md

Lines changed: 42 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -20,14 +20,14 @@ author: lgayhardt
2020

2121
Model leaderboards (preview) in Azure AI Foundry portal allow you to streamline the model selection process in the Azure AI Foundry [model catalog](../how-to/model-catalog-overview.md). The model leaderboards, backed by industry-standard benchmarks, can help you find the best model for your custom AI solution. From the model leaderboards section of the model catalog, you can [browse leaderboards](https://aka.ms/model-leaderboards) to compare available models as follows:
2222

23-
- [Quality, cost, and performance leaderboards](../how-to/benchmark-model-in-catalog.md#access-model-leaderboards) to quickly identify the model leaders along a single metric (quality, cost, or throughput);
23+
- [Quality, safety, cost, and performance leaderboards](../how-to/benchmark-model-in-catalog.md#access-model-leaderboards) to quickly identify the model leaders along a single metric (quality, safety, cost, or throughput);
2424
- [Trade-off charts](../how-to/benchmark-model-in-catalog.md#compare-models-in-the-trade-off-charts) to see how models perform on one metric versus another, such as quality versus cost;
2525
- [Leaderboards by scenario](../how-to/benchmark-model-in-catalog.md#view-leaderboards-by-scenario) to find the best leaderboards that suit your scenario.
2626

2727
Whenever you find a model to your liking, you can select it and zoom into the **Detailed benchmarking results** of the model within the model catalog. If satisfied with the model, you can deploy it, try it in the playground, or evaluate it on your data. The leaderboards support benchmarking across text language models (large language models (LLMs) and small language models (SLMs)) and embedding models.
2828

2929

30-
Model benchmarks assess LLMs and SLMs across the following categories: quality, performance, and cost. In addition, we assess the quality of embedding models using standard benchmarks. The leaderboards are updated regularly as better and more unsaturated benchmarks are onboarded, and as new models are added to the model catalog.
30+
Model benchmarks assess LLMs and SLMs across the following categories: quality, safety, cost, and throughput. In addition, we assess the quality of embedding models using standard benchmarks. The leaderboards are updated regularly as better and more unsaturated benchmarks are onboarded, and as new models are added to the model catalog.
3131

3232

3333
## Quality benchmarks of language models
@@ -40,7 +40,7 @@ Azure AI assesses the quality of LLMs and SLMs using accuracy scores from standa
4040

4141
Quality index is provided on a scale of zero to one. Higher values of quality index are better. The datasets included in quality index are:
4242

43-
| Dataset Name | Leaderboard Category |
43+
| Dataset Name | Leaderboard Scenario |
4444
|--------------------|----------------------|
4545
| arena_hard | QA |
4646
| bigbench_hard | Reasoning |
@@ -62,6 +62,45 @@ See more details in accuracy scores:
6262
Accuracy scores are provided on a scale of zero to one. Higher values are better.
6363

6464

65+
## Safety benchmarks of language models
66+
67+
To guide the selection of safety benchmarks for evaluation, we apply a structured filtering and validation process designed to ensure both relevance and rigor. A benchmark qualifies for onboarding if it addresses high-priority risks. For safety leaderboards, we consider benchmarks that are reliable enough to provide signals on specific safety-related topics. We select [HarmBench](https://github.com/centerforaisafety/HarmBench) as a proxy for model safety, and organize scenario leaderboards as follows:
68+
69+
| Dataset Name | Leaderboard Scenario | Metric | Interpretation |
70+
|--------------------|----------------------|----------------------|----------------------|
71+
| HarmBench (standard) | Standard harmful behaviors | Attack Success Rate | Lower values mean better robustness against attacks designed to elicit standard harmful content |
72+
| HarmBench (contextual) | Contextually harmful behaviors | Attack Success Rate | Lower values mean better robustness against attacks designed to elicit contextually harmful content |
73+
| HarmBench (copyright violations) | Copyright violations | Attack Success Rate | Lower values mean better robustness against attacks designed to elicit copyright violations |
74+
| WMDP | Knowledge in sensitive domains | Accuracy | Higher values denote more knowledge in sensitive domains (cybersecurity, biosecurity, and chemical security) |
75+
| Toxigen | Ability to detect toxic content | F1 Score | Higher values mean better ability to detect toxic content |
76+
77+
### Model harmful behaviors
78+
The [HarmBench](https://github.com/centerforaisafety/HarmBench) benchmark measures model harmful behaviors and includes prompts to elicit harmful behavior from a model. As it relates to safety, the benchmark covers 7 semantic categories of behavior:
79+
- Cybercrime & Unauthorized Intrusion
80+
- Chemical & Biological Weapons/Drugs
81+
- Copyright Violations
82+
- Misinformation & Disinformation
83+
- Harassment & Bullying
84+
- Illegal Activities
85+
- General Harm
86+
87+
These 7 categories can be summarized into 3 functional categories:
88+
- standard harmful behaviors
89+
- contextually harmful behaviors
90+
- copyright violations
91+
92+
Each functional category is featured in a separate scenario leaderboard. We use direct prompts from HarmBench (no attacks) and HarmBench evaluators to calculate Attack Success Rate (ASR). Lower ASR values mean safer models. We don't explore any attack strategy for evaluation, and model benchmarking is performed with the Azure AI Content Safety filter turned off.
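
As a rough illustration of the ASR metric, the sketch below computes the fraction of prompts for which an evaluator judged the model's response to be harmful. The function name, the boolean judgment format, and the sample values are assumptions made for this example. They don't represent the HarmBench evaluator interface or actual benchmark results.

```python
def attack_success_rate(judgments):
    """Return the share of prompts whose response was judged harmful.

    `judgments` is a list of booleans, one per prompt:
    True means the evaluator flagged the response as harmful.
    """
    if not judgments:
        raise ValueError("At least one judgment is required.")
    return sum(1 for flagged in judgments if flagged) / len(judgments)


# Hypothetical evaluator outputs for five direct prompts.
judgments = [False, True, False, False, True]
print(f"ASR: {attack_success_rate(judgments):.2f}")  # prints "ASR: 0.40"
```

In this made-up sample, two of five prompts produced harmful output, so the ASR is 0.40. A model that refused all five prompts would score 0.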
93+
94+
95+
### Model ability to detect toxic content
96+
[Toxigen](https://github.com/microsoft/TOXIGEN) is a large-scale machine-generated dataset for adversarial and implicit hate speech detection. It contains implicitly toxic and benign sentences mentioning 13 minority groups. We use the annotated samples from Toxigen for evaluation and calculate F1 scores to measure classification performance. Scoring higher on this dataset means that a model is better at detecting toxic content. Model benchmarking is performed with Azure AI Content Safety Filter turned off.
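
For reference, the F1 score combines precision and recall on the toxic class. The following minimal sketch computes it from binary labels. The label encoding (1 for toxic, 0 for benign) and the sample data are assumptions for illustration, not Toxigen annotations or the evaluation pipeline used for the leaderboard.

```python
def f1_score(y_true, y_pred):
    """Compute the F1 score for the positive (toxic = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, detected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, missed
    if tp == 0:
        return 0.0
    # Equivalent to 2 * precision * recall / (precision + recall).
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical labels: three toxic and three benign sentences.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(f"F1: {f1_score(y_true, y_pred):.3f}")  # prints "F1: 0.667"
```

Here the classifier finds two of three toxic sentences and raises one false alarm, so precision and recall are both 2/3 and the F1 score is also 2/3.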
97+
98+
### Model knowledge in sensitive domains
99+
The [Weapons of Mass Destruction Proxy](https://github.com/centerforaisafety/wmdp) (WMDP) benchmark measures model knowledge in sensitive domains including biosecurity, cybersecurity, and chemical security. The leaderboard uses average accuracy scores across cybersecurity, biosecurity, and chemical security. A higher WMDP accuracy score denotes more knowledge of dangerous capabilities (worse behavior from a safety standpoint). Model benchmarking is performed with the default Azure AI Content Safety filters on. These safety filters detect and block content harm in violence, self-harm, sexual, hate and unfairness, but don't target categories in cybersecurity, biosecurity, and chemical security.
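
The averaging step can be sketched as follows. This example assumes an unweighted mean over the three domains. The source states only that the leaderboard uses average accuracy, so the weighting is an assumption, and the domain keys and accuracy values are hypothetical.

```python
from statistics import mean

REQUIRED_DOMAINS = ("cybersecurity", "biosecurity", "chemical_security")


def wmdp_average_accuracy(domain_accuracy):
    """Average per-domain accuracy (unweighted mean; an assumption)."""
    missing = set(REQUIRED_DOMAINS) - set(domain_accuracy)
    if missing:
        raise KeyError(f"Missing domains: {sorted(missing)}")
    return mean(domain_accuracy[d] for d in REQUIRED_DOMAINS)


# Hypothetical per-domain accuracy values, not real benchmark results.
scores = {"cybersecurity": 0.5, "biosecurity": 0.25, "chemical_security": 0.75}
print(wmdp_average_accuracy(scores))  # prints 0.5
```

Unlike most accuracy metrics, a higher value here is less desirable from a safety standpoint.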
100+
101+
### Limitations of safety benchmarks
102+
We understand and acknowledge that safety is a complex topic and has several dimensions. No single current open-source benchmark can test or represent the full safety of a system in different scenarios. Additionally, most of these benchmarks suffer from saturation or from misalignment between benchmark design and the risk definition, and they can lack clear documentation on how the target risks are conceptualized and operationalized, which makes it difficult to assess whether a benchmark accurately captures the nuances of the risks. This limitation can lead to either overestimating or underestimating model performance in real-world safety scenarios.
103+
65104
## Performance benchmarks of language models
66105

67106
Performance metrics are calculated as an aggregate over 14 days, based on 24 trials (two requests per trial) sent daily with a one-hour interval between every trial. The following default parameters are used for each request to the model endpoint:

articles/ai-foundry/concepts/models-featured.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -76,7 +76,7 @@ The following table lists the Cohere models that you can inference via the Found
7676
| [Cohere-command-r-08-2024](https://ai.azure.com/explore/models/Cohere-command-r-08-2024/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
7777
| [Cohere-command-r-plus](https://ai.azure.com/explore/models/Cohere-command-r-plus/version/1/registry/azureml-cohere) <br> (deprecated) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
7878
| [Cohere-command-r](https://ai.azure.com/explore/models/Cohere-command-r/version/1/registry/azureml-cohere) <br> (deprecated)| [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
79-
| [Cohere-embed-4](https://aka.ms/aistudio/landing/cohere-embed-4) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** image, text <br /> - **Output:** image, text (128,000 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** image, text |
79+
| [Cohere-embed-v-4](https://aka.ms/aistudio/landing/cohere-embed-4) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** image, text <br /> - **Output:** image, text (128,000 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** image, text |
8080
| [Cohere-embed-v3-english](https://ai.azure.com/explore/models/Cohere-embed-v3-english/version/1/registry/azureml-cohere) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) |
8181
| [Cohere-embed-v3-multilingual](https://ai.azure.com/explore/models/Cohere-embed-v3-multilingual/version/1/registry/azureml-cohere) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) |
8282

articles/ai-foundry/concepts/resource-types.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2,11 +2,11 @@
22
title: Choose an Azure resource type for AI foundry
33
titleSuffix: Azure AI Foundry
44
description: Learn about the supported Azure resource types in Azure AI Foundry portal.
5-
author: deeikele
6-
ms.author: deeikele
5+
reviewer: deeikele
6+
ms.reviewer: deeikele
77
manager: scottpolly
8-
reviewer: larryfr
9-
ms.reviewer: larryfr
8+
author: sgilley
9+
ms.author: sgilley
1010
ms.date: 05/18/2025
1111
ms.service: azure-ai-foundry
1212
ms.topic: concept-article

articles/ai-foundry/foundry-local/how-to/how-to-integrate-with-inference-sdks.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,12 @@ Foundry Local integrates with various inferencing SDKs - such as OpenAI, Azure O
2828
::: zone pivot="programming-language-javascript"
2929
[!INCLUDE [JavaScript](../includes/integrate-examples/javascript.md)]
3030
::: zone-end
31+
::: zone pivot="programming-language-csharp"
32+
[!INCLUDE [C#](../includes/integrate-examples/csharp.md)]
33+
::: zone-end
34+
::: zone pivot="programming-language-rust"
35+
[!INCLUDE [Rust](../includes/integrate-examples/rust.md)]
36+
::: zone-end
3137

3238
## Next steps
3339

articles/ai-foundry/foundry-local/how-to/how-to-use-langchain-with-foundry-local.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ ms.reviewer: eneros
1111
ms.author: eneros
1212
author: eneros
1313
ms.custom: build-2025
14-
zone_pivot_groups: foundry-local-sdk
14+
zone_pivot_groups: foundry-local-langchain
1515
#customer intent: As a developer, I want to get started with Foundry Local so that I can run AI models locally.
1616
---
1717

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,70 @@
1+
---
2+
ms.service: azure-ai-foundry
3+
ms.topic: include
4+
ms.date: 05/02/2025
5+
ms.author: samkemp
6+
author: samuel100
7+
---
8+
9+
## Create project
10+
11+
Create a new C# project and navigate into it:
12+
13+
```bash
14+
dotnet new console -n hello-foundry-local
15+
cd hello-foundry-local
16+
```
17+
18+
### Install NuGet Packages
19+
20+
Install the following NuGet packages into your project folder:
21+
22+
```bash
23+
dotnet add package Microsoft.AI.Foundry.Local --version 0.1.0
24+
dotnet add package OpenAI --version 2.2.0-beta.4
25+
```
26+
27+
## Use OpenAI SDK with Foundry Local
28+
29+
The following example demonstrates how to use the OpenAI SDK with Foundry Local. The code initializes the Foundry Local service, loads a model, and generates a response using the OpenAI SDK.
30+
31+
Copy and paste the following code into a C# file named `Program.cs`:
32+
33+
```csharp
34+
using Microsoft.AI.Foundry.Local;
35+
using OpenAI;
36+
using OpenAI.Chat;
37+
using System.ClientModel;
38+
using System.Diagnostics.Metrics;
39+
40+
var alias = "phi-3.5-mini";
41+
42+
var manager = await FoundryLocalManager.StartModelAsync(aliasOrModelId: alias);
43+
44+
var model = await manager.GetModelInfoAsync(aliasOrModelId: alias);
45+
ApiKeyCredential key = new ApiKeyCredential(manager.ApiKey);
46+
OpenAIClient client = new OpenAIClient(key, new OpenAIClientOptions
47+
{
48+
    Endpoint = manager.Endpoint
49+
});
50+
51+
var chatClient = client.GetChatClient(model?.ModelId);
52+
53+
var completionUpdates = chatClient.CompleteChatStreaming("Why is the sky blue?");
54+
55+
Console.Write("[ASSISTANT]: ");
56+
foreach (var completionUpdate in completionUpdates)
57+
{
58+
    if (completionUpdate.ContentUpdate.Count > 0)
59+
    {
60+
        Console.Write(completionUpdate.ContentUpdate[0].Text);
61+
    }
62+
}
63+
```
64+
65+
Run the code using the following command:
66+
67+
```bash
68+
dotnet run
69+
```
70+
Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
---
2+
ms.service: azure-ai-foundry
3+
ms.topic: include
4+
ms.date: 05/02/2025
5+
ms.author: samkemp
6+
author: samuel100
7+
---
8+
9+
## Create project
10+
11+
Create a new Rust project and navigate into it:
12+
13+
```bash
14+
cargo new hello-foundry-local
15+
cd hello-foundry-local
16+
```
17+
18+
### Install crates
19+
20+
Install the following Rust crates using Cargo:
21+
22+
```bash
23+
cargo add foundry-local anyhow env_logger serde_json
24+
cargo add reqwest --features json
25+
cargo add tokio --features full
26+
```
27+
28+
## Update the `main.rs` file
29+
30+
The following example demonstrates how to run inference by sending a request to the Foundry Local service. The code initializes the Foundry Local service, loads a model, and generates a response using the `reqwest` library.
31+
32+
Copy and paste the following code into the Rust file named `main.rs`:
33+
34+
```rust
35+
use foundry_local::FoundryLocalManager;
36+
use anyhow::Result;
37+
38+
#[tokio::main]
39+
async fn main() -> Result<()> {
40+
    // Create a FoundryLocalManager instance with default options
41+
    let mut manager = FoundryLocalManager::builder()
42+
        .alias_or_model_id("qwen2.5-0.5b") // Specify the model to use
43+
        .bootstrap(true) // Start the service if not running
44+
        .build()
45+
        .await?;
46+
47+
    // Use the OpenAI compatible API to interact with the model
48+
    let client = reqwest::Client::new();
49+
    let endpoint = manager.endpoint()?;
50+
    let response = client.post(format!("{}/chat/completions", endpoint))
51+
        .header("Content-Type", "application/json")
52+
        .header("Authorization", format!("Bearer {}", manager.api_key()))
53+
        .json(&serde_json::json!({
54+
            "model": manager.get_model_info("qwen2.5-0.5b", true).await?.id,
55+
            "messages": [{"role": "user", "content": "What is the golden ratio?"}],
56+
        }))
57+
        .send()
58+
        .await?;
59+
60+
    let result = response.json::<serde_json::Value>().await?;
61+
    println!("{}", result["choices"][0]["message"]["content"]);
62+
63+
    Ok(())
64+
}
65+
```
66+
67+
Run the code using the following command:
68+
69+
```bash
70+
cargo run
71+
```
72+
