Commit cfe2517

Merge pull request #5266 from MicrosoftDocs/main
5/29/2025 11:00 AM IST Publish
2 parents 64fe7dc + 5da4c2a commit cfe2517

6 files changed: +57 -51 lines changed

articles/ai-services/language-service/overview.md

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,7 @@ author: laujan
 manager: nitinme
 ms.service: azure-ai-language
 ms.topic: overview
-ms.date: 03/05/2025
+ms.date: 05/28/2025
 ms.author: lajanuar
 ---

@@ -24,6 +24,7 @@ The Language service also provides several new features as well, which can eithe
 * Customizable, which means you train an AI model using our tools to fit your data specifically.

 Language features are also utilized in [agent templates](https://github.com/azure-ai-foundry/foundry-samples/tree/main/samples/agent-catalog):
+
 * [Intent routing agent](https://github.com/azure-ai-foundry/foundry-samples/tree/main/samples/agent-catalog/msft-agent-samples/foundry-agent-service-sdk/intent-routing-agent) detects user intent and provides exact answering. Perfect for deterministically intent routing and exact question answering with human controls.
 * [Exact question answering agent](https://github.com/azure-ai-foundry/foundry-samples/tree/main/samples/agent-catalog/msft-agent-samples/foundry-agent-service-sdk/exact-qna-agent) answers high-value predefined questions deterministically to ensure consistent and accurate responses.

articles/ai-services/openai/how-to/batch.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ manager: nitinme
 ms.service: azure-ai-openai
 ms.custom: references_regions
 ms.topic: how-to
-ms.date: 04/14/2025
+ms.date: 05/28/2025
 author: mrbullwinkle
 ms.author: mbullwin
 recommendations: false

articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 9 additions & 9 deletions
@@ -3,7 +3,7 @@ title: Understanding costs associated with provisioned throughput units (PTU)
 description: Learn about provisioned throughput costs and billing in Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 05/20/2025
+ms.date: 05/28/2025
 manager: nitinme
 author: aahill
 ms.author: aahi
@@ -77,14 +77,14 @@ The amount of throughput (measured in tokens per minute or TPM) a deployment get

 For example, for `gpt-4.1:2025-04-14`, 1 output token counts as 4 input tokens towards your utilization limit which matches the [pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). Older models use a different ratio and for a deeper understanding on how different ratios of input and output tokens impact the throughput your workload needs, see the [Azure OpenAI capacity calculator](https://ai.azure.com/resource/calculator).

-|Topic| **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o3-mini** | **o1** | **gpt-4o** | **gpt-4o-mini** |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- |
-|Global & data zone provisioned minimum deployment|15|15| 15 | 15 |15|15|15|15|
-|Global & data zone provisioned scale increment|5|5| 5 | 5 |5|5|5|5|
-|Regional provisioned minimum deployment|50|25| 25 |50 | 25|25|50|25|
-|Regional provisioned scale increment|50|25| 25 | 50 | 25|50|50|25|
-|Input TPM per PTU|3,000|14,900| 59,400 | 600 | 2,500|230|2,500|37,000|
-|Latency Target Value|44 Tokens Per Second|50 Tokens Per Second| 50 Tokens Per Second | 40 Tokens Per Second | 66 Tokens Per Second |25 Tokens Per Second|25 Tokens Per Second|33 Tokens Per Second|
+|Topic| **o4-mini** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o3-mini** | **o1** | **gpt-4o** | **gpt-4o-mini** |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+|Global & data zone provisioned minimum deployment| 15 | 15|15| 15 | 15 |15|15|15|15|
+|Global & data zone provisioned scale increment| 5 | 5|5| 5 | 5 |5|5|5|5|
+|Regional provisioned minimum deployment|25| 50|25| 25 |50 | 25|25|50|25|
+|Regional provisioned scale increment|25| 50|25| 25 | 50 | 25|50|50|25|
+|Input TPM per PTU|5,400 | 3,000|14,900| 59,400 | 600 | 2,500|230|2,500|37,000|
+|Latency Target Value| 66 Tokens Per Second | 40 Tokens Per Second|50 Tokens Per Second| 60 Tokens Per Second | 40 Tokens Per Second | 66 Tokens Per Second |25 Tokens Per Second|25 Tokens Per Second|33 Tokens Per Second|


 For a full list, see the [Azure OpenAI in Azure AI Foundry Models in Azure AI Foundry portal calculator](https://ai.azure.com/resource/calculator).
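
The 4:1 output-to-input weighting and the "Input TPM per PTU" row in this hunk can be combined into a rough sizing estimate. The following is a minimal sketch, not the logic of the official capacity calculator: the function names are hypothetical, the constants are copied from the `gpt-4.1` column of the updated table, and rounding up to the minimum deployment plus whole scale increments is an assumption about how deployment sizes are chosen.

```python
# Values copied from the gpt-4.1 column of the updated table in this hunk.
INPUT_TPM_PER_PTU = 3_000
OUTPUT_TO_INPUT_RATIO = 4   # 1 output token counts as 4 input tokens
GLOBAL_MIN_PTU = 15         # global & data zone minimum deployment
GLOBAL_INCREMENT = 5        # global & data zone scale increment


def weighted_tpm(input_tpm, output_tpm, ratio=OUTPUT_TO_INPUT_RATIO):
    """Express a workload as input-equivalent tokens per minute."""
    return input_tpm + ratio * output_tpm


def estimate_ptus(input_tpm, output_tpm,
                  tpm_per_ptu=INPUT_TPM_PER_PTU,
                  minimum=GLOBAL_MIN_PTU,
                  increment=GLOBAL_INCREMENT):
    """Hypothetical estimate: PTUs to cover the weighted load, rounded up
    to the minimum deployment plus a whole number of scale increments."""
    load = weighted_tpm(input_tpm, output_tpm)
    raw = (load + tpm_per_ptu - 1) // tpm_per_ptu   # integer ceiling
    if raw <= minimum:
        return minimum
    steps = (raw - minimum + increment - 1) // increment
    return minimum + steps * increment


if __name__ == "__main__":
    # 60,000 input TPM plus 10,000 output TPM is 100,000 weighted TPM.
    print(estimate_ptus(60_000, 10_000))
```

Swapping in the regional values from the same table (minimum 50, increment 50 for `gpt-4.1`) shows how the coarser regional increments change the estimate. For real capacity planning, the linked calculator remains the authoritative source.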

articles/ai-services/openai/includes/model-matrix/global-batch-datazone.md

Lines changed: 14 additions & 13 deletions
@@ -10,16 +10,17 @@ ms.date: 02/14/2025
 ---


-| **Region** | **o3-mini**, **2025-01-31** | **gpt-4o**, **2024-08-06** | **gpt-4o-mini**, **2024-07-18** |
-|:-------------------|:---------------------------:|:--------------------------:|:-------------------------------:|
-| eastus ||||
-| eastus2 ||||
-| francecentral | - |||
-| germanywestcentral | - |||
-| northcentralus ||||
-| polandcentral | - |||
-| southcentralus ||||
-| swedencentral | - |||
-| westeurope | - |||
-| westus ||||
-| westus3 ||||
+| **Region** | **o4-mini**, **2025-04-16** | **gpt-4.1**, **2025-04-14** | **gpt-4.1-nano**, **2025-04-14** | **gpt-4.1-mini**, **2025-04-14** | **o3-mini**, **2025-01-31** | **gpt-4o**, **2024-08-06** | **gpt-4o-mini**, **2024-07-18** |
+|:-------------------|:---------------------------:|:---------------------------:|:--------------------------------:|:--------------------------------:|:---------------------------:|:--------------------------:|:-------------------------------:|
+| eastus ||||||||
+| eastus2 ||||||||
+| francecentral ||||| - |||
+| germanywestcentral ||||| - |||
+| northcentralus ||||||||
+| polandcentral ||||| - |||
+| southcentralus ||||||||
+| spaincentral ||||| - | - | - |
+| swedencentral ||||| - |||
+| westeurope ||||| - |||
+| westus ||||||||
+| westus3 ||||||||
