articles/ai-foundry/agents/concepts/model-region-support.md (+2 −2)
@@ -19,7 +19,7 @@ Agents are powered by a diverse set of Azure OpenAI models with different capabi
 - **Standard** is offered with a global deployment option, routing traffic globally to provide higher throughput.
 - **Provisioned** is also offered with a global deployment option, allowing customers to purchase and deploy provisioned throughput units across Azure global infrastructure.

-All deployments can perform the exact same inference operations, however the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types see [deployment types guide](../../openai/how-to/deployment-types.md).
+All deployments can perform the exact same inference operations, however the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types see [deployment types guide](../../foundry-models/concepts/deployment-types.md).

 ## Available models

@@ -130,4 +130,4 @@ Azure AI Foundry Agent Service supports the following Azure OpenAI models in the
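The paragraph changed above notes that every deployment type exposes the same inference surface. A minimal sketch with the `openai` Python package, assuming placeholder endpoint, key, API version, and deployment name (none of these values come from the articles in this PR):

```python
from openai import AzureOpenAI

# Placeholder resource values -- substitute your own endpoint, key, and deployment name.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-10-21",
)

# The identical call works against Standard, Global Standard, or Provisioned deployments;
# billing, scale, and latency characteristics differ, but the request shape does not.
response = client.chat.completions.create(
    model="<your-deployment-name>",  # the deployment name, not the underlying model name
    messages=[{"role": "user", "content": "In one sentence, what is a global deployment?"}],
)
print(response.choices[0].message.content)
```

Switching a workload between deployment types is therefore a deployment-name change rather than a code change.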
articles/ai-foundry/openai/concepts/provisioned-migration.md (+2 −2)
@@ -34,7 +34,7 @@ This article is intended for existing users of the provisioned throughput offeri
 |Self-service quota requests | Request quota increases without engaging the sales team – many can be autoapproved. |
 |Default provisioned-managed quota in many regions | Get started quickly without having to first request quota. |
 |Transparent information on real-time capacity availability + New deployment flow | Reduced negotiation around availability accelerates time-to-market. |
-| Data zone provisioned deployments | Allows you to leverage Azure's global infrastructure to dynamically route traffic to the data center within the Microsoft defined data zone with the best availability for each request. For more information, see the [deployment types](../how-to/deployment-types.md#data-zone-provisioned) article. |
+| Data zone provisioned deployments | Allows you to leverage Azure's global infrastructure to dynamically route traffic to the data center within the Microsoft defined data zone with the best availability for each request. For more information, see the [deployment types](../../foundry-models/concepts/deployment-types.md#data-zone-provisioned) article. |

 ### New hourly/reservation commercial model

@@ -45,7 +45,7 @@ This article is intended for existing users of the provisioned throughput offeri
 | Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota. |
 | Flexible choice of payment model for existing provisioned customers | Customers with commitments can stay on the commitment model until the end of life of the currently supported models, and can choose to migrate existing commitments to hourly/reservations via managed process. We recommend migrating to hourly/ reservations to take advantage of term discounts and to work with the latest models. |
 | Supports latest model generations | The latest models are available only on hourly/ reservations in provisioned offering. |
-| Differentiated pricing | Greater flexibility and control of pricing and performance. In December 2024, we introduced differentiated hourly pricing across [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned), and [regional provisioned](../how-to/deployment-types.md#regional-provisioned) deployment types with the option to purchase [Azure Reservations](#new-azure-reservations-for-global-and-data-zone-provisioned-deployments) to support additional discounts. For more information on the hourly price for each provisioned deployment type, see the [Pricing details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) page. |
+| Differentiated pricing | Greater flexibility and control of pricing and performance. In December 2024, we introduced differentiated hourly pricing across [global provisioned](../../foundry-models/concepts/deployment-types.md#global-provisioned), [data zone provisioned](../../foundry-models/concepts/deployment-types.md#data-zone-provisioned), and [regional provisioned](../../foundry-models/concepts/deployment-types.md#regional-provisioned) deployment types with the option to purchase [Azure Reservations](#new-azure-reservations-for-global-and-data-zone-provisioned-deployments) to support additional discounts. For more information on the hourly price for each provisioned deployment type, see the [Pricing details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) page. |
 > * You can take advantage of more cost savings when you buy [Microsoft Azure AI Foundry Provisioned Throughput reservations](/azure/cost-management-billing/reservations/azure-openai#buy-a-microsoft-azure-openai-service-reservation).
-> * Provisioned throughput is available as the following deployment types: [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned) and [regional provisioned](../how-to/deployment-types.md#regional-provisioned).
+> * Provisioned throughput is available as the following deployment types: [global provisioned](../../foundry-models/concepts/deployment-types.md#global-provisioned), [data zone provisioned](../../foundry-models/concepts/deployment-types.md#data-zone-provisioned) and [regional provisioned](../../foundry-models/concepts/deployment-types.md#regional-provisioned).
articles/ai-foundry/openai/how-to/batch.md (+2 −2)
@@ -92,7 +92,7 @@ The following aren't currently supported:
 ### Batch deployment

 > [!NOTE]
-> In the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) the batch deployment types will appear as `Global-Batch` and `Data Zone Batch`. To learn more about Azure OpenAI deployment types, see our [deployment types guide](../how-to/deployment-types.md).
+> In the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) the batch deployment types will appear as `Global-Batch` and `Data Zone Batch`. To learn more about Azure OpenAI deployment types, see our [deployment types guide](../../foundry-models/concepts/deployment-types.md).

 :::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure AI Foundry portal with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::

@@ -246,5 +246,5 @@ When a job failure occurs, you'll find details about the failure in the `errors`

 ## See also

-* Learn more about Azure OpenAI [deployment types](./deployment-types.md)
+* Learn more about Azure OpenAI [deployment types](../../foundry-models/concepts/deployment-types.md)
 * Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md)
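For readers landing on the note above, a sketch of how a `Global-Batch` (or `Data Zone Batch`) deployment is typically exercised with the `openai` Python package; the file name, endpoint, key, and API version are assumptions for illustration, not values from this change:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-10-21",
)

# Upload a JSONL file of requests; each line names the batch deployment in its "body.model" field.
batch_file = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

# Submit the file as a batch job; results are produced asynchronously within the completion window.
batch_job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",
)
print(batch_job.id, batch_job.status)
```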
articles/ai-foundry/openai/how-to/deployment-types.md (+2 −2)
@@ -12,7 +12,7 @@ ms.custom:
   - build-2025
 ---

-# Deployment types for Azure AI Foundry Models
+# Understanding Deployment types for Azure AI Foundry Models

 Azure AI Foundry makes models available by using the model deployment concept in Azure AI Foundry Services (formerly known as Azure AI Services). Model deployments are also Azure resources and, when created, give access to a given model under certain configurations. Such a configuration includes the infrastructure required to process the requests.

@@ -33,7 +33,7 @@ For standard deployments, there are three deployment-type options to choose from

 ### Global deployments

-Global deployments use the global infrastructure of Azure to dynamically route customer traffic to the datacenter with the best availability for the customer's inference requests. This means that global offers the highest initial throughput limits and best model availability, but still provides our uptime SLA and low latency. For high-volume workloads above the specified usage tiers on Standard and Global Standard, you might experience increased latency variation. For customers that require the lower latency variance at large workload usage, we recommend using our provisioned deployment types.
+Global deployments use the global infrastructure of Azure to dynamically route customer traffic to the datacenter with the best availability for the customer's inference requests. This means that global offers the highest initial throughput limits and best model availability, but still provides our uptime [SLA](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services) and low latency. For high-volume workloads above the specified usage tiers on Standard and Global Standard, you might experience increased latency variation. For customers that require the lower latency variance at large workload usage, we recommend using our provisioned deployment types.

 Our global deployments are the first location for all new models and features. Depending on call volume, customers with large volume and low latency variance requirements should consider our provisioned deployment types.
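To make the global-versus-regional distinction in the hunk above concrete, here is a sketch of creating a deployment through the control plane with the `azure-mgmt-cognitiveservices` Python SDK; the SKU names, model version, capacity, and resource names are assumptions chosen to illustrate the shape of the call, not values taken from this PR:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import Deployment, DeploymentModel, DeploymentProperties, Sku

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
)

# The SKU name selects the deployment type -- e.g. "GlobalStandard" for globally routed traffic,
# "Standard" for a regional deployment (SKU strings assumed here; check the current SDK reference).
deployment = Deployment(
    sku=Sku(name="GlobalStandard", capacity=50),
    properties=DeploymentProperties(
        model=DeploymentModel(format="OpenAI", name="gpt-4o", version="2024-08-06"),
    ),
)

poller = client.deployments.begin_create_or_update(
    resource_group_name="<resource-group>",
    account_name="<foundry-resource-name>",
    deployment_name="gpt-4o-global",
    deployment=deployment,
)
print(poller.result().properties.provisioning_state)
```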
articles/ai-foundry/openai/how-to/fine-tune-test.md (+2 −2)
@@ -20,7 +20,7 @@ After you've fine-tuned a model, you may want to test its quality via the Chat C
 A Developer Tier deployment allows you to deploy your new model without the hourly hosting fee incurred by Standard or Global deployments. The only charges incurred are per-token. Consult the [pricing page](https://aka.ms/aoaipricing) for the most up-to-date pricing.

 > [!IMPORTANT]
-> Developer Tier offers no availability SLA and no [data residency](https://aka.ms/data-residency) guarantees. If you require an SLA or data residency, choose an alternative [deployment type](./deployment-types.md) for testing your model.
+> Developer Tier offers no availability SLA and no [data residency](https://aka.ms/data-residency) guarantees. If you require an SLA or data residency, choose an alternative [deployment type](../../foundry-models/concepts/deployment-types.md) for testing your model.
 >
 > Developer Tier deployments have a fixed lifetime of **24 hours**. Learn more [below](#clean-up-your-deployment) about the deployment lifecycle.
articles/ai-foundry/openai/how-to/fine-tuning-deploy.md (+6 −6)
@@ -18,7 +18,7 @@ Once your model is fine-tuned, you can deploy the model and can use it in your o

 When you deploy the model, you make the model available for inferencing, and that incurs an hourly hosting charge. Fine-tuned models, however, can be stored in Azure AI Foundry at no cost until you're ready to use them.

-Azure OpenAI provides choices of deployment types for fine-tuned models on the hosting structure that fits different business and usage patterns: **Standard**, **Global Standard** (preview) and **Provisioned Throughput** (preview). Learn more about [deployment types for fine-tuned models](#deployment-types) and the [concepts of all deployment types](./deployment-types.md).
+Azure OpenAI provides choices of deployment types for fine-tuned models on the hosting structure that fits different business and usage patterns: **Standard**, **Global Standard** (preview) and **Provisioned Throughput** (preview). Learn more about [deployment types for fine-tuned models](#deployment-types) and the [concepts of all deployment types](../../foundry-models/concepts/deployment-types.md).

 ## Deploy your fine-tuned model

@@ -362,7 +362,7 @@ Azure OpenAI fine-tuning supports the following deployment types.

 ### Standard

-[Standard deployments](./deployment-types.md#standard) provide a pay-per-token billing model with data residency confined to the deployed region.
+[Standard deployments](../../foundry-models/concepts/deployment-types.md) provide a pay-per-token billing model with data residency confined to the deployed region.

 | Models | East US2 | North Central US | Sweden Central | Switzerland West |
@@ -377,7 +377,7 @@ Azure OpenAI fine-tuning supports the following deployment types.

 ### Global Standard

-[Global standard](./deployment-types.md#global-standard) fine-tuned deployments offer [cost savings](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/), but custom model weights may temporarily be stored outside the geography of your Azure OpenAI resource.
+[Global standard](../../foundry-models/concepts/deployment-types.md) fine-tuned deployments offer [cost savings](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/), but custom model weights may temporarily be stored outside the geography of your Azure OpenAI resource.

 Global standard deployments are available from all Azure OpenAI regions for the following models:

@@ -392,7 +392,7 @@ Global standard deployments are available from all Azure OpenAI regions for the

 ### Developer Tier

-[Developer](./deployment-types.md#developer-for-fine-tuned-models) fine-tuned deployments offer a similar experience as [Global Standard](#global-standard) without an hourly hosting fee, but do not offer an availability SLA. Developer deployments are designed for model candidate evaluation and not for production use.
+[Developer](../../foundry-models/concepts/deployment-types.md) fine-tuned deployments offer a similar experience as [Global Standard](#global-standard) without an hourly hosting fee, but do not offer an availability SLA. Developer deployments are designed for model candidate evaluation and not for production use.

 Developer deployments are available from all Azure OpenAI regions for the following models:

@@ -409,7 +409,7 @@ Developer deployments are available from all Azure OpenAI regions for the follow
 | GPT-4o | ✅ | ✅ |
 | GPT-4o-mini | ✅ | ✅ |

-[Provisioned throughput](./deployment-types.md#regional-provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.
+[Provisioned throughput](../../foundry-models/concepts/deployment-types.md) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.

 ## Clean up your deployment

@@ -433,4 +433,4 @@ You can also delete a deployment in Azure AI Foundry portal, or use [Azure CLI](
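As a companion to the fine-tuned deployment types and the clean-up hunk above, a sketch using the same `azure-mgmt-cognitiveservices` SDK; the fine-tuned model ID format, SKU string, model version, and resource names are placeholders and assumptions, not values from this PR:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import Deployment, DeploymentModel, DeploymentProperties, Sku

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
)

# Deploy a fine-tuned model: the model name is the ID returned by the fine-tuning job,
# and the SKU name picks the hosting structure (a regional "Standard" SKU shown here;
# a global or developer SKU would trade data residency for cost savings or fee-free evaluation).
client.deployments.begin_create_or_update(
    resource_group_name="<resource-group>",
    account_name="<azure-openai-resource>",
    deployment_name="my-fine-tuned-model",
    deployment=Deployment(
        sku=Sku(name="Standard", capacity=1),
        properties=DeploymentProperties(
            model=DeploymentModel(
                format="OpenAI",
                name="<fine-tuned-model-id>",  # e.g. gpt-4o-mini-2024-07-18.ft-<job-id>; format assumed
                version="1",
            ),
        ),
    ),
).result()

# Clean up when evaluation is done to stop the hourly hosting charge.
client.deployments.begin_delete(
    resource_group_name="<resource-group>",
    account_name="<azure-openai-resource>",
    deployment_name="my-fine-tuned-model",
).result()
```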