Commit 2f0f904

Merge pull request #7541 from MicrosoftDocs/main
Auto Publish – main to live - 2025-10-08 05:06 UTC
2 parents fa0a155 + 75c62df commit 2f0f904

33 files changed, +180 -115 lines changed

articles/ai-foundry/.openpublishing.redirection.ai-studio.json

Lines changed: 5 additions & 0 deletions
@@ -210,6 +210,11 @@
 "redirect_url": "/azure/ai-foundry/how-to/deploy-models-llama",
 "redirect_document_id": true
 },
+{
+"source_path_from_root": "/articles/ai-foundry/open-ai/how-to/deployment-types.md",
+"redirect_url": "/azure/ai-foundry/foundry-models/concepts/deployment-types",
+"redirect_document_id": true
+},
 {
 "source_path_from_root": "/articles/ai-foundry/how-to/deploy-models-llama.md",
 "redirect_url": "/azure/ai-foundry/concepts/models-featured#meta",
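
The redirect entry added in this hunk maps a retired source path to its new published URL. As a minimal sketch of how such an entry could be looked up, the Python below embeds the entry from the hunk and searches it by source path. The top-level `redirections` key and the `find_redirect` helper are assumptions made for illustration; they are not shown in this diff, and the real publishing pipeline may match paths differently.

```python
import json
from typing import Optional

# The entry is copied from the hunk above. The enclosing "redirections"
# key is an assumption; the diff only shows the entries themselves.
REDIRECTS_JSON = """
{
  "redirections": [
    {
      "source_path_from_root": "/articles/ai-foundry/open-ai/how-to/deployment-types.md",
      "redirect_url": "/azure/ai-foundry/foundry-models/concepts/deployment-types",
      "redirect_document_id": true
    }
  ]
}
"""


def find_redirect(redirects: dict, source_path: str) -> Optional[str]:
    """Return the redirect URL for an exact source path match, else None."""
    for entry in redirects.get("redirections", []):
        if entry["source_path_from_root"] == source_path:
            return entry["redirect_url"]
    return None


redirects = json.loads(REDIRECTS_JSON)
print(find_redirect(
    redirects,
    "/articles/ai-foundry/open-ai/how-to/deployment-types.md",
))
```

The lookup in this sketch is an exact string comparison, so a path that differs by even one character returns `None`.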

articles/ai-foundry/agents/concepts/model-region-support.md

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ Agents are powered by a diverse set of Azure OpenAI models with different capabi
 - **Standard** is offered with a global deployment option, routing traffic globally to provide higher throughput.
 - **Provisioned** is also offered with a global deployment option, allowing customers to purchase and deploy provisioned throughput units across Azure global infrastructure.
 
-All deployments can perform the exact same inference operations, however the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types see [deployment types guide](../../openai/how-to/deployment-types.md).
+All deployments can perform the exact same inference operations, however the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types see [deployment types guide](../../foundry-models/concepts/deployment-types.md).
 
 ## Available models
 
@@ -130,4 +130,4 @@ Azure AI Foundry Agent Service supports the following Azure OpenAI models in the
 
 ## Next steps
 
-[Create a new Agent project](../quickstart.md)
+[Create a new Agent project](../quickstart.md)
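
To see where the old and new relative links in this file point, the following sketch resolves each link against the directory of the Markdown file that contains it. The `resolve_link` helper is hypothetical and uses only the standard-library `posixpath` module; it assumes links are resolved relative to the containing file, which is the usual Markdown convention.

```python
import posixpath


def resolve_link(source_file: str, link: str) -> str:
    """Resolve a relative Markdown link against the file that contains it."""
    path = link.split("#", 1)[0]  # ignore any #anchor for path resolution
    base = posixpath.dirname(source_file)
    return posixpath.normpath(posixpath.join(base, path))


source = "articles/ai-foundry/agents/concepts/model-region-support.md"

old_target = resolve_link(source, "../../openai/how-to/deployment-types.md")
new_target = resolve_link(source, "../../foundry-models/concepts/deployment-types.md")

print(old_target)  # articles/ai-foundry/openai/how-to/deployment-types.md
print(new_target)  # articles/ai-foundry/foundry-models/concepts/deployment-types.md
```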

articles/ai-foundry/foundry-models/concepts/deployment-types.md

Lines changed: 106 additions & 49 deletions
Large diffs are not rendered by default.

articles/ai-foundry/openai/audio-completions-quickstart.md

Lines changed: 1 addition & 1 deletion
@@ -60,5 +60,5 @@ If you want to clean up and remove an Azure OpenAI resource, you can delete the
 
 ## Related content
 
-* Learn more about Azure OpenAI [deployment types](./how-to/deployment-types.md).
+* Learn more about Azure OpenAI [deployment types](../foundry-models/concepts/deployment-types.md).
 * Learn more about Azure OpenAI [quotas and limits](quotas-limits.md).

articles/ai-foundry/openai/concepts/provisioned-migration.md

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ This article is intended for existing users of the provisioned throughput offeri
 |Self-service quota requests | Request quota increases without engaging the sales team – many can be autoapproved. |
 |Default provisioned-managed quota in many regions | Get started quickly without having to first request quota. |
 |Transparent information on real-time capacity availability + New deployment flow | Reduced negotiation around availability accelerates time-to-market. |
-| Data zone provisioned deployments | Allows you to leverage Azure's global infrastructure to dynamically route traffic to the data center within the Microsoft defined data zone with the best availability for each request. For more information, see the [deployment types](../how-to/deployment-types.md#data-zone-provisioned) article. |
+| Data zone provisioned deployments | Allows you to leverage Azure's global infrastructure to dynamically route traffic to the data center within the Microsoft defined data zone with the best availability for each request. For more information, see the [deployment types](../../foundry-models/concepts/deployment-types.md#data-zone-provisioned) article. |
 
 ### New hourly/reservation commercial model
 
@@ -45,7 +45,7 @@ This article is intended for existing users of the provisioned throughput offeri
 | Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota. |
 | Flexible choice of payment model for existing provisioned customers | Customers with commitments can stay on the commitment model until the end of life of the currently supported models, and can choose to migrate existing commitments to hourly/reservations via managed process. We recommend migrating to hourly/ reservations to take advantage of term discounts and to work with the latest models. |
 | Supports latest model generations | The latest models are available only on hourly/ reservations in provisioned offering. |
-| Differentiated pricing | Greater flexibility and control of pricing and performance. In December 2024, we introduced differentiated hourly pricing across [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned), and [regional provisioned](../how-to/deployment-types.md#regional-provisioned) deployment types with the option to purchase [Azure Reservations](#new-azure-reservations-for-global-and-data-zone-provisioned-deployments) to support additional discounts. For more information on the hourly price for each provisioned deployment type, see the [Pricing details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) page. |
+| Differentiated pricing | Greater flexibility and control of pricing and performance. In December 2024, we introduced differentiated hourly pricing across [global provisioned](../../foundry-models/concepts/deployment-types.md#global-provisioned), [data zone provisioned](../../foundry-models/concepts/deployment-types.md#data-zone-provisioned), and [regional provisioned](../../foundry-models/concepts/deployment-types.md#regional-provisioned) deployment types with the option to purchase [Azure Reservations](#new-azure-reservations-for-global-and-data-zone-provisioned-deployments) to support additional discounts. For more information on the hourly price for each provisioned deployment type, see the [Pricing details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) page. |
 
 ## Usability improvement details
 
articles/ai-foundry/openai/concepts/provisioned-throughput.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ Provisioned throughput provides:
 
 > [!TIP]
 > * You can take advantage of more cost savings when you buy [Microsoft Azure AI Foundry Provisioned Throughput reservations](/azure/cost-management-billing/reservations/azure-openai#buy-a-microsoft-azure-openai-service-reservation).
-> * Provisioned throughput is available as the following deployment types: [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned) and [regional provisioned](../how-to/deployment-types.md#regional-provisioned).
+> * Provisioned throughput is available as the following deployment types: [global provisioned](../../foundry-models/concepts/deployment-types.md#global-provisioned), [data zone provisioned](../../foundry-models/concepts/deployment-types.md#data-zone-provisioned) and [regional provisioned](../../foundry-models/concepts/deployment-types.md#regional-provisioned).
 
 
 <!--

articles/ai-foundry/openai/how-to/batch.md

Lines changed: 2 additions & 2 deletions
@@ -92,7 +92,7 @@ The following aren't currently supported:
 ### Batch deployment
 
 > [!NOTE]
-> In the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) the batch deployment types will appear as `Global-Batch` and `Data Zone Batch`. To learn more about Azure OpenAI deployment types, see our [deployment types guide](../how-to/deployment-types.md).
+> In the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) the batch deployment types will appear as `Global-Batch` and `Data Zone Batch`. To learn more about Azure OpenAI deployment types, see our [deployment types guide](../../foundry-models/concepts/deployment-types.md).
 
 :::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure AI Foundry portal with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::
 
@@ -246,5 +246,5 @@ When a job failure occurs, you'll find details about the failure in the `errors`
 
 ## See also
 
-* Learn more about Azure OpenAI [deployment types](./deployment-types.md)
+* Learn more about Azure OpenAI [deployment types](../../foundry-models/concepts/deployment-types.md)
 * Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md)
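
Both hunks in this file replace differently spelled links (`../how-to/deployment-types.md` and `./deployment-types.md`) with the same new relative path. One way such a rewrite could be automated is sketched below: resolve each `.md` link, and if it points at the old article, recompute the relative path to the new one while keeping any `#anchor`. The `retarget_links` helper, the regular expression, and the two target constants are assumptions for illustration; nothing in this commit indicates how the edits were actually made.

```python
import posixpath
import re

# Old and new article locations, inferred from the paths in this diff.
OLD_TARGET = "articles/ai-foundry/openai/how-to/deployment-types.md"
NEW_TARGET = "articles/ai-foundry/foundry-models/concepts/deployment-types.md"

# Matches the "](path.md#anchor)" part of an inline Markdown link.
LINK_RE = re.compile(r"\]\(([^)#\s]+\.md)(#[^)\s]*)?\)")


def retarget_links(source_file: str, text: str) -> str:
    """Rewrite relative links to OLD_TARGET so they point at NEW_TARGET."""
    base = posixpath.dirname(source_file)

    def swap(match: re.Match) -> str:
        path, anchor = match.group(1), match.group(2) or ""
        # Leave absolute URLs and site-rooted paths untouched.
        if path.startswith("/") or "://" in path:
            return match.group(0)
        resolved = posixpath.normpath(posixpath.join(base, path))
        if resolved != OLD_TARGET:
            return match.group(0)
        new_path = posixpath.relpath(NEW_TARGET, start=base)
        return f"]({new_path}{anchor})"

    return LINK_RE.sub(swap, text)


batch = "articles/ai-foundry/openai/how-to/batch.md"
print(retarget_links(batch, "[deployment types](./deployment-types.md)"))
print(retarget_links(batch, "[deployment types guide](../how-to/deployment-types.md)"))
print(retarget_links(batch, "[quotas and limits](../quotas-limits.md)"))
```

In this sketch the first two calls produce the same `../../foundry-models/concepts/deployment-types.md` link, and the third is returned unchanged because it resolves to a different file.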

articles/ai-foundry/openai/how-to/deployment-types.md

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ ms.custom:
 - build-2025
 ---
 
-# Deployment types for Azure AI Foundry Models
+# Understanding Deployment types for Azure AI Foundry Models
 
 Azure AI Foundry makes models available by using the model deployment concept in Azure AI Foundry Services (formerly known as Azure AI Services). Model deployments are also Azure resources and, when created, give access to a given model under certain configurations. Such a configuration includes the infrastructure required to process the requests.
 
@@ -33,7 +33,7 @@ For standard deployments, there are three deployment-type options to choose from
 
 ### Global deployments
 
-Global deployments use the global infrastructure of Azure to dynamically route customer traffic to the datacenter with the best availability for the customer's inference requests. This means that global offers the highest initial throughput limits and best model availability, but still provides our uptime SLA and low latency. For high-volume workloads above the specified usage tiers on Standard and Global Standard, you might experience increased latency variation. For customers that require the lower latency variance at large workload usage, we recommend using our provisioned deployment types.
+Global deployments use the global infrastructure of Azure to dynamically route customer traffic to the datacenter with the best availability for the customer's inference requests. This means that global offers the highest initial throughput limits and best model availability, but still provides our uptime [SLA](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services) and low latency. For high-volume workloads above the specified usage tiers on Standard and Global Standard, you might experience increased latency variation. For customers that require the lower latency variance at large workload usage, we recommend using our provisioned deployment types.
 
 Our global deployments are the first location for all new models and features. Depending on call volume, customers with large volume and low latency variance requirements should consider our provisioned deployment types.
 
articles/ai-foundry/openai/how-to/fine-tune-test.md

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@ After you've fine-tuned a model, you may want to test its quality via the Chat C
 A Developer Tier deployment allows you to deploy your new model without the hourly hosting fee incurred by Standard or Global deployments. The only charges incurred are per-token. Consult the [pricing page](https://aka.ms/aoaipricing) for the most up-to-date pricing.
 
 > [!IMPORTANT]
-> Developer Tier offers no availability SLA and no [data residency](https://aka.ms/data-residency) guarantees. If you require an SLA or data residency, choose an alternative [deployment type](./deployment-types.md) for testing your model.
+> Developer Tier offers no availability SLA and no [data residency](https://aka.ms/data-residency) guarantees. If you require an SLA or data residency, choose an alternative [deployment type](../../foundry-models/concepts/deployment-types.md) for testing your model.
 >
 > Developer Tier deployments have a fixed lifetime of **24 hours**. Learn more [below](#clean-up-your-deployment) about the deployment lifecycle.
 
@@ -213,4 +213,4 @@ curl -X DELETE "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resour
 
 - [Deploy for production](./fine-tuning-deploy.md)
 - Understand [Azure OpenAI Quotas & limits](./quota.md)
-- Read more about other [Azure OpenAI deployment types](./deployment-types.md)
+- Read more about other [Azure OpenAI deployment types](../../foundry-models/concepts/deployment-types.md)

articles/ai-foundry/openai/how-to/fine-tuning-deploy.md

Lines changed: 6 additions & 6 deletions
@@ -18,7 +18,7 @@ Once your model is fine-tuned, you can deploy the model and can use it in your o
 
 When you deploy the model, you make the model available for inferencing, and that incurs an hourly hosting charge. Fine-tuned models, however, can be stored in Azure AI Foundry at no cost until you're ready to use them.
 
-Azure OpenAI provides choices of deployment types for fine-tuned models on the hosting structure that fits different business and usage patterns: **Standard**, **Global Standard** (preview) and **Provisioned Throughput** (preview). Learn more about [deployment types for fine-tuned models](#deployment-types) and the [concepts of all deployment types](./deployment-types.md).
+Azure OpenAI provides choices of deployment types for fine-tuned models on the hosting structure that fits different business and usage patterns: **Standard**, **Global Standard** (preview) and **Provisioned Throughput** (preview). Learn more about [deployment types for fine-tuned models](#deployment-types) and the [concepts of all deployment types](../../foundry-models/concepts/deployment-types.md).
 
 ## Deploy your fine-tuned model
 
@@ -362,7 +362,7 @@ Azure OpenAI fine-tuning supports the following deployment types.
 
 ### Standard
 
-[Standard deployments](./deployment-types.md#standard) provide a pay-per-token billing model with data residency confined to the deployed region.
+[Standard deployments](../../foundry-models/concepts/deployment-types.md) provide a pay-per-token billing model with data residency confined to the deployed region.
 
 | Models | East US2 | North Central US | Sweden Central | Switzerland West |
 |--------------------|:--------:|:----------------:|:--------------:|:----------------:|
@@ -377,7 +377,7 @@ Azure OpenAI fine-tuning supports the following deployment types.
 
 ### Global Standard
 
-[Global standard](./deployment-types.md#global-standard) fine-tuned deployments offer [cost savings](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/), but custom model weights may temporarily be stored outside the geography of your Azure OpenAI resource.
+[Global standard](../../foundry-models/concepts/deployment-types.md) fine-tuned deployments offer [cost savings](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/), but custom model weights may temporarily be stored outside the geography of your Azure OpenAI resource.
 
 Global standard deployments are available from all Azure OpenAI regions for the following models:
 
@@ -392,7 +392,7 @@ Global standard deployments are available from all Azure OpenAI regions for the
 
 ### Developer Tier
 
-[Developer](./deployment-types.md#developer-for-fine-tuned-models) fine-tuned deployments offer a similar experience as [Global Standard](#global-standard) without an hourly hosting fee, but do not offer an availability SLA. Developer deployments are designed for model candidate evaluation and not for production use.
+[Developer](../../foundry-models/concepts/deployment-types.md) fine-tuned deployments offer a similar experience as [Global Standard](#global-standard) without an hourly hosting fee, but do not offer an availability SLA. Developer deployments are designed for model candidate evaluation and not for production use.
 
 Developer deployments are available from all Azure OpenAI regions for the following models:
 
@@ -409,7 +409,7 @@ Developer deployments are available from all Azure OpenAI regions for the follow
 | GPT-4o |||
 | GPT-4o-mini |||
 
-[Provisioned throughput](./deployment-types.md#regional-provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.
+[Provisioned throughput](../../foundry-models/concepts/deployment-types.md) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.
 
 ## Clean up your deployment
 
@@ -433,4 +433,4 @@ You can also delete a deployment in Azure AI Foundry portal, or use [Azure CLI](
 ## Next steps
 
 - [Azure OpenAI Quotas & limits](./quota.md)
-- [Azure OpenAI deployment types](./deployment-types.md)
+- [Azure OpenAI deployment types](../../foundry-models/concepts/deployment-types.md)