Merge pull request #279787 from mrbullwinkle/mrb_07_01_2024_global_deployments_ga

prmerger-automator[bot] · web-flow · commit 6c7a5b331702 · 2024-07-01T17:56:14.000Z
[Azure OpenAI] Global Standard deployment type GA
diff --git a/articles/ai-services/openai/concepts/models.md b/articles/ai-services/openai/concepts/models.md
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 06/25/2024
+ms.date: 07/01/2024
 ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
@@ -34,7 +34,7 @@ GPT-4o is the latest model from OpenAI. GPT-4o integrates text and images in a s
 
 GPT-4o is available for **standard** and **global-standard** model deployment.
 
-You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability-preview) region where the model is available.
+You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability) region where the model is available.
 
 When your resource is created, you can [deploy](../how-to/create-resource.md#deploy-a-model) the GPT-4o model. If you are performing a programmatic deployment, the **model** name is `gpt-4o`, and the **version** is `2024-05-13`.
 
@@ -164,7 +164,7 @@ You need to speak with your Microsoft sales/account team to acquire provisioned
 
 For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
 
-### Global standard model availability (preview)
+### Global standard model availability
 
 **Supported models:**
 
diff --git a/articles/ai-services/openai/how-to/deployment-types.md b/articles/ai-services/openai/how-to/deployment-types.md
@@ -7,7 +7,7 @@ author: mrbullwinkle
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
-ms.date: 05/19/2024
+ms.date: 07/01/2024
 ms.author: mbullwin
 ---
 
@@ -28,7 +28,7 @@ Our global deployments will be the first location for all new models and feature
 
 Azure OpenAI offers three types of deployments. These provide a varied level of capabilities that provide trade-offs on: throughput, SLAs, and price. Below is a summary of the options followed by a deeper description of each.
 
-| **Offering** | **Global-Standard** <sup>**1**</sup> | **Standard** | **Provisioned**  |
+| **Offering** | **Global-Standard** | **Standard** | **Provisioned**  |
 |---|:---|:---|:---|
 | **Best suited for**      | Applications that don’t require data residency. Recommended starting place for customers. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.|
 | **How it works**         | Traffic may be routed anywhere in the world | | |
@@ -40,8 +40,6 @@ Azure OpenAI offers three types of deployments. These provide a varied level of
 | **Sku Name in code**     |    `GlobalStandard`               | `Standard`   |      `ProvisionedManaged`       |
 | **Billing model**        | Pay-per-token | Pay-per-token | Monthly Commitments |
 
-<sup>**1**</sup> Global-Standard deployment type is currently in preview.
-
 ## Provisioned
 
 Provisioned deployments allow you to specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units (PTU) which is a normalized way of representing the throughput for your deployment. Each model-version pair requires different amounts of PTU to deploy and provide different amounts of throughput per PTU. Learn more from our [Provisioned throughput concepts article](../concepts/provisioned-throughput.md).
@@ -52,7 +50,7 @@ Standard deployments provide a pay-per-call billing model on the chosen model. P
 
 Standard deployments are optimized for low to medium volume workloads with high burstiness. Customers with high consistent volume may experience greater latency variability.
 
-## Global standard (preview)
+## Global standard
 
 Global deployments are available in the same Azure OpenAI resources as non-global offers but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request.  Global standard will provide the highest default quota for new models and eliminates the need to load balance across multiple resources.  
 
diff --git a/articles/ai-services/openai/quotas-limits.md b/articles/ai-services/openai/quotas-limits.md
@@ -10,7 +10,7 @@ ms.custom:
   - ignite-2023
   - references_regions
 ms.topic: conceptual
-ms.date: 06/21/2024
+ms.date: 07/01/2024
 ms.author: mbullwin
 ---
 
@@ -60,9 +60,6 @@ The following sections provide you with a quick guide to the default quotas and
 
 ### gpt-4o global standard
 
-> [!NOTE]
-> The [global standard model deployment type](./how-to/deployment-types.md#deployment-types) is currently in public preview.
-
 |Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|:---:|:---:|
 |Enterprise agreement | 10 M | 60 K |