Commit 6c7a5b3

Merge pull request #279787 from mrbullwinkle/mrb_07_01_2024_global_deployments_ga
[Azure OpenAI] Global Standard deployment type GA
2 parents 3e4901c + 1a7c650 commit 6c7a5b3

3 files changed: +7 -12 lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 3 additions & 3 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 06/25/2024
+ms.date: 07/01/2024
 ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
@@ -34,7 +34,7 @@ GPT-4o is the latest model from OpenAI. GPT-4o integrates text and images in a s

 GPT-4o is available for **standard** and **global-standard** model deployment.

-You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability-preview) region where the model is available.
+You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability) region where the model is available.

 When your resource is created, you can [deploy](../how-to/create-resource.md#deploy-a-model) the GPT-4o model. If you are performing a programmatic deployment, the **model** name is `gpt-4o`, and the **version** is `2024-05-13`.

@@ -164,7 +164,7 @@ You need to speak with your Microsoft sales/account team to acquire provisioned

 For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).

-### Global standard model availability (preview)
+### Global standard model availability

 **Supported models:**

articles/ai-services/openai/how-to/deployment-types.md

Lines changed: 3 additions & 5 deletions
@@ -7,7 +7,7 @@ author: mrbullwinkle
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
-ms.date: 05/19/2024
+ms.date: 07/01/2024
 ms.author: mbullwin
 ---

@@ -28,7 +28,7 @@ Our global deployments will be the first location for all new models and feature

 Azure OpenAI offers three types of deployments. These provide a varied level of capabilities that provide trade-offs on: throughput, SLAs, and price. Below is a summary of the options followed by a deeper description of each.

-| **Offering** | **Global-Standard** <sup>**1**</sup> | **Standard** | **Provisioned** |
+| **Offering** | **Global-Standard** | **Standard** | **Provisioned** |
 |---|:---|:---|:---|
 | **Best suited for** | Applications that don’t require data residency. Recommended starting place for customers. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.|
 | **How it works** | Traffic may be routed anywhere in the world | | |
@@ -40,8 +40,6 @@ Azure OpenAI offers three types of deployments. These provide a varied level of
 | **Sku Name in code** | `GlobalStandard` | `Standard` | `ProvisionedManaged` |
 | **Billing model** | Pay-per-token | Pay-per-token | Monthly Commitments |

-<sup>**1**</sup> Global-Standard deployment type is currently in preview.
-
 ## Provisioned

 Provisioned deployments allow you to specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units (PTU) which is a normalized way of representing the throughput for your deployment. Each model-version pair requires different amounts of PTU to deploy and provide different amounts of throughput per PTU. Learn more from our [Provisioned throughput concepts article](../concepts/provisioned-throughput.md).
@@ -52,7 +50,7 @@ Standard deployments provide a pay-per-call billing model on the chosen model. P

 Standard deployments are optimized for low to medium volume workloads with high burstiness. Customers with high consistent volume may experience greater latency variability.

-## Global standard (preview)
+## Global standard

 Global deployments are available in the same Azure OpenAI resources as non-global offers but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard will provide the highest default quota for new models and eliminates the need to load balance across multiple resources.
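
For readers doing a programmatic deployment, the SKU names in the table (`GlobalStandard`, `Standard`, `ProvisionedManaged`) combine with the model name `gpt-4o` and version `2024-05-13` mentioned in models.md. The following is a minimal sketch of assembling such a request body. It assumes the Azure Resource Manager deployment layout (`sku` plus `properties.model`); the helper name `build_deployment_body` and the default `capacity` value are hypothetical and not part of this commit, so the shape should be checked against the current REST reference before use.

```python
import json

# SKU names as listed in the "Sku Name in code" row of the deployment-types table.
VALID_SKUS = {"GlobalStandard", "Standard", "ProvisionedManaged"}


def build_deployment_body(sku_name: str, model: str, version: str, capacity: int = 1) -> dict:
    """Build a request body for creating a model deployment.

    The layout (sku + properties.model) is an assumption about the
    Azure Resource Manager API; it is not defined by this commit.
    """
    if sku_name not in VALID_SKUS:
        raise ValueError(f"unknown SKU name: {sku_name!r}")
    return {
        "sku": {"name": sku_name, "capacity": capacity},  # capacity value is illustrative
        "properties": {
            "model": {"format": "OpenAI", "name": model, "version": version}
        },
    }


# Model name and version as given in models.md for GPT-4o.
body = build_deployment_body("GlobalStandard", "gpt-4o", "2024-05-13")
print(json.dumps(body, indent=2))
```

Validating the SKU name up front keeps a typo such as `Global-Standard` (the display name) from reaching the service, where it would fail later and less clearly.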

articles/ai-services/openai/quotas-limits.md

Lines changed: 1 addition & 4 deletions
@@ -10,7 +10,7 @@ ms.custom:
 - ignite-2023
 - references_regions
 ms.topic: conceptual
-ms.date: 06/21/2024
+ms.date: 07/01/2024
 ms.author: mbullwin
 ---

@@ -60,9 +60,6 @@ The following sections provide you with a quick guide to the default quotas and

 ### gpt-4o global standard

-> [!NOTE]
-> The [global standard model deployment type](./how-to/deployment-types.md#deployment-types) is currently in public preview.
-
 |Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|:---:|:---:|
 |Enterprise agreement | 10 M | 60 K |
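
The one table row visible in this hunk pairs a 10 M TPM quota with 60 K requests per minute, which works out to 6 requests per minute for every 1,000 TPM. Here is a small sketch of that arithmetic. The helper name `rpm_for_tpm` is hypothetical, and the fixed ratio is inferred from this single row rather than stated by the commit, so other tiers may not follow it.

```python
def rpm_for_tpm(tpm: int, rpm_per_1000_tpm: int = 6) -> int:
    """Requests per minute implied by a TPM quota at a fixed ratio.

    The default ratio of 6 RPM per 1,000 TPM is an assumption inferred
    from the Enterprise agreement row (10 M TPM, 60 K RPM).
    """
    return tpm // 1000 * rpm_per_1000_tpm


enterprise_tpm = 10_000_000  # 10 M tokens per minute, from the table row
print(rpm_for_tpm(enterprise_tpm))  # 60000, matching the 60 K in the table
```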
