@@ -34,7 +34,7 @@ GPT-4o is the latest model from OpenAI. GPT-4o integrates text and images in a s
GPT-4o is available for **standard** and **global-standard** model deployment.
-You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability-preview) region where the model is available.
+You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability) region where the model is available.
When your resource is created, you can [deploy](../how-to/create-resource.md#deploy-a-model) the GPT-4o model. If you are performing a programmatic deployment, the **model** name is `gpt-4o`, and the **version** is `2024-05-13`.
@@ -164,7 +164,7 @@ You need to speak with your Microsoft sales/account team to acquire provisioned
For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
articles/ai-services/openai/how-to/deployment-types.md
Lines changed: 3 additions & 5 deletions
@@ -7,7 +7,7 @@ author: mrbullwinkle
manager: nitinme
ms.service: azure-ai-openai
ms.topic: how-to
-ms.date: 05/19/2024
+ms.date: 07/01/2024
ms.author: mbullwin
---
@@ -28,7 +28,7 @@ Our global deployments will be the first location for all new models and feature
Azure OpenAI offers three types of deployments. These provide a varied level of capabilities that provide trade-offs on: throughput, SLAs, and price. Below is a summary of the options followed by a deeper description of each.
|**Best suited for**| Applications that don’t require data residency. Recommended starting place for customers. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.|
|**How it works**| Traffic may be routed anywhere in the world |||
@@ -40,8 +40,6 @@ Azure OpenAI offers three types of deployments. These provide a varied level of
|**Sku Name in code**|`GlobalStandard`|`Standard`|`ProvisionedManaged`|
<sup>**1**</sup> Global-Standard deployment type is currently in preview.
-
## Provisioned
Provisioned deployments allow you to specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units (PTU) which is a normalized way of representing the throughput for your deployment. Each model-version pair requires different amounts of PTU to deploy and provide different amounts of throughput per PTU. Learn more from our [Provisioned throughput concepts article](../concepts/provisioned-throughput.md).
@@ -52,7 +50,7 @@ Standard deployments provide a pay-per-call billing model on the chosen model. P
Standard deployments are optimized for low to medium volume workloads with high burstiness. Customers with high consistent volume may experience greater latency variability.
-## Global standard (preview)
+## Global standard
Global deployments are available in the same Azure OpenAI resources as non-global offers but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard will provide the highest default quota for new models and eliminates the need to load balance across multiple resources.
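
Review note: the changed docs give the programmatic identifiers (model name `gpt-4o`, version `2024-05-13`, and the SKU names `GlobalStandard`, `Standard`, `ProvisionedManaged`). A minimal sketch of how those values might be combined into a deployment request body follows. The body layout (`sku` plus `properties.model` with `format: "OpenAI"`) is an assumption based on the Azure Resource Manager deployment resource, not something this diff states. The helper name `build_deployment_body`, the `SKU_NAMES` mapping keys, and the default `capacity` value are hypothetical. No network call is made.

```python
import json

# SKU names copied from the "Sku Name in code" row of the deployment-types table.
# The dictionary keys are hypothetical labels chosen for this sketch.
SKU_NAMES = {
    "global-standard": "GlobalStandard",
    "standard": "Standard",
    "provisioned": "ProvisionedManaged",
}


def build_deployment_body(deployment_type: str, capacity: int = 1) -> dict:
    """Build a request body for a gpt-4o deployment (hypothetical helper).

    The body shape is an assumption; check the management API reference
    before sending it anywhere.
    """
    try:
        sku_name = SKU_NAMES[deployment_type]
    except KeyError:
        raise ValueError(f"unknown deployment type: {deployment_type!r}") from None
    return {
        # capacity is a placeholder value, not a recommended quota
        "sku": {"name": sku_name, "capacity": capacity},
        "properties": {
            "model": {
                "format": "OpenAI",       # assumed field, not stated in the diff
                "name": "gpt-4o",         # model name given in the docs
                "version": "2024-05-13",  # model version given in the docs
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_deployment_body("global-standard"), indent=2))
```

Keeping the SKU lookup in one mapping means a typo in the deployment type fails early with a `ValueError` instead of producing a body with an unrecognized SKU name.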