Commit 9aa0433

Merge pull request #281763 from mrbullwinkle/mrb_07_24_2024_Global
[Azure OpenAI] Global Standard update
2 parents 1ae08e3 + aa900fb commit 9aa0433

3 files changed: +28 -25 lines changed

articles/ai-services/openai/how-to/deployment-types.md

Lines changed: 4 additions & 1 deletion
@@ -35,7 +35,7 @@ Azure OpenAI offers three types of deployments. These provide a varied level of
 | **Getting started** | [Model deployment](./create-resource.md) | [Model deployment](./create-resource.md) | [Provisioned onboarding](./provisioned-throughput-onboarding.md) |
 | **Cost** | [Global deployment pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | [Regional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | May experience cost savings for consistent usage |
 | **What you get** | Easy access to all new models with highest default pay-per-call limits.<br><br> Customers with high volume usage may see higher latency variability | Easy access with [SLA on availability](https://azure.microsoft.com/support/legal/sla/). Optimized for low to medium volume workloads with high burstiness. <br><br>Customers with high consistent volume may experience greater latency variability. | Regional access with very high & predictable throughput. Determine throughput per PTU using the provided [capacity calculator](./provisioned-throughput-onboarding.md#estimate-provisioned-throughput-and-cost) |
-| **What you don’t get** | ❌Data residency guarantees | ❌High volume w/consistent low latency | ❌Pay-per-call flexibility |
+| **What you don’t get** |❌Data processing guarantee<br> <br> Data might be processed outside of the resource's Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/) | ❌High volume w/consistent low latency | ❌Pay-per-call flexibility |
 | **Per-call Latency** | Optimized for real-time calling & low to medium volume usage. Customers with high volume usage may see higher latency variability. Threshold set per model | Optimized for real-time calling & low to medium volume usage. Customers with high volume usage may see higher latency variability. Threshold set per model | Optimized for real-time. |
 | **Sku Name in code** | `GlobalStandard` | `Standard` | `ProvisionedManaged` |
 | **Billing model** | Pay-per-token | Pay-per-token | Monthly Commitments |
@@ -52,6 +52,9 @@ Standard deployments are optimized for low to medium volume workloads with high
 
 ## Global standard
 
+> [!IMPORTANT]
+> Data might be processed outside of the resource's Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
+
 Global deployments are available in the same Azure OpenAI resources as non-global offers but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard will provide the highest default quota for new models and eliminates the need to load balance across multiple resources.
 
 The deployment type is optimized for low to medium volume workloads with high burstiness. Customers with high consistent volume may experience greater latency variability. The threshold is set per model. See the [quotas page to learn more](./quota.md).
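
For readers who reference these deployment types from scripts, the SKU names and billing models in the table can be captured as a small lookup. This is only a sketch under stated assumptions: the SKU strings, billing models, and the data processing caveat are transcribed from the table rows in this diff, while the dictionary keys, the `DeploymentType` class, and the `candidate_skus` helper are hypothetical names chosen for illustration and are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DeploymentType:
    """One column of the deployment-types table (hypothetical container)."""
    sku_name: str        # from the "Sku Name in code" row
    billing_model: str   # from the "Billing model" row
    # True where the "What you don't get" row lists a missing
    # data processing guarantee for this deployment type.
    lacks_data_processing_guarantee: bool


# Values transcribed from the table; the keys are illustrative only.
DEPLOYMENT_TYPES = {
    "global_standard": DeploymentType("GlobalStandard", "Pay-per-token", True),
    "standard": DeploymentType("Standard", "Pay-per-token", False),
    "provisioned": DeploymentType("ProvisionedManaged", "Monthly Commitments", False),
}


def candidate_skus(require_processing_guarantee: bool) -> List[str]:
    """Return SKU names, optionally dropping types that list the caveat."""
    return [
        d.sku_name
        for d in DEPLOYMENT_TYPES.values()
        if not (require_processing_guarantee and d.lacks_data_processing_guarantee)
    ]
```

Calling `candidate_skus(True)` filters out the type whose table entry lists the missing data processing guarantee; calling it with `False` returns all three SKU names in table order.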
