Commit 45e257e

update
1 parent c055d6c commit 45e257e

1 file changed: +8 -8 lines changed

articles/ai-services/openai/how-to/deployment-types.md

Lines changed: 8 additions & 8 deletions
@@ -39,7 +39,7 @@ For any [deployment type](/azure/ai-services/openai/how-to/deployment-types) lab
 > [!IMPORTANT]
 > Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).

-**SKU name in code:** GlobalStandard
+**SKU name in code:** `GlobalStandard`

 Global deployments are available in the same Azure OpenAI resources as non-global deployment types but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard provides the highest default quota and eliminates the need to load balance across multiple resources.

@@ -50,7 +50,7 @@ Customers with high consistent volume may experience greater latency variability
 > [!IMPORTANT]
 > Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).

-**SKU name in code:** GlobalProvisionedManaged
+**SKU name in code:** `GlobalProvisionedManaged`

 Global deployments are available in the same Azure OpenAI resources as non-global deployment types but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global provisioned deployments provide reserved model processing capacity for high and predictable throughput using Azure global infrastructure.

@@ -61,7 +61,7 @@ Global deployments are available in the same Azure OpenAI resources as non-globa

 [Global batch](./batch.md) is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with 24-hour target turnaround, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.

-**SKU name in code:** GlobalBatch
+**SKU name in code:** `GlobalBatch`

 Key use cases include:

@@ -84,7 +84,7 @@ Key use cases include:
 > [!IMPORTANT]
 > Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location within the Microsoft specified data zone. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).

-**SKU name in code:** DataZoneStandard
+**SKU name in code:** `DataZoneStandard`

 Data zone standard deployments are available in the same Azure OpenAI resource as all other Azure OpenAI deployment types but allow you to leverage Azure global infrastructure to dynamically route traffic to the data center within the Microsoft defined data zone with the best availability for each request. Data zone standard provides higher default quotas than our Azure geography-based deployment types.

@@ -95,7 +95,7 @@ Customers with high consistent volume may experience greater latency variability
 > [!IMPORTANT]
 > Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location within the Microsoft specified data zone.[Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).

-**SKU name in code:** DataZoneProvisionedManaged
+**SKU name in code:** `DataZoneProvisionedManaged`

 Data zone provisioned deployments are available in the same Azure OpenAI resource as all other Azure OpenAI deployment types but allow you to leverage Azure global infrastructure to dynamically route traffic to the data center within the Microsoft specified data zone with the best availability for each request. Data zone provisioned deployments provide reserved model processing capacity for high and predictable throughput using Azure infrastructure within the Microsoft specified data zone.

@@ -104,21 +104,21 @@ Data zone provisioned deployments are available in the same Azure OpenAI resourc
 > [!IMPORTANT]
 > Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location within the Microsoft specified data zone. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).

-**SKU name in code:** DataZoneBatch
+**SKU name in code:** `DataZoneBatch`

 Data zone batch deployments provide all the same functionality as [global batch deployments](./batch.md) while allowing you to leverage Azure global infrastructure to dynamically route traffic to only data centers within the Microsoft defined data zone with the best availability for each request.

 ## Standard

-**SKU name in code:** Standard
+**SKU name in code:** `Standard`

 Standard deployments provide a pay-per-call billing model on the chosen model. Provides the fastest way to get started as you only pay for what you consume. Models available in each region as well as throughput may be limited.

 Standard deployments are optimized for low to medium volume workloads with high burstiness. Customers with high consistent volume may experience greater latency variability.

 ## Provisioned

-**SKU name in code:** ProvisionedManaged
+**SKU name in code:** `ProvisionedManaged`

 Provisioned deployments allow you to specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units (PTU) which is a normalized way of representing the throughput for your deployment. Each model-version pair requires different amounts of PTU to deploy and provide different amounts of throughput per PTU. Learn more from our [Provisioned throughput concepts article](../concepts/provisioned-throughput.md).
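
Note for readers of this diff: the change only wraps each SKU name in code formatting, but it may help to see where such a string would typically be used. The following is a minimal sketch, not the documented method: it assumes the SKU name goes in the `sku.name` field of a deployment request body shaped like an Azure Resource Manager deployment resource. The helper name `build_deployment_body`, the body layout, and the placeholder model name and version are illustrative assumptions; only the eight SKU strings come from the diff.

```python
import json

# SKU names exactly as they appear in the diff above.
DEPLOYMENT_SKUS = {
    "GlobalStandard",
    "GlobalProvisionedManaged",
    "GlobalBatch",
    "DataZoneStandard",
    "DataZoneProvisionedManaged",
    "DataZoneBatch",
    "Standard",
    "ProvisionedManaged",
}


def build_deployment_body(sku_name, capacity, model_name, model_version):
    """Build a hypothetical deployment request body for the given SKU.

    The body layout is an assumption modeled on an ARM-style deployment
    resource; check it against the current API reference before use.
    """
    # SKU names are treated as case-sensitive here, matching the diff's spelling.
    if sku_name not in DEPLOYMENT_SKUS:
        raise ValueError(f"Unknown SKU name: {sku_name!r}")
    return {
        "sku": {"name": sku_name, "capacity": capacity},
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": model_name,        # placeholder, not from the diff
                "version": model_version,  # placeholder, not from the diff
            }
        },
    }


if __name__ == "__main__":
    body = build_deployment_body(
        "GlobalStandard", 1, "example-model", "example-version"
    )
    print(json.dumps(body, indent=2))
```

Validating against a fixed set catches misspellings such as `globalstandard` before a request is sent; the set would need updating if the service adds deployment types.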
