articles/ai-foundry/openai/how-to/deployment-types.md
Lines changed: 18 additions & 18 deletions
@@ -17,7 +17,7 @@ Azure AI Foundry makes models available by using the model deployment concept in
Azure AI Foundry models provide customers with hosting structure choices that fit their business and usage patterns. Those options are translated to different deployments types (or SKUs) that are available at model deployment time in the Azure AI Foundry resource.
-The service offers two main types of deployments: *standard* and *provisioned*. For a given deployment type, customers can align their workloads with their dataprocessing requirements. They can choose an Azure geography (`Standard` or `Provisioned-Managed`), a Microsoft-specified data zone (`DataZone-Standard` or `DataZone Provisioned-Managed`), or a global (`Global-Standard` or `Global Provisioned-Managed`) processing option.
+The service offers two main types of deployments: *standard* and *provisioned*. For a given deployment type, customers can align their workloads with their data-processing requirements. They can choose an Azure geography (`Standard` or `Provisioned-Managed`), a Microsoft-specified data zone (`DataZone-Standard` or `DataZone Provisioned-Managed`), or a global (`Global-Standard` or `Global Provisioned-Managed`) processing option.
For fine-tuned models, an additional `Developer` deployment type provides a cost-efficient means of custom model evaluation, but without data residency.
@@ -40,7 +40,7 @@ Our global deployments are the first location for all new models and features. D
For any [deployment type](/azure/ai-foundry/openai/how-to/deployment-types) labeled **Global**, prompts and responses might be processed in any geography where the relevant Azure AI Foundry model is deployed. Learn more about [region availability of models](/azure/ai-foundry/openai/concepts/models#model-summary-table-and-region-availability).
-For any deployment type labeled as **DataZone**, prompts and responses might be processed in any geography within the specified data zone, as defined by Microsoft. If you create a **DataZone** deployment in an Azure AI Foundry resource located in the United States, prompts and responses might be processed anywhere within the United States. If you create a **DataZone** deployment in an Azure AI Foundry resource located in a European Union Member Nation, prompts and responses might be processed in that or any other European Union Member Nation.
+For any deployment type labeled as **DataZone**, prompts and responses might be processed in any geography within the specified data zone, as defined by Microsoft. If you create a **DataZone** deployment in an Azure AI Foundry resource located in the United States, prompts and responses might be processed anywhere within the United States. If you create a **DataZone** deployment in an Azure AI Foundry resource located in a European Union member nation, prompts and responses might be processed in that or any other European Union member nation.
For both **Global** and **DataZone** deployment types, any data stored at rest, such as uploaded data, is stored in the customer-designated geography. Only the location of processing is affected when a customer uses a **Global** or **DataZone** deployment type in an Azure AI Foundry resource; Azure data processing and compliance commitments remain applicable.
@@ -49,32 +49,32 @@ For both **Global** and **DataZone** deployment types, any data stored at rest,
## Global Standard
+- SKU name in code: `GlobalStandard`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-- SKU name in code: `GlobalStandard`
-
Global deployments are available in the same Azure AI Foundry resources as non-global deployment types. However, they allow you to use the global infrastructure of Azure to dynamically route traffic to the data center with the best availability for each request. Global Standard provides the highest default quota and eliminates the need to load balance across multiple resources.
Customers with high consistent volume might experience greater latency variability. The threshold is set per model. To learn more, see the [Quotas page](./quota.md). For applications that require lower latency variance at large workload usage, we recommend purchasing provisioned throughput.
## Global Provisioned
+- SKU name in code: `GlobalProvisionedManaged`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-- SKU name in code: `GlobalProvisionedManaged`
-
-Global deployments are available in the same Azure AI Foundry resources as non-global deployment types. However, they allow you to use the global infrastructure of Azure to dynamically route traffic to the data center with the best availability for each request. Global-Provisioned deployments provide reserved model processing capacity for high and predictable throughput by using Azure global infrastructure.
+Global deployments are available in the same Azure AI Foundry resources as non-global deployment types. However, they allow you to use the global infrastructure of Azure to dynamically route traffic to the data center with the best availability for each request. Global Provisioned deployments provide reserved model processing capacity for high and predictable throughput by using Azure global infrastructure.
## Global Batch
+- SKU name in code: `GlobalBatch`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-[Global Batch](./batch.md) is designed to handle large-scale and high-volume processing tasks efficiently. You can process asynchronous groups of requests with separate quota and a 24-hour target turnaround, at [50% less cost than Global Standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than sending one request at a time, you send a large number of requests in a single file. Global Batch requests have a separate enqueued token quota, which avoids any disruption of your online workloads.
-
-- SKU name in code: `GlobalBatch`
+[Global Batch](./batch.md) is designed to efficiently handle large-scale and high-volume processing tasks. You can process asynchronous groups of requests with separate quota and a 24-hour target turnaround, at [50% less cost than Global Standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than sending one request at a time, you send a large number of requests in a single file. Global Batch requests have a separate enqueued token quota, which avoids any disruption of your online workloads.
Key use cases include:
@@ -88,31 +88,31 @@ Key use cases include:
## Data Zone Standard
+- SKU name in code: `DataZoneStandard`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location within the Microsoft-specified data zone. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-- SKU name in code: `DataZoneStandard`
-
Data Zone Standard deployments are available in the same Azure AI Foundry resource as all other Azure AI Foundry deployment types. However, they allow you to use the global infrastructure of Azure to dynamically route traffic to the data center within the Microsoft-defined data zone with the best availability for each request. Data Zone Standard provides higher default quotas than our Azure geography-based deployment types.
Customers with high consistent volume might experience greater latency variability. The threshold is set per model. To learn more, see the [quotas and limits page](/azure/ai-foundry/openai/quotas-limits#usage-tiers). For workloads that require low latency variance at large volume, we recommend using the provisioned deployment offerings.
## Data Zone Provisioned
+- SKU name in code: `DataZoneProvisionedManaged`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location within the Microsoft-specified data zone. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-SKU name in code: `DataZoneProvisionedManaged`
-
Data Zone Provisioned deployments are available in the same Azure AI Foundry resource as all other Azure AI Foundry deployment types. However, they allow you to use the global infrastructure of Azure to dynamically route traffic to the data center within the Microsoft-specified data zone with the best availability for each request. Data Zone Provisioned deployments provide reserved model processing capacity for high and predictable throughput by using Azure infrastructure within the Microsoft-specified data zone.
## Data Zone Batch
+- SKU name in code: `DataZoneBatch`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location within the Microsoft-specified data zone. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-- SKU name in code: `DataZoneBatch`
-
Data Zone Batch deployments provide all the same functionality as [Global Batch deployments](./batch.md). However, they allow you to use the global infrastructure of Azure to dynamically route traffic to only data centers within the Microsoft-defined data zone with the best availability for each request.
## Standard
@@ -157,11 +157,11 @@ You can use the following policy to disable access to any Azure AI Foundry deplo
## Developer (for fine-tuned models)
+- SKU name in code: `DeveloperTier`
+
> [!IMPORTANT]
> Data stored at rest remains in the designated Azure geography. However, data might be processed for inferencing in any Azure AI Foundry location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
-- SKU name in code: `DeveloperTier`
-
Fine-tuned models support a `Developer` deployment designed to support custom model evaluation. It doesn't offer data residency guarantees or an SLA. To learn more about using the `Developer` deployment type, see the [fine-tuning guide](./fine-tune-test.md).
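
For reference, the SKU names touched by this change can be collected into a small lookup table. The following Python sketch is illustrative only and is not part of the changed file: the `DeploymentSku` class, the `processing_scope` labels (`"global"`, `"data_zone"`), and the helper functions are hypothetical names chosen for this example. Only the display names, the SKU strings, and the processing behavior described in each `[!IMPORTANT]` note come from the article. The geography-based `Standard` type is left out because its SKU name doesn't appear in this diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentSku:
    """One deployment type as described in the article (hypothetical helper type)."""
    display_name: str      # section heading in the article
    sku_name: str          # "SKU name in code" value from the article
    processing_scope: str  # hypothetical label: "global" or "data_zone"


# SKU strings are copied from the article; the scope labels are this sketch's own.
SKUS = [
    DeploymentSku("Global Standard", "GlobalStandard", "global"),
    DeploymentSku("Global Provisioned", "GlobalProvisionedManaged", "global"),
    DeploymentSku("Global Batch", "GlobalBatch", "global"),
    DeploymentSku("Data Zone Standard", "DataZoneStandard", "data_zone"),
    DeploymentSku("Data Zone Provisioned", "DataZoneProvisionedManaged", "data_zone"),
    DeploymentSku("Data Zone Batch", "DataZoneBatch", "data_zone"),
    # The Developer note says data might be processed in any location.
    DeploymentSku("Developer", "DeveloperTier", "global"),
]


def lookup(sku_name: str) -> DeploymentSku:
    """Return the entry whose SKU string matches exactly, or raise KeyError."""
    for sku in SKUS:
        if sku.sku_name == sku_name:
            return sku
    raise KeyError(f"Unknown SKU name: {sku_name!r}")


def skus_for_scope(scope: str) -> list[str]:
    """List SKU strings whose inferencing may occur within the given scope."""
    return [sku.sku_name for sku in SKUS if sku.processing_scope == scope]


if __name__ == "__main__":
    print(lookup("GlobalBatch").display_name)
    print(skus_for_scope("data_zone"))
```

A table like this can help when validating a requested SKU string before creating a deployment, because the article's display names (for example, "Global Provisioned") differ from the strings used in code (`GlobalProvisionedManaged`).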