Skip to content

Commit 36ba331

Browse files
committed
update
1 parent 2aae86d commit 36ba331

File tree

4 files changed

+9
-9
lines changed

4 files changed

+9
-9
lines changed

articles/ai-services/openai/how-to/batch.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ zone_pivot_groups: openai-fine-tuning-batch
1515

1616
# Getting started with Azure OpenAI global batch deployments (preview)
1717

18-
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, a 24-hour turnaround time, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
18+
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with 24-hour target turnaround, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
1919

2020
Key use cases include:
2121

@@ -36,7 +36,7 @@ Key use cases include:
3636
> [!IMPORTANT]
3737
> We aim to process batch requests within 24 hours; we do not expire the jobs that take longer. You can [cancel](#cancel-batch) the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You will be charged for any completed work.
3838
>
39-
> Data may be processed outside of the resource’s Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/). 
39+
> Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/). 
4040
4141
## Global batch support
4242

@@ -80,7 +80,7 @@ In the Studio UI the deployment type will appear as `Global-Batch`.
8080
:::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure OpenAI Studio with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::
8181

8282
> [!TIP]
83-
> Each line of your input file for batch processing requires the unique **deployment name** that you chose during model deployment to be present. This value wil be assigned to the `model` attribute. This is different from OpenAI where the concept of model deployments does not exist.
83+
> Each line of your input file for batch processing has a model attribute that requires a global batch deployment name. For a given input file, all names must be the same deployment name. This is different from OpenAI where the concept of model deployments does not exist.
8484
8585
::: zone pivot="programming-language-ai-studio"
8686

articles/ai-services/openai/how-to/deployment-types.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ Azure OpenAI offers three types of deployments. These provide a varied level of
3535
| **Getting started** | [Global-Batch](./batch.md) | [Model deployment](./create-resource.md) | [Model deployment](./create-resource.md) | [Provisioned onboarding](./provisioned-throughput-onboarding.md) |
3636
| **Cost** | [Least expensive option](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) <br> 50% less cost compared to Global Standard prices. Access to all new models with larger quota allocations. | [Global deployment pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | [Regional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | May experience cost savings for consistent usage |
3737
| **What you get** |[Significant discount compared to Global Standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | Easy access to all new models with highest default pay-per-call limits.<br><br> Customers with high volume usage may see higher latency variability | Easy access with [SLA on availability](https://azure.microsoft.com/support/legal/sla/). Optimized for low to medium volume workloads with high burstiness. <br><br>Customers with high consistent volume may experience greater latency variability. | Regional access with very high & predictable throughput. Determine throughput per PTU using the provided [capacity calculator](./provisioned-throughput-onboarding.md#estimate-provisioned-throughput-and-cost) |
38-
| **What you don’t get** |❌Real-time call performance |❌Data processing guarantee<br> <br> Data might be processed outside of the resource's Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/) | ❌High volume w/consistent low latency | ❌Pay-per-call flexibility |
38+
| **What you don’t get** |❌Real-time call performance <br><br>❌Data processing guarantee<br> <br> Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/) |❌Data processing guarantee<br> <br> Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/) | ❌High volume w/consistent low latency | ❌Pay-per-call flexibility |
3939
| **Per-call Latency** | Not Applicable (file based async process) | Optimized for real-time calling & low to medium volume usage. Customers with high volume usage may see higher latency variability. Threshold set per model | Optimized for real-time calling & low to medium volume usage. Customers with high volume usage may see higher latency variability. Threshold set per model | Optimized for real-time. |
4040
| **Sku Name in code** | `GlobalBatch` | `GlobalStandard` | `Standard` | `ProvisionedManaged` |
4141
| **Billing model** | Pay-per-token |Pay-per-token | Pay-per-token | Monthly Commitments |
@@ -53,7 +53,7 @@ Standard deployments are optimized for low to medium volume workloads with high
5353
## Global standard
5454

5555
> [!IMPORTANT]
56-
> Data might be processed outside of the resource's Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
56+
> Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
5757
5858
Global deployments are available in the same Azure OpenAI resources as non-global deployment types but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request. Global standard provides the highest default quota and eliminates the need to load balance across multiple resources.
5959

@@ -62,9 +62,9 @@ Customers with high consistent volume may experience greater latency variability
6262
## Global batch
6363

6464
> [!IMPORTANT]
65-
> Data might be processed outside of the resource's Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
65+
> Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
6666
67-
[Global batch](./batch.md) is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, a 24-hour turnaround time, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
67+
[Global batch](./batch.md) is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with 24-hour target turnaround, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
6868

6969
Key use cases include:
7070

articles/ai-services/openai/includes/global-batch-limits.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ The table shows the batch quota limit. Quota values for global batch are represe
2323
|Model|Enterprise agreement|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
2424
|---|---|---|---|---|---|
2525
| `gpt-4o` | 5 B | 50 M | 1.35 M | 90 K | N/A|
26-
| `gpt-4-turbo`<sup>*</sup> | 300 M | 40 M | 1.35 M | 90 K | N/A |
26+
| `gpt-4-turbo` | 300 M | 40 M | 1.35 M | 90 K | N/A |
2727
| `gpt-4` | 150 M | 5 M | 200 K | 100 K | N/A |
2828
| `gpt-35-turbo` | 10 B | 100 M | 5 M | 2 M | 50 K |
2929

articles/ai-services/openai/whats-new.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@ This article provides a summary of the latest releases and major documentation u
2222

2323
### Global batch deployments are now available
2424

25-
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, a 24-hour turnaround time, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
25+
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with 24-hour target turnaround, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
2626

2727
Key use cases include:
2828

0 commit comments

Comments
 (0)