Skip to content

Commit 69004d9

Browse files
authored
Merge pull request #283254 from mrbullwinkle/mrb_08_01_2024_batch-002
[Azure OpenAI] [Release branch] Batch
2 parents 2ba3fb5 + fac127c commit 69004d9

22 files changed

+968
-17
lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -202,6 +202,30 @@ For more information on Provisioned deployments, see our [Provisioned guidance](
202202

203203
- eastus
204204

205+
206+
### Global batch model availability
207+
208+
### Region and model support
209+
210+
The following models support global batch:
211+
212+
| Model | Version | Input format |
213+
|---|---|
214+
|`gpt-4o` | 2024-05-13 |text + image |
215+
|`gpt-4o-mini` | 2024-07-18 |text + image |
216+
|`gpt-4` | turbo-2024-04-09 | text |
217+
|`gpt-4` | 0613 | text |
218+
| `gpt-35-turbo` | 0125 | text |
219+
| `gpt-35-turbo` | 1106 | text |
220+
| `gpt-35-turbo` | 0613 | text |
221+
222+
Global batch is currently supported in the following regions:
223+
224+
- East US
225+
- West US
226+
- Sweden Central
227+
- South India
228+
205229
### GPT-4 and GPT-4 Turbo model availability
206230

207231
#### Public cloud regions
Lines changed: 228 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,228 @@
1+
---
2+
title: 'How to use global batch processing with Azure OpenAI Service'
3+
titleSuffix: Azure OpenAI
4+
description: Learn how to use global batch with Azure OpenAI Service
5+
manager: nitinme
6+
ms.service: azure-ai-openai
7+
ms.custom:
8+
ms.topic: how-to
9+
ms.date: 08/01/2024
10+
author: mrbullwinkle
11+
ms.author: mbullwin
12+
recommendations: false
13+
zone_pivot_groups: openai-fine-tuning-batch
14+
---
15+
16+
# Getting started with Azure OpenAI global batch deployments (preview)
17+
18+
The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, a 24-hour turnaround time, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads.
19+
20+
Key use cases include:
21+
22+
* **Large-Scale Data Processing:** Quickly analyze extensive datasets in parallel.
23+
24+
* **Content Generation:** Create large volumes of text, such as product descriptions or articles.
25+
26+
* **Document Review and Summarization:** Automate the review and summarization of lengthy documents.
27+
28+
* **Customer Support Automation:** Handle numerous queries simultaneously for faster responses.
29+
30+
* **Data Extraction and Analysis:** Extract and analyze information from vast amounts of unstructured data.
31+
32+
* **Natural Language Processing (NLP) Tasks:** Perform tasks like sentiment analysis or translation on large datasets.
33+
34+
* **Marketing and Personalization:** Generate personalized content and recommendations at scale.
35+
36+
> [!IMPORTANT]
37+
> We aim to process batch requests within 24 hours; we do not expire the jobs that take longer. You can [cancel](#cancel-batch) the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You will be charged for any completed work.
38+
>
39+
> Data may be processed outside of the resource’s Azure geography, but data storage remains in its Azure geography. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/). 
40+
41+
## Global batch support
42+
43+
### Region and model support
44+
45+
Global batch is currently supported in the following regions:
46+
47+
- East US
48+
- West US
49+
- Sweden Central
50+
- South India
51+
52+
The following models support global batch:
53+
54+
| Model | Version | Supported |
55+
|---|---|
56+
|`gpt-4o` | 2024-05-13 |Yes (text + vision) |
57+
|`gpt-4o-mini` | 2024-07-18 |Yes (text + vision) |
58+
|`gpt-4` | turbo-2024-04-09 | Yes (text only) |
59+
|`gpt-4` | 0613 | Yes |
60+
| `gpt-35-turbo` | 0125 | Yes |
61+
| `gpt-35-turbo` | 1106 | Yes |
62+
| `gpt-35-turbo` | 0613 | Yes |
63+
64+
65+
Refer to the [models page](../concepts/models.md) for the most up-to-date information on regions/models where global batch is currently supported.
66+
67+
### API Versions
68+
69+
- `2024-07-01-preview`
70+
71+
### Not supported
72+
73+
The following aren't currently supported:
74+
75+
- Integration with the Assistants API.
76+
- Integration with Azure OpenAI On Your Data feature.
77+
78+
### Global batch deployment
79+
80+
In the Studio UI the deployment type will appear as `Global-Batch`.
81+
82+
:::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure OpenAI Studio with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::
83+
84+
> [!TIP]
85+
> Each line of your input file for batch processing requires the unique **deployment name** that you chose during model deployment to be present. This value wil be assigned to the `model` parameter. This is different from OpenAI where the concept of model deployments does not exist.
86+
87+
::: zone pivot="programming-language-ai-studio"
88+
89+
[!INCLUDE [Studio](../includes/batch/batch-studio.md)]
90+
91+
::: zone-end
92+
93+
::: zone pivot="programming-language-python"
94+
95+
[!INCLUDE [Python](../includes/batch/batch-python.md)]
96+
97+
::: zone-end
98+
99+
::: zone pivot="rest-api"
100+
101+
[!INCLUDE [REST](../includes/batch/batch-rest.md)]
102+
103+
::: zone-end
104+
105+
[!INCLUDE [Quota](../includes/global-batch-limits.md)]
106+
107+
## Batch object
108+
109+
|Property | Type | Definition|
110+
|---|---|---|
111+
| `id` | string | |
112+
| `object` | string| `batch` |
113+
| `endpoint` | string | The API endpoint used by the batch |
114+
| `errors` | object | |
115+
| `input_file_id` | string | The ID of the input file for the batch |
116+
| `completion_window` | string | The time frame within which the batch should be processed |
117+
| `status` | string | The current status of the batch. Possible values: `validating`, `failed`, `in_progress`, `finalizing`, `completed`, `expired`, `cancelling`, `cancelled`. |
118+
| `output_file_id` | string |The ID of the file containing the outputs of successfully executed requests. |
119+
| `error_file_id` | string | The ID of the file containing the outputs of requests with errors. |
120+
| `created_at` | integer | A timestamp when this batch was created (in unix epochs). |
121+
| `in_progress_at` | integer | A timestamp when this batch started progressing (in unix epochs). |
122+
| `expires_at` | integer | A timestamp when this batch will expire (in unix epochs). |
123+
| `finalizing_at` | integer | A timestamp when this batch started finalizing (in unix epochs). |
124+
| `completed_at` | integer | A timestamp when this batch started finalizing (in unix epochs). |
125+
| `failed_at` | integer | A timestamp when this batch failed (in unix epochs) |
126+
| `expired_at` | integer | A timestamp when this batch expired (in unix epochs).|
127+
| `cancelling_at` | integer | A timestamp when this batch started `cancelling` (in unix epochs). |
128+
| `cancelled_at` | integer | A timestamp when this batch was `cancelled` (in unix epochs). |
129+
| `request_counts` | object | Object structure:<br><br> `total` *integer* <br> The total number of requests in the batch. <br>`completed` *integer* <br> The number of requests in the batch that have been completed successfully. <br> `failed` *integer* <br> The number of requests in the batch that have failed.
130+
| `metadata` | map | A set of key-value pairs that can be attached to the batch. This property can be useful for storing additional information about the batch in a structured format. |
131+
132+
## Frequently asked questions (FAQ)
133+
134+
### Can images be used with the batch API?
135+
136+
This capability is limited to certain multi-modal models. Currently only GPT-4o and GPT-4o mini support images as part of batch requests. Images can be provided as input either via [image url or a base64 encoded representation of the image](#input-format). Images for batch are currently not supported with GPT-4 Turbo.
137+
138+
### Can I use the batch API with fine-tuned models?
139+
140+
This is currently not supported.
141+
142+
### Can I use the batch API for embeddings models?
143+
144+
This is currently not supported.
145+
146+
### Does content filtering work with Global Batch deployment?
147+
148+
Yes. Similar to other deployment types, you can create content filters and associate them with the Global Batch deployment type.
149+
150+
### Can I request additional quota?
151+
152+
Yes, from the quota page in the Studio UI. Default quota allocation can be found in the [quota and limits article](../quotas-limits.md#global-batch-quota).
153+
154+
### What happens if the API doesn't complete my request within the 24 hour time frame?
155+
156+
We aim to process these requests within 24 hours; we don't expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
157+
158+
### How many requests can I queue using batch?
159+
160+
There's no fixed limit on the number of requests you can batch, however, it will depend on your enqueued token quota. Your enqueued token quota includes the maximum number of input tokens you can enqueue at one time.
161+
162+
Once your batch request is completed, your batch rate limit is reset, as your input tokens are cleared. The limit depends on the number of global requests in the queue. If the Batch API queue processes your batches quickly, your batch rate limit is reset more quickly.
163+
164+
## Troubleshooting
165+
166+
A job is successful when `status` is `Completed`. Successful jobs will still generate an error_file_id, but it will be associated with an empty file with zero bytes.
167+
168+
When a job failure occurs, you'll find details about the failure in the `errors` property:
169+
170+
```json
171+
"value": [
172+
{
173+
"cancelled_at": null,
174+
"cancelling_at": null,
175+
"completed_at": "2024-06-27T06:50:01.6603753+00:00",
176+
"completion_window": null,
177+
"created_at": "2024-06-27T06:37:07.3746615+00:00",
178+
"error_file_id": "file-f13a58f6-57c7-44d6-8ceb-b89682588072",
179+
"expired_at": null,
180+
"expires_at": "2024-06-28T06:37:07.3163459+00:00",
181+
"failed_at": null,
182+
"finalizing_at": "2024-06-27T06:49:59.1994732+00:00",
183+
"id": "batch_50fa47a0-ef19-43e5-9577-a4679b92faff",
184+
"in_progress_at": "2024-06-27T06:39:57.455977+00:00",
185+
"input_file_id": "file-42147e78ea42488682f4fd1d73028e72",
186+
"errors": {
187+
"object": “list”,
188+
"data": [
189+
{
190+
“code”: “empty_file”,
191+
“message”: “The input file is empty. Please ensure that the batch contains at least one request.”
192+
}
193+
]
194+
},
195+
"metadata": null,
196+
"object": "batch",
197+
"output_file_id": "file-22d970b7-376e-4223-a307-5bb081ea24d7",
198+
"request_counts": {
199+
"total": 10,
200+
"completed": null,
201+
"failed": null
202+
},
203+
"status": "Failed"
204+
}
205+
```
206+
207+
### Error codes
208+
209+
|Error code | Definition|
210+
|---|---|
211+
|`invalid_json_line`| A line (or multiple) in your input file wasn't able to be parsed as valid json.<br><br> Please ensure no typos, proper opening and closing brackets, and quotes as per JSON standard, and resubmit the request.|
212+
| `too_many_tasks` |The number of requests in the input file exceeds the maximum allowed value of 100,000.<br><br>Please ensure your total requests are under 100,000 and resubmit the job.|
213+
| `url_mismatch` | Either a row in your input file has a URL that doesn’t match the rest of the rows, or the URL specified in the input file doesn’t match the expected endpoint URL. <br><br>Please ensure all request URLs are the same, and that they match the endpoint URL associated with your Azure OpenAI deployment.|
214+
|`model_not_found`|The Azure OpenAI model deployment name that was specified in the `model` property of the input file wasn't found.<br><br> Please ensure this name points to a valid Azure OpenAI model deployment.|
215+
| `duplicate_custom_id` | The custom ID for this request is a duplicate of the custom ID in another request. |
216+
|`empty_batch` | Please check your input file to ensure that the custom ID parameter is unique for each request in the batch.|
217+
|`model_mismatch`| The Azure OpenAI model deployment name that was specified in the `model` property of this request in the input file doesn't match the rest of the file.<br><br>Please ensure that all requests in the batch point to the same AOAI model deployment in the `model` property of the request.|
218+
|`invalid_request`| The schema of the input line is invalid or the deployment SKU is invalid. <br><br>Please ensure the properties of the request in your input file match the expected input properties, and that the Azure OpenAI deployment SKU is `globalbatch` for batch API requests.|
219+
220+
### Known issues
221+
222+
- Resources deployed with Azure CLI won't work out-of-box with Azure OpenAI global batch. This is due to an issue where resources deployed using this method have endpoint subdomains that don't follow the `https://your-resource-name.openai.azure.com` pattern. A workaround for this issue is to deploy a new Azure OpenAI resource using one of the other common deployment methods which will properly handle the subdomain setup as part of the deployment process.
223+
224+
225+
## See also
226+
227+
* Learn more about Azure OpenAI [deployment types](./deployment-types.md)
228+
* Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md)

0 commit comments

Comments
 (0)