|
| 1 | +--- |
| 2 | +title: 'How to use global batch processing with Azure OpenAI Service' |
| 3 | +titleSuffix: Azure OpenAI |
| 4 | +description: Learn how to use global batch with Azure OpenAI Service |
| 5 | +manager: nitinme |
| 6 | +ms.service: azure-ai-openai |
| 7 | +ms.custom: |
| 8 | +ms.topic: how-to |
| 9 | +ms.date: 08/04/2024 |
| 10 | +author: mrbullwinkle |
| 11 | +ms.author: mbullwin |
| 12 | +recommendations: false |
| 13 | +zone_pivot_groups: openai-fine-tuning-batch |
| 14 | +--- |
| 15 | + |
| 16 | +# Getting started with Azure OpenAI global batch deployments (preview) |
| 17 | + |
| 18 | +The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with 24-hour target turnaround, at [50% less cost than global standard](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads. |
| 19 | + |
| 20 | +Key use cases include: |
| 21 | + |
| 22 | +* **Large-Scale Data Processing:** Quickly analyze extensive datasets in parallel. |
| 23 | + |
| 24 | +* **Content Generation:** Create large volumes of text, such as product descriptions or articles. |
| 25 | + |
| 26 | +* **Document Review and Summarization:** Automate the review and summarization of lengthy documents. |
| 27 | + |
| 28 | +* **Customer Support Automation:** Handle numerous queries simultaneously for faster responses. |
| 29 | + |
| 30 | +* **Data Extraction and Analysis:** Extract and analyze information from vast amounts of unstructured data. |
| 31 | + |
| 32 | +* **Natural Language Processing (NLP) Tasks:** Perform tasks like sentiment analysis or translation on large datasets. |
| 33 | + |
| 34 | +* **Marketing and Personalization:** Generate personalized content and recommendations at scale. |
| 35 | + |
| 36 | +> [!IMPORTANT] |
| 37 | +> We aim to process batch requests within 24 hours; we do not expire the jobs that take longer. You can [cancel](#cancel-batch) the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You will be charged for any completed work. |
| 38 | +> |
| 39 | +> Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/). |
| 40 | +
|
| 41 | +## Global batch support |
| 42 | + |
| 43 | +### Region and model support |
| 44 | + |
| 45 | +Global batch is currently supported in the following regions: |
| 46 | + |
| 47 | +- East US |
| 48 | +- West US |
| 49 | +- Sweden Central |
| 50 | + |
| 51 | +The following models support global batch: |
| 52 | + |
| 53 | +| Model | Version | Supported | |
| 54 | +|---|---| |
| 55 | +|`gpt-4o` | 2024-05-13 |Yes (text + vision) | |
| 56 | +|`gpt-4` | turbo-2024-04-09 | Yes (text only) | |
| 57 | +|`gpt-4` | 0613 | Yes | |
| 58 | +| `gpt-35-turbo` | 0125 | Yes | |
| 59 | +| `gpt-35-turbo` | 1106 | Yes | |
| 60 | +| `gpt-35-turbo` | 0613 | Yes | |
| 61 | + |
| 62 | + |
| 63 | +Refer to the [models page](../concepts/models.md) for the most up-to-date information on regions/models where global batch is currently supported. |
| 64 | + |
| 65 | +### API Versions |
| 66 | + |
| 67 | +- `2024-07-01-preview` |
| 68 | + |
| 69 | +### Not supported |
| 70 | + |
| 71 | +The following aren't currently supported: |
| 72 | + |
| 73 | +- Integration with the Assistants API. |
| 74 | +- Integration with Azure OpenAI On Your Data feature. |
| 75 | + |
| 76 | +### Global batch deployment |
| 77 | + |
| 78 | +In the Studio UI the deployment type will appear as `Global-Batch`. |
| 79 | + |
| 80 | +:::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure OpenAI Studio with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png"::: |
| 81 | + |
| 82 | +> [!TIP] |
| 83 | +> Each line of your input file for batch processing has a `model` attribute that requires a global batch **deployment name**. For a given input file, all names must be the same deployment name. This is different from OpenAI where the concept of model deployments does not exist. |
| 84 | +
|
| 85 | +::: zone pivot="programming-language-ai-studio" |
| 86 | + |
| 87 | +[!INCLUDE [Studio](../includes/batch/batch-studio.md)] |
| 88 | + |
| 89 | +::: zone-end |
| 90 | + |
| 91 | +::: zone pivot="programming-language-python" |
| 92 | + |
| 93 | +[!INCLUDE [Python](../includes/batch/batch-python.md)] |
| 94 | + |
| 95 | +::: zone-end |
| 96 | + |
| 97 | +::: zone pivot="rest-api" |
| 98 | + |
| 99 | +[!INCLUDE [REST](../includes/batch/batch-rest.md)] |
| 100 | + |
| 101 | +::: zone-end |
| 102 | + |
| 103 | +[!INCLUDE [Quota](../includes/global-batch-limits.md)] |
| 104 | + |
| 105 | +## Batch object |
| 106 | + |
| 107 | +|Property | Type | Definition| |
| 108 | +|---|---|---| |
| 109 | +| `id` | string | | |
| 110 | +| `object` | string| `batch` | |
| 111 | +| `endpoint` | string | The API endpoint used by the batch | |
| 112 | +| `errors` | object | | |
| 113 | +| `input_file_id` | string | The ID of the input file for the batch | |
| 114 | +| `completion_window` | string | The time frame within which the batch should be processed | |
| 115 | +| `status` | string | The current status of the batch. Possible values: `validating`, `failed`, `in_progress`, `finalizing`, `completed`, `expired`, `cancelling`, `cancelled`. | |
| 116 | +| `output_file_id` | string |The ID of the file containing the outputs of successfully executed requests. | |
| 117 | +| `error_file_id` | string | The ID of the file containing the outputs of requests with errors. | |
| 118 | +| `created_at` | integer | A timestamp when this batch was created (in unix epochs). | |
| 119 | +| `in_progress_at` | integer | A timestamp when this batch started progressing (in unix epochs). | |
| 120 | +| `expires_at` | integer | A timestamp when this batch will expire (in unix epochs). | |
| 121 | +| `finalizing_at` | integer | A timestamp when this batch started finalizing (in unix epochs). | |
| 122 | +| `completed_at` | integer | A timestamp when this batch started finalizing (in unix epochs). | |
| 123 | +| `failed_at` | integer | A timestamp when this batch failed (in unix epochs) | |
| 124 | +| `expired_at` | integer | A timestamp when this batch expired (in unix epochs).| |
| 125 | +| `cancelling_at` | integer | A timestamp when this batch started `cancelling` (in unix epochs). | |
| 126 | +| `cancelled_at` | integer | A timestamp when this batch was `cancelled` (in unix epochs). | |
| 127 | +| `request_counts` | object | Object structure:<br><br> `total` *integer* <br> The total number of requests in the batch. <br>`completed` *integer* <br> The number of requests in the batch that have been completed successfully. <br> `failed` *integer* <br> The number of requests in the batch that have failed. |
| 128 | +| `metadata` | map | A set of key-value pairs that can be attached to the batch. This property can be useful for storing additional information about the batch in a structured format. | |
| 129 | + |
| 130 | +## Frequently asked questions (FAQ) |
| 131 | + |
| 132 | +### Can images be used with the batch API? |
| 133 | + |
| 134 | +This capability is limited to certain multi-modal models. Currently only GPT-4o support images as part of batch requests. Images can be provided as input either via [image url or a base64 encoded representation of the image](#input-format). Images for batch are currently not supported with GPT-4 Turbo. |
| 135 | + |
| 136 | +### Can I use the batch API with fine-tuned models? |
| 137 | + |
| 138 | +This is currently not supported. |
| 139 | + |
| 140 | +### Can I use the batch API for embeddings models? |
| 141 | + |
| 142 | +This is currently not supported. |
| 143 | + |
| 144 | +### Does content filtering work with Global Batch deployment? |
| 145 | + |
| 146 | +Yes. Similar to other deployment types, you can create content filters and associate them with the Global Batch deployment type. |
| 147 | + |
| 148 | +### Can I request additional quota? |
| 149 | + |
| 150 | +Yes, from the quota page in the Studio UI. Default quota allocation can be found in the [quota and limits article](../quotas-limits.md#global-batch-quota). |
| 151 | + |
| 152 | +### What happens if the API doesn't complete my request within the 24 hour time frame? |
| 153 | + |
| 154 | +We aim to process these requests within 24 hours; we don't expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work. |
| 155 | + |
| 156 | +### How many requests can I queue using batch? |
| 157 | + |
| 158 | +There's no fixed limit on the number of requests you can batch, however, it will depend on your enqueued token quota. Your enqueued token quota includes the maximum number of input tokens you can enqueue at one time. |
| 159 | + |
| 160 | +Once your batch request is completed, your batch rate limit is reset, as your input tokens are cleared. The limit depends on the number of global requests in the queue. If the Batch API queue processes your batches quickly, your batch rate limit is reset more quickly. |
| 161 | + |
| 162 | +## Troubleshooting |
| 163 | + |
| 164 | +A job is successful when `status` is `Completed`. Successful jobs will still generate an error_file_id, but it will be associated with an empty file with zero bytes. |
| 165 | + |
| 166 | +When a job failure occurs, you'll find details about the failure in the `errors` property: |
| 167 | + |
| 168 | +```json |
| 169 | +"value": [ |
| 170 | + { |
| 171 | + "cancelled_at": null, |
| 172 | + "cancelling_at": null, |
| 173 | + "completed_at": "2024-06-27T06:50:01.6603753+00:00", |
| 174 | + "completion_window": null, |
| 175 | + "created_at": "2024-06-27T06:37:07.3746615+00:00", |
| 176 | + "error_file_id": "file-f13a58f6-57c7-44d6-8ceb-b89682588072", |
| 177 | + "expired_at": null, |
| 178 | + "expires_at": "2024-06-28T06:37:07.3163459+00:00", |
| 179 | + "failed_at": null, |
| 180 | + "finalizing_at": "2024-06-27T06:49:59.1994732+00:00", |
| 181 | + "id": "batch_50fa47a0-ef19-43e5-9577-a4679b92faff", |
| 182 | + "in_progress_at": "2024-06-27T06:39:57.455977+00:00", |
| 183 | + "input_file_id": "file-42147e78ea42488682f4fd1d73028e72", |
| 184 | + "errors": { |
| 185 | + "object": “list”, |
| 186 | + "data": [ |
| 187 | + { |
| 188 | + “code”: “empty_file”, |
| 189 | + “message”: “The input file is empty. Please ensure that the batch contains at least one request.” |
| 190 | + } |
| 191 | + ] |
| 192 | + }, |
| 193 | + "metadata": null, |
| 194 | + "object": "batch", |
| 195 | + "output_file_id": "file-22d970b7-376e-4223-a307-5bb081ea24d7", |
| 196 | + "request_counts": { |
| 197 | + "total": 10, |
| 198 | + "completed": null, |
| 199 | + "failed": null |
| 200 | + }, |
| 201 | + "status": "Failed" |
| 202 | + } |
| 203 | +``` |
| 204 | + |
| 205 | +### Error codes |
| 206 | + |
| 207 | +|Error code | Definition| |
| 208 | +|---|---| |
| 209 | +|`invalid_json_line`| A line (or multiple) in your input file wasn't able to be parsed as valid json.<br><br> Please ensure no typos, proper opening and closing brackets, and quotes as per JSON standard, and resubmit the request.| |
| 210 | +| `too_many_tasks` |The number of requests in the input file exceeds the maximum allowed value of 100,000.<br><br>Please ensure your total requests are under 100,000 and resubmit the job.| |
| 211 | +| `url_mismatch` | Either a row in your input file has a URL that doesn’t match the rest of the rows, or the URL specified in the input file doesn’t match the expected endpoint URL. <br><br>Please ensure all request URLs are the same, and that they match the endpoint URL associated with your Azure OpenAI deployment.| |
| 212 | +|`model_not_found`|The Azure OpenAI model deployment name that was specified in the `model` property of the input file wasn't found.<br><br> Please ensure this name points to a valid Azure OpenAI model deployment.| |
| 213 | +| `duplicate_custom_id` | The custom ID for this request is a duplicate of the custom ID in another request. | |
| 214 | +|`empty_batch` | Please check your input file to ensure that the custom ID parameter is unique for each request in the batch.| |
| 215 | +|`model_mismatch`| The Azure OpenAI model deployment name that was specified in the `model` property of this request in the input file doesn't match the rest of the file.<br><br>Please ensure that all requests in the batch point to the same AOAI model deployment in the `model` property of the request.| |
| 216 | +|`invalid_request`| The schema of the input line is invalid or the deployment SKU is invalid. <br><br>Please ensure the properties of the request in your input file match the expected input properties, and that the Azure OpenAI deployment SKU is `globalbatch` for batch API requests.| |
| 217 | + |
| 218 | +### Known issues |
| 219 | + |
| 220 | +- Resources deployed with Azure CLI won't work out-of-box with Azure OpenAI global batch. This is due to an issue where resources deployed using this method have endpoint subdomains that don't follow the `https://your-resource-name.openai.azure.com` pattern. A workaround for this issue is to deploy a new Azure OpenAI resource using one of the other common deployment methods which will properly handle the subdomain setup as part of the deployment process. |
| 221 | + |
| 222 | + |
| 223 | +## See also |
| 224 | + |
| 225 | +* Learn more about Azure OpenAI [deployment types](./deployment-types.md) |
| 226 | +* Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md) |
0 commit comments