articles/ai-services/openai/how-to/batch.md
In the Studio UI the deployment type will appear as `Global-Batch`.
:::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure OpenAI Studio with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::
> [!TIP]
> We recommend enabling **dynamic quota** for all global batch model deployments to help avoid job failures due to insufficient enqueued token quota.
>
> Each line of your input file for batch processing has a `model` attribute that requires a global batch **deployment name**. For a given input file, all names must be the same deployment name. This differs from OpenAI, where the concept of model deployments doesn't exist.
>
> For the best performance we recommend submitting large files for batch processing, rather than a large number of small files with only a few lines in each file.
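The `model` attribute requirement above can be sketched in a short helper. This is a minimal illustration, not code from this article: the deployment name, `custom_id` scheme, and request bodies are hypothetical placeholders.

```python
import json

def build_batch_lines(deployment_name, request_bodies):
    """Build JSONL lines for a batch input file.

    Every line's `model` attribute must carry the same global batch
    deployment name for a given input file.
    """
    lines = []
    for i, body in enumerate(request_bodies):
        lines.append(json.dumps({
            "custom_id": f"task-{i}",   # your own correlation ID (illustrative)
            "method": "POST",
            "url": "/chat/completions",
            "body": {"model": deployment_name, **body},
        }))
    return lines

# Example: two chat requests targeting one (hypothetical) deployment name.
lines = build_batch_lines(
    "my-global-batch-deployment",
    [{"messages": [{"role": "user", "content": "Hello"}]},
     {"messages": [{"role": "user", "content": "World"}]}],
)
```

Writing all lines into a single large `.jsonl` file follows the performance guidance above: one big file rather than many small ones.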
Yes. Similar to other deployment types, you can create content filters and associate them with a deployment.
Yes, from the quota page in the Studio UI. Default quota allocation can be found in the [quota and limits article](../quotas-limits.md#global-batch-quota).
### How do I tell how many tokens my batch request contains, and how many tokens are available as quota?
The `2024-10-01-preview` REST API adds two new response headers:
* `deployment-enqueued-tokens` - An approximate token count for your jsonl file, calculated immediately after the batch request is submitted. This value is an estimate based on the number of characters and isn't the true token count.
* `deployment-maximum-enqueued-tokens` - The total enqueued token quota available for this global batch model deployment.
These response headers are only available when making a POST request to begin batch processing of a file with the REST API. The language-specific client libraries don't currently return these new response headers.
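Reading the two headers from a raw HTTP response can be sketched as follows. Only the header names come from this article; the resource URL, API key, and request body in the commented call are placeholder assumptions.

```python
def read_batch_quota_headers(headers):
    """Parse the two preview quota headers from a batch-create response.

    Returns (enqueued, maximum) as ints, or None where a header is absent.
    """
    def to_int(name):
        value = headers.get(name)
        return int(value) if value is not None else None
    return (to_int("deployment-enqueued-tokens"),
            to_int("deployment-maximum-enqueued-tokens"))

# Hypothetical live call (requires the `requests` package and a real resource):
# import requests
# resp = requests.post(
#     "https://YOUR-RESOURCE.openai.azure.com/openai/batches?api-version=2024-10-01-preview",
#     headers={"api-key": "YOUR-KEY"},
#     json={"input_file_id": "file-abc123",
#           "endpoint": "/chat/completions",
#           "completion_window": "24h"},
# )
# enqueued, maximum = read_batch_quota_headers(resp.headers)
```

Comparing `enqueued` against `maximum` before submitting further jobs is one way to avoid failures from insufficient enqueued token quota.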
### What happens if the API doesn't complete my request within the 24-hour time frame?
We aim to process these requests within 24 hours; we don't expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
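A cancellation call could be sketched like this. The URL path shape and API version are assumptions based on the OpenAI-style batch API, not confirmed by this article; check the REST reference for your resource before relying on them.

```python
def batch_cancel_url(resource_endpoint, batch_id,
                     api_version="2024-10-01-preview"):
    """Build the cancel URL for a batch job.

    The `/openai/batches/{id}/cancel` path is an assumed shape, mirroring
    the OpenAI-style batch API; verify against the official REST reference.
    """
    return (f"{resource_endpoint.rstrip('/')}/openai/batches/"
            f"{batch_id}/cancel?api-version={api_version}")

# Hypothetical live call (placeholder resource, key, and batch ID):
# import requests
# requests.post(batch_cancel_url("https://YOUR-RESOURCE.openai.azure.com", "batch_abc123"),
#               headers={"api-key": "YOUR-KEY"})
```

After cancelling, already completed work is still returned in the output file, and you're billed for that completed portion.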