Commit d7b3539

committed
update
1 parent d23a81f commit d7b3539

File tree

1 file changed: +11 -0 lines changed

  • articles/ai-services/openai/how-to
articles/ai-services/openai/how-to/batch.md

Lines changed: 11 additions & 0 deletions
@@ -83,6 +83,8 @@ In the Studio UI the deployment type will appear as `Global-Batch`.
 :::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure OpenAI Studio with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::
 
 > [!TIP]
+> We recommend enabling **dynamic quota** for all global batch model deployments to help avoid job failures due to insufficient enqueued token quota.
+>
 > Each line of your input file for batch processing has a `model` attribute that requires a global batch **deployment name**. For a given input file, all names must be the same deployment name. This is different from OpenAI where the concept of model deployments does not exist.
 >
 > For the best performance we recommend submitting large files for batch processing, rather than a large number of small files with only a few lines in each file.
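The tip added above requires every line of the jsonl input file to carry the same `model` value, set to the deployment name. A minimal sketch of building such a file, where the deployment name `my-global-batch-deployment` and the prompts are hypothetical placeholders:

```python
import json

# Hypothetical global batch deployment name; substitute your own.
DEPLOYMENT_NAME = "my-global-batch-deployment"

# One request per line; note every `model` value is the same deployment name,
# not a base model name.
requests = [
    {
        "custom_id": f"task-{i}",
        "method": "POST",
        "url": "/chat/completions",
        "body": {
            "model": DEPLOYMENT_NAME,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["What is 2+2?", "Name a color."])
]

with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```

Mixing deployment names within one file is what the tip warns against; generating the lines from a single constant avoids it.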
@@ -154,6 +156,15 @@ Yes. Similar to other deployment types, you can create content filters and assoc
 
 Yes, from the quota page in the Studio UI. Default quota allocation can be found in the [quota and limits article](../quotas-limits.md#global-batch-quota).
 
+### How do I tell how many tokens my batch request contains, and how many tokens are available as quota?
+
+The `2024-10-01-preview` REST API adds two new response headers:
+
+* `deployment-enqueued-tokens` - An approximate token count for your jsonl file, calculated immediately after the batch request is submitted. This value represents an estimate based on the number of characters and is not the true token count.
+* `deployment-maximum-enqueued-tokens` - The total number of enqueued tokens available for this global batch model deployment.
+
+These response headers are only available when making a POST request to begin batch processing of a file with the REST API. The language-specific client libraries do not currently return these new response headers.
+
 ### What happens if the API doesn't complete my request within the 24 hour time frame?
 
 We aim to process these requests within 24 hours; we don't expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
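The two headers added in the diff above arrive as strings on the batch-create response. A minimal sketch of reading them, where the header values, endpoint, and key shown are hypothetical placeholders rather than real output:

```python
def parse_enqueued_token_headers(headers):
    """Extract the two quota headers returned by a POST to the batches
    endpoint with api-version 2024-10-01-preview. Header values are
    strings; missing headers yield None."""
    enqueued = headers.get("deployment-enqueued-tokens")
    maximum = headers.get("deployment-maximum-enqueued-tokens")
    return (
        int(enqueued) if enqueued is not None else None,
        int(maximum) if maximum is not None else None,
    )

# Hypothetical values for illustration. In a real call you would pass the
# response headers from something like:
#   resp = requests.post(
#       f"{endpoint}/openai/batches?api-version=2024-10-01-preview",
#       headers={"api-key": key}, json=payload)
#   parse_enqueued_token_headers(resp.headers)
example_headers = {
    "deployment-enqueued-tokens": "12345",
    "deployment-maximum-enqueued-tokens": "500000000",
}
enqueued, maximum = parse_enqueued_token_headers(example_headers)
```

Comparing the first value against the second before submitting further jobs is one way to avoid the enqueued-token quota failures the dynamic-quota tip describes.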

0 commit comments

Comments
 (0)