
Commit 0b14987

Merge pull request #801 from mrbullwinkle/mrb_10_12_2024_global_batch
[Azure OpenAI] Batch Preview Updates
2 parents 0f5d0d5 + 4c8c1cd commit 0b14987

File tree

7 files changed: +282 −24 lines changed


articles/ai-services/openai/how-to/batch.md

Lines changed: 13 additions & 6 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn how to use global batch with Azure OpenAI Service
 manager: nitinme
 ms.service: azure-ai-openai
-ms.custom:
+ms.custom: references_regions
 ms.topic: how-to
 ms.date: 10/14/2024
 author: mrbullwinkle
@@ -67,7 +67,7 @@ Refer to the [models page](../concepts/models.md) for the most up-to-date inform
 
 ### API support
 
-API support was first added with `2024-07-01-preview`.
+API support was first added with `2024-07-01-preview`. Use `2024-10-01-preview` to take advantage of the latest features.
 
 ### Not supported
 
@@ -90,9 +90,7 @@ In the Studio UI the deployment type will appear as `Global-Batch`.
 :::image type="content" source="../media/how-to/global-batch/global-batch.png" alt-text="Screenshot that shows the model deployment dialog in Azure OpenAI Studio with Global-Batch deployment type highlighted." lightbox="../media/how-to/global-batch/global-batch.png":::
 
 > [!TIP]
-> Each line of your input file for batch processing has a `model` attribute that requires a global batch **deployment name**. For a given input file, all names must be the same deployment name. This is different from OpenAI where the concept of model deployments does not exist.
->
-> For the best performance we recommend submitting large files for batch processing, rather than a large number of small files with only a few lines in each file.
+> We recommend enabling **dynamic quota** for all global batch model deployments to help avoid job failures due to insufficient enqueued token quota. Dynamic quota allows your deployment to opportunistically take advantage of more quota when extra capacity is available. When dynamic quota is set to off, your deployment will only be able to process requests up to the enqueued token limit that was defined when you created the deployment.
 
 ::: zone pivot="programming-language-ai-studio"
 
@@ -161,6 +159,15 @@ Yes. Similar to other deployment types, you can create content filters and assoc
 
 Yes, from the quota page in the Studio UI. Default quota allocation can be found in the [quota and limits article](../quotas-limits.md#global-batch-quota).
 
+### How do I tell how many tokens my batch request contains, and how many tokens are available as quota?
+
+The `2024-10-01-preview` REST API adds two new response headers:
+
+* `deployment-enqueued-tokens` - An approximate token count for your jsonl file calculated immediately after the batch request is submitted. This value represents an estimate based on the number of characters and is not the true token count.
+* `deployment-maximum-enqueued-tokens` - The total enqueued tokens available for this global batch model deployment.
+
+These response headers are only available when making a POST request to begin batch processing of a file with the REST API. The language-specific client libraries do not currently return these new response headers.
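As an illustrative sketch (not part of this commit), the two preview headers could be read from a raw REST response at batch-creation time. The helper below only parses a headers mapping; the commented request, including the endpoint URL and body fields, follows the REST conventions used elsewhere in this doc and is a hypothetical usage, not a verified call:

```python
def read_batch_quota_headers(headers):
    """Pull the two preview quota headers (case-insensitive) out of a
    batch-creation response's header mapping. Returns None for a header
    the service didn't send (e.g. when using a client library)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    enqueued = lowered.get("deployment-enqueued-tokens")
    maximum = lowered.get("deployment-maximum-enqueued-tokens")
    return (
        int(enqueued) if enqueued is not None else None,
        int(maximum) if maximum is not None else None,
    )

# Hypothetical usage with `requests` (endpoint, api_key, and file_id as in
# the other examples in this commit):
# import requests
# response = requests.post(
#     f"{endpoint}openai/batches",
#     params={"api-version": "2024-10-01-preview"},
#     headers={"api-key": api_key},
#     json={"input_file_id": file_id,
#           "endpoint": "/chat/completions",
#           "completion_window": "24h"},
# )
# used, available = read_batch_quota_headers(response.headers)
```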
+
 ### What happens if the API doesn't complete my request within the 24 hour time frame?
 
 We aim to process these requests within 24 hours; we don't expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
@@ -236,4 +243,4 @@ When a job failure occurs, you'll find details about the failure in the `errors`
 ## See also
 
 * Learn more about Azure OpenAI [deployment types](./deployment-types.md)
-* Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md)
+* Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md)

articles/ai-services/openai/includes/batch/batch-python.md

Lines changed: 236 additions & 2 deletions
@@ -5,7 +5,7 @@ description: Azure OpenAI model global batch Python
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: include
-ms.date: 07/22/2024
+ms.date: 10/15/2024
 ---
 
 ## Prerequisites
@@ -63,6 +63,8 @@ The `custom_id` is required to allow you to identify which individual batch requ
 
 > [!IMPORTANT]
 > The `model` attribute must be set to match the name of the Global Batch deployment you wish to target for inference responses. The **same Global Batch model deployment name must be present on each line of the batch file.** If you want to target a different deployment you must do so in a separate batch file/job.
+>
+> For the best performance we recommend submitting large files for batch processing, rather than a large number of small files with only a few lines in each file.
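For illustration only (the deployment name `gpt-4o-batch` and the message content are placeholders, not taken from this commit), a single request line in a batch input file carries the deployment name in its `model` attribute:

```json
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "gpt-4o-batch", "messages": [{"role": "user", "content": "Hello world"}]}}
```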
 
 ### Create input file
 
@@ -74,13 +76,42 @@ Once your input file is prepared, you first need to upload the file to then be a
 
 [!INCLUDE [Azure key vault](~/reusable-content/ce-skilling/azure/includes/ai-services/security/azure-key-vault.md)]
 
+# [Python (Microsoft Entra ID)](#tab/python-secure)
+
+```python
+import os
+from openai import AzureOpenAI
+from azure.identity import DefaultAzureCredential, get_bearer_token_provider
+
+token_provider = get_bearer_token_provider(
+    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
+)
+
+client = AzureOpenAI(
+    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
+    azure_ad_token_provider=token_provider,
+    api_version="2024-10-01-preview"
+)
+
+# Upload a file with a purpose of "batch"
+file = client.files.create(
+    file=open("test.jsonl", "rb"),
+    purpose="batch"
+)
+
+print(file.model_dump_json(indent=2))
+file_id = file.id
+```
+
+# [Python (API Key)](#tab/python-key)
+
 ```python
 import os
 from openai import AzureOpenAI
 
 client = AzureOpenAI(
     api_key=os.getenv("AZURE_OPENAI_API_KEY"),
-    api_version="2024-07-01-preview",
+    api_version="2024-10-01-preview",
     azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
 )
 
@@ -94,6 +125,8 @@ print(file.model_dump_json(indent=2))
 file_id = file.id
 ```
 
+---
+
 **Output:**
 
 ```json
@@ -367,3 +400,204 @@ List all batch jobs for a particular Azure OpenAI resource.
 ```python
 client.batches.list()
 ```
+
+### List batch (Preview)
+
+Use the REST API to list all batch jobs with additional sorting/filtering options.
+
+In the examples below, we provide the `generate_time_filter` function to make constructing the filter easier. If you don't wish to use this function, the format of the filter string would look like `created_at gt 1728773533 and created_at lt 1729032733 and status eq 'Completed'`.
+
+# [Python (Microsoft Entra ID)](#tab/python-secure)
+
+```python
+import requests
+import json
+from datetime import datetime, timedelta
+from azure.identity import DefaultAzureCredential
+
+token_credential = DefaultAzureCredential()
+token = token_credential.get_token('https://cognitiveservices.azure.com/.default')
+
+endpoint = "https://{YOUR_RESOURCE_NAME}.openai.azure.com/"
+api_version = "2024-10-01-preview"
+url = f"{endpoint}openai/batches"
+order = "created_at asc"
+time_filter = lambda: generate_time_filter("past 8 hours")
+
+# Additional filter examples:
+#time_filter = lambda: generate_time_filter("past 1 day")
+#time_filter = lambda: generate_time_filter("past 3 days", status="Completed")
+
+def generate_time_filter(time_range, status=None):
+    now = datetime.now()
+
+    if 'day' in time_range:
+        days = int(time_range.split()[1])
+        start_time = now - timedelta(days=days)
+    elif 'hour' in time_range:
+        hours = int(time_range.split()[1])
+        start_time = now - timedelta(hours=hours)
+    else:
+        raise ValueError("Invalid time range format. Use 'past X day(s)' or 'past X hour(s)'")
+
+    start_timestamp = int(start_time.timestamp())
+    end_timestamp = int(now.timestamp())
+
+    filter_string = f"created_at gt {start_timestamp} and created_at lt {end_timestamp}"
+
+    if status:
+        filter_string += f" and status eq '{status}'"
+
+    return filter_string
+
+filter = time_filter()
+
+headers = {'Authorization': 'Bearer ' + token.token}
+
+params = {
+    "api-version": api_version,
+    "$filter": filter,
+    "$orderby": order
+}
+
+response = requests.get(url, headers=headers, params=params)
+
+json_data = response.json()
+
+if response.status_code == 200:
+    print(json.dumps(json_data, indent=2))
+else:
+    print(f"Request failed with status code: {response.status_code}")
+    print(response.text)
+```
+
+# [Python (API Key)](#tab/python-key)
+
+```python
+import os
+import requests
+import json
+from datetime import datetime, timedelta
+
+api_key = os.getenv("AZURE_OPENAI_API_KEY")
+endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
+api_version = "2024-10-01-preview"
+url = f"{endpoint}openai/batches"
+order = "created_at asc"
+
+time_filter = lambda: generate_time_filter("past 8 hours")
+
+# Additional filter examples:
+#time_filter = lambda: generate_time_filter("past 1 day")
+#time_filter = lambda: generate_time_filter("past 3 days", status="Completed")
+
+def generate_time_filter(time_range, status=None):
+    now = datetime.now()
+
+    if 'day' in time_range:
+        days = int(time_range.split()[1])
+        start_time = now - timedelta(days=days)
+    elif 'hour' in time_range:
+        hours = int(time_range.split()[1])
+        start_time = now - timedelta(hours=hours)
+    else:
+        raise ValueError("Invalid time range format. Use 'past X day(s)' or 'past X hour(s)'")
+
+    start_timestamp = int(start_time.timestamp())
+    end_timestamp = int(now.timestamp())
+
+    filter_string = f"created_at gt {start_timestamp} and created_at lt {end_timestamp}"
+
+    if status:
+        filter_string += f" and status eq '{status}'"
+
+    return filter_string
+
+filter = time_filter()
+
+headers = {
+    "api-key": api_key
+}
+
+params = {
+    "api-version": api_version,
+    "$filter": filter,
+    "$orderby": order
+}
+
+response = requests.get(url, headers=headers, params=params)
+
+json_data = response.json()
+
+if response.status_code == 200:
+    print(json.dumps(json_data, indent=2))
+else:
+    print(f"Request failed with status code: {response.status_code}")
+    print(response.text)
+```
+
+---
+
+**Output:**
+
+```output
+{
+  "data": [
+    {
+      "cancelled_at": null,
+      "cancelling_at": null,
+      "completed_at": 1729011896,
+      "completion_window": "24h",
+      "created_at": 1729011128,
+      "error_file_id": "file-472c0626-4561-4327-9e4e-f41afbfb30e6",
+      "expired_at": null,
+      "expires_at": 1729097528,
+      "failed_at": null,
+      "finalizing_at": 1729011805,
+      "id": "batch_4ddc7b60-19a9-419b-8b93-b9a3274b33b5",
+      "in_progress_at": 1729011493,
+      "input_file_id": "file-f89384af0082485da43cb26b49dc25ce",
+      "errors": null,
+      "metadata": null,
+      "object": "batch",
+      "output_file_id": "file-62bebde8-e767-4cd3-a0a1-28b214dc8974",
+      "request_counts": {
+        "total": 3,
+        "completed": 2,
+        "failed": 1
+      },
+      "status": "completed",
+      "endpoint": "/chat/completions"
+    },
+    {
+      "cancelled_at": null,
+      "cancelling_at": null,
+      "completed_at": 1729016366,
+      "completion_window": "24h",
+      "created_at": 1729015829,
+      "error_file_id": "file-85ae1971-9957-4511-9eb4-4cc9f708b904",
+      "expired_at": null,
+      "expires_at": 1729102229,
+      "failed_at": null,
+      "finalizing_at": 1729016272,
+      "id": "batch_6287485f-50fc-4efa-bcc5-b86690037f43",
+      "in_progress_at": 1729016126,
+      "input_file_id": "file-686746fcb6bc47f495250191ffa8a28e",
+      "errors": null,
+      "metadata": null,
+      "object": "batch",
+      "output_file_id": "file-04399828-ae0b-4825-9b49-8976778918cb",
+      "request_counts": {
+        "total": 3,
+        "completed": 2,
+        "failed": 1
+      },
+      "status": "completed",
+      "endpoint": "/chat/completions"
+    }
+  ],
+  "first_id": "batch_4ddc7b60-19a9-419b-8b93-b9a3274b33b5",
+  "has_more": false,
+  "last_id": "batch_6287485f-50fc-4efa-bcc5-b86690037f43"
+}
+```
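As a small, hypothetical post-processing sketch (not part of this commit), a page of list results like the output above can be tallied from the parsed JSON; only the fields shown in that output (`data`, `request_counts`, `has_more`) are assumed:

```python
def summarize_batch_page(page):
    """Tally request counts across one page of list-batches results and
    report whether more pages remain."""
    totals = {"total": 0, "completed": 0, "failed": 0}
    for batch in page.get("data", []):
        counts = batch.get("request_counts", {})
        for key in totals:
            totals[key] += counts.get(key, 0)
    return totals, page.get("has_more", False)

# With the two jobs shown above (3 requests each, 2 completed, 1 failed):
# totals, has_more = summarize_batch_page(json_data)
# totals -> {"total": 6, "completed": 4, "failed": 2}; has_more -> False
```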
