Commit ee427af

update
1 parent 60beb4e commit ee427af

11 files changed

+327
-16
lines changed

Lines changed: 315 additions & 0 deletions
@@ -0,0 +1,315 @@
---
title: 'How to configure Azure Blob Storage with Azure OpenAI Batch'
titleSuffix: Azure OpenAI
description: Learn how to configure Azure Blob Storage with Azure OpenAI Batch
manager: nitinme
ms.service: azure-ai-openai
ms.custom: references_regions
ms.topic: how-to
ms.date: 05/18/2025
author: mrbullwinkle
ms.author: mbullwin
recommendations: false
zone_pivot_groups: openai-fine-tuning-batch
---
15+
16+
# Configuring Azure Blob Storage for Azure OpenAI
17+
18+
Azure OpenAI now supports using [Azure Blob Storage](/azure/storage/blobs/storage-blobs-introduction) for Azure OpenAI Batch input and output files. By using your own storage, you aren't subject to batch file limits.
19+
20+
## Region Support
21+
22+
- australiaeast
23+
- eastus
24+
- germanywestcentral
25+
- northcentralus
26+
- polandcentral
27+
- swedencentral
28+
- switzerlandnorth
29+
- eastus2
30+
- westus
31+
32+
## Azure Blob Storage configuration
33+
34+
### Prerequisites
35+
36+
- An [Azure Blob Storage account](/azure/storage/blobs/storage-blobs-introduction).
37+
- An Azure OpenAI resource with a model of the deployment type `Global-Batch` or `DataZoneBatch` deployed. You can refer to the [resource creation and model deployment guide](../../how-to/create-resource.md) for help with this process.
38+
39+
### Managed identity
40+
41+
For your Azure OpenAI resource to securely access your Azure Blob Storage account, you need to set up your resource with a **system assigned managed identity**.
42+
43+
> [!NOTE]
44+
> Currently user assigned managed identities aren't supported.
45+
46+
1. Sign in to [https://portal.azure.com](https://portal.azure.com).
47+
2. Find your Azure OpenAI resource, and then select **Resource Management** > **Identity** > **System assigned** and set the status to **On**.
48+
49+
:::image type="content" source="../media/how-to/batch-blob-storage/identity.png" alt-text="Screenshot that shows system managed identity configuration." lightbox="../media/how-to/batch-blob-storage/identity.png":::
50+
51+
### Role-based access control
52+
53+
Once your Azure OpenAI resource has been configured for system assigned managed identity, you need to give it access to your Azure Blob Storage account.
54+
55+
1. From [https://portal.azure.com](https://portal.azure.com) find and select your Azure Blob Storage resource.
56+
2. Select **Access Control (IAM)** > **Add** > **Add role assignment**.
57+
58+
:::image type="content" source="../media/how-to/batch-blob-storage/access-control.png" alt-text="Screenshot that shows access control interface for an Azure Blob Storage resource." lightbox="../media/how-to/batch-blob-storage/access-control.png":::
59+
60+
3. Search for **Storage Blob Data Contributor** > **Next**.
61+
4. Select **Managed identity** > **+Select members** > Select your Azure OpenAI resource's managed identity.
62+
63+
:::image type="content" source="../media/how-to/batch-blob-storage/add-role.png" alt-text="Screenshot that shows Storage Blob Data Contributor role assignment." lightbox="../media/how-to/batch-blob-storage/add-role.png":::
64+
65+
If you prefer using custom roles for more granular access, the following permissions are required:
66+
67+
**Input data**:
68+
69+
- `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read`
70+
71+
**Output data/folders**:
72+
73+
- `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read`
74+
- `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write`
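
As a rough sketch of how the permissions above could be expressed as a custom role, the following Python builds a role definition document in the JSON shape used for Azure custom roles. The role name, description, and scope placeholders are hypothetical and only for illustration. Blob read and write are data-plane operations, so the sketch lists them under `DataActions`.

```python
import json

# Data-plane permissions from the lists above, grouped by purpose.
INPUT_DATA_ACTIONS = [
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
]
OUTPUT_DATA_ACTIONS = [
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
]


def build_role_definition(name, description, data_actions, scope):
    """Return a custom role definition as a plain dictionary."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": description,
        "Actions": [],
        "NotActions": [],
        # De-duplicate and sort so the output is stable.
        "DataActions": sorted(set(data_actions)),
        "NotDataActions": [],
        "AssignableScopes": [scope],
    }


# Hypothetical scope: replace the placeholders with your own values.
scope = (
    "/subscriptions/{SUBSCRIPTION-ID}/resourceGroups/{RESOURCE-GROUP}"
    "/providers/Microsoft.Storage/storageAccounts/{AZURE-BLOB-STORAGE-RESOURCE-NAME}"
)

role = build_role_definition(
    name="Batch blob access (example)",  # hypothetical role name
    description="Read batch input and read/write batch output blobs.",
    data_actions=INPUT_DATA_ACTIONS + OUTPUT_DATA_ACTIONS,
    scope=scope,
)

print(json.dumps(role, indent=2))
```

The sketch combines the input and output permissions into one role scoped to the storage account. If your input and output containers need different access, you could instead build two definitions and scope each one more narrowly.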
75+
76+
### Create containers
77+
78+
For this example, you'll create two containers named `batch-input` and `batch-output`. You can name these whatever you want, but if you use an alternate name you'll need to adjust the examples in the following steps.
79+
80+
To create a container, under **Data storage** select **+Container**, and then name your container.
81+
82+
:::image type="content" source="../media/how-to/batch-blob-storage/container.png" alt-text="Screenshot that shows Azure Blob Storage containers." lightbox="../media/how-to/batch-blob-storage/container.png":::
83+
84+
Once your containers are created, retrieve the URL for each container by selecting the container > **Settings** > **Properties**, and then copy the URLs.
85+
86+
In this case we have:
87+
88+
`https://{AZURE-BLOB-STORAGE-RESOURCE-NAME}.blob.core.windows.net/batch-input`
89+
`https://{AZURE-BLOB-STORAGE-RESOURCE-NAME}.blob.core.windows.net/batch-output`
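
If you'd rather build these URLs in code than copy them from the portal, a minimal sketch follows. The helper names are hypothetical, and the sketch assumes the `blob.core.windows.net` endpoint format shown above; `docstest002` is the storage account name used in this article's sample output.

```python
def container_url(storage_account_name: str, container_name: str) -> str:
    """Build a container URL in the format shown above."""
    return f"https://{storage_account_name}.blob.core.windows.net/{container_name}"


def blob_url(storage_account_name: str, container_name: str, blob_name: str) -> str:
    """Build the URL of a single blob inside a container."""
    return f"{container_url(storage_account_name, container_name)}/{blob_name}"


input_blob = blob_url("docstest002", "batch-input", "test.jsonl")
output_folder = container_url("docstest002", "batch-output")

print(input_blob)
print(output_folder)
```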
90+
91+
### Create input file
92+
93+
For this article, we'll create a file named `test.jsonl` and will copy the contents below to the file. You'll need to modify and add your global batch deployment name to each line of the file.
94+
```json
{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}
```
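
Editing the deployment name into every line by hand is error-prone, so the file can also be generated. The following is a minimal sketch that writes the same three requests to `test.jsonl`; the `build_request` helper is hypothetical, and the deployment name placeholder still needs to be replaced with your own.

```python
import json

# Replace with the name of your batch model deployment.
deployment_name = "REPLACE-WITH-MODEL-DEPLOYMENT-NAME"

questions = [
    "When was Microsoft founded?",
    "When was the first XBOX released?",
    "What is Altair Basic?",
]


def build_request(index, question, deployment):
    """Build one batch request in the same shape as the lines above."""
    return {
        "custom_id": f"task-{index}",
        "method": "POST",
        "url": "/chat/completions",
        "body": {
            "model": deployment,
            "messages": [
                {
                    "role": "system",
                    "content": "You are an AI assistant that helps people find information.",
                },
                {"role": "user", "content": question},
            ],
        },
    }


# One JSON object per line, as the JSON Lines format requires.
lines = [
    json.dumps(build_request(i, question, deployment_name))
    for i, question in enumerate(questions)
]

with open("test.jsonl", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
```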
100+
101+
### Upload input file
102+
103+
From your Azure Blob Storage account, open your **batch-input** container that you created previously.
104+
105+
Select **Upload** and select your `test.jsonl` file.
106+
107+
:::image type="content" source="../media/how-to/batch-blob-storage/upload.png" alt-text="Screenshot that shows Azure Storage Blob container upload UX." lightbox="../media/how-to/batch-blob-storage/upload.png":::
108+
109+
## Create batch job
110+
111+
# [Python](#tab/python)
112+
```python
import os
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Authenticate with Microsoft Entra ID instead of an API key
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_ad_token_provider=token_provider,
    api_version="2025-04-01-preview"
)

# No file is uploaded to Azure OpenAI; the input blob and output folder
# are passed in the request body instead of an input file ID
batch_response = client.batches.create(
    input_file_id=None,
    endpoint="/chat/completions",
    completion_window="24h",
    extra_body={
        "input_blob": "https://{AZURE-BLOB-STORAGE-RESOURCE-NAME}.blob.core.windows.net/batch-input/test.jsonl",
        "output_folder": {
            "url": "https://{AZURE-BLOB-STORAGE-RESOURCE-NAME}.blob.core.windows.net/batch-output",
        }
    }
)

# Save batch ID for later use
batch_id = batch_response.id

print(batch_response.model_dump_json(indent=2))
```
147+
148+
# [REST](#tab/rest)
149+
```HTTP
curl -X POST "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/batches?api-version=2025-04-01-preview" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": null,
    "endpoint": "/chat/completions",
    "completion_window": "24h",
    "input_blob": "https://{AZURE-BLOB-STORAGE-RESOURCE-NAME}.blob.core.windows.net/batch-input/test.jsonl",
    "output_folder": {
        "url": "https://{AZURE-BLOB-STORAGE-RESOURCE-NAME}.blob.core.windows.net/batch-output"
    }
}'
```
164+
165+
---
166+
167+
**Output:**
168+
```json
{
  "id": "batch_b632a805-797b-49ed-9c9c-86eb4057f2a2",
  "completion_window": "24h",
  "created_at": 1747516485,
  "endpoint": "/chat/completions",
  "input_file_id": null,
  "object": "batch",
  "status": "validating",
  "cancelled_at": null,
  "cancelling_at": null,
  "completed_at": null,
  "error_file_id": null,
  "errors": null,
  "expired_at": null,
  "expires_at": 1747602881,
  "failed_at": null,
  "finalizing_at": null,
  "in_progress_at": null,
  "metadata": null,
  "output_file_id": null,
  "request_counts": {
    "completed": 0,
    "failed": 0,
    "total": 0
  },
  "error_blob": "",
  "input_blob": "https://docstest002.blob.core.windows.net/batch-input/test.jsonl",
  "output_blob": ""
}
```
200+
201+
You can monitor the job's status the same way you would for a batch job that uses uploaded files, as outlined in our [comprehensive guide on using Azure OpenAI batch](./batch.md).
202+
```python
import time
import datetime

status = "validating"
while status not in ("completed", "failed", "canceled"):
    time.sleep(60)
    batch_response = client.batches.retrieve(batch_id)
    status = batch_response.status
    print(f"{datetime.datetime.now()} Batch Id: {batch_id}, Status: {status}")

if batch_response.status == "failed":
    for error in batch_response.errors.data:
        print(f"Error code {error.code} Message {error.message}")
```
218+
219+
**Output:**
220+
```cmd
2025-05-17 17:16:56.950427 Batch Id: batch_b632a805-797b-49ed-9c9c-86eb4057f2a2, Status: validating
2025-05-17 17:17:57.532054 Batch Id: batch_b632a805-797b-49ed-9c9c-86eb4057f2a2, Status: validating
2025-05-17 17:18:58.156793 Batch Id: batch_b632a805-797b-49ed-9c9c-86eb4057f2a2, Status: in_progress
2025-05-17 17:19:58.739708 Batch Id: batch_b632a805-797b-49ed-9c9c-86eb4057f2a2, Status: in_progress
2025-05-17 17:20:59.398508 Batch Id: batch_b632a805-797b-49ed-9c9c-86eb4057f2a2, Status: finalizing
2025-05-17 17:22:00.242371 Batch Id: batch_b632a805-797b-49ed-9c9c-86eb4057f2a2, Status: completed
```
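
The polling loop above runs until the job reaches one of its listed end states, however long that takes. As a hedged alternative, the sketch below wraps the same polling in a function with an overall timeout. The `wait_for_batch` name and its parameters are hypothetical, the end states are copied from the loop above, and a stand-in `fake_retrieve` function replaces the service call so the sketch runs on its own.

```python
import time
from types import SimpleNamespace

# End states copied from the polling loop above.
TERMINAL_STATUSES = ("completed", "failed", "canceled")


def wait_for_batch(retrieve, batch_id, poll_seconds=60, timeout_seconds=24 * 60 * 60):
    """Poll retrieve(batch_id) until an end state is reached or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Batch {batch_id} still has status {batch.status!r}")
        time.sleep(poll_seconds)


# Stand-in for the service call, used only to demonstrate the helper.
_fake_statuses = iter(["validating", "in_progress", "finalizing", "completed"])


def fake_retrieve(batch_id):
    return SimpleNamespace(id=batch_id, status=next(_fake_statuses))


result = wait_for_batch(fake_retrieve, "batch_example", poll_seconds=0)
print(result.status)
```

With the client from the earlier example, the call would look like `wait_for_batch(client.batches.retrieve, batch_id)`.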
229+
230+
Once the `status` is `completed` you can retrieve your `output_blob` path:
231+
```python
print(batch_response.model_dump_json(indent=2))
```
235+
236+
**Output:**
237+
```json
{
  "id": "batch_b632a805-797b-49ed-9c9c-86eb4057f2a2",
  "completion_window": "24h",
  "created_at": 1747516485,
  "endpoint": "/chat/completions",
  "input_file_id": null,
  "object": "batch",
  "status": "completed",
  "cancelled_at": null,
  "cancelling_at": null,
  "completed_at": 1747516883,
  "error_file_id": null,
  "errors": null,
  "expired_at": null,
  "expires_at": 1747602881,
  "failed_at": null,
  "finalizing_at": 1747516834,
  "in_progress_at": 1747516722,
  "metadata": null,
  "output_file_id": null,
  "request_counts": {
    "completed": 3,
    "failed": 0,
    "total": 3
  },
  "error_blob": "https://docstest002.blob.core.windows.net/batch-output/{GUID}/errors.jsonl",
  "input_blob": "https://docstest002.blob.core.windows.net/batch-input/test.jsonl",
  "output_blob": "https://docstest002.blob.core.windows.net/batch-output/{GUID}/results.jsonl"
}
```
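
The `output_blob` value in the response above is a full URL, while storage clients usually want the account name, container name, and blob path separately. The following sketch splits the URL with the standard library. The `split_blob_url` helper is hypothetical, and it assumes the `https://{account}.blob.core.windows.net/{container}/{path}` layout shown in the sample output.

```python
from urllib.parse import urlparse


def split_blob_url(blob_url):
    """Split a blob URL into (account name, container name, blob path)."""
    parsed = urlparse(blob_url)
    # The account name is the first label of the host name.
    account_name = parsed.netloc.split(".")[0]
    # The first path segment is the container; the rest is the blob path.
    container_name, _, blob_path = parsed.path.lstrip("/").partition("/")
    return account_name, container_name, blob_path


# Value copied from the sample output above, including its {GUID} placeholder.
output_blob = "https://docstest002.blob.core.windows.net/batch-output/{GUID}/results.jsonl"

account_name, container_name, blob_path = split_blob_url(output_blob)
print(account_name, container_name, blob_path)
```

In a real run, the URL could be read from the printed response, for example with `json.loads(batch_response.model_dump_json())["output_blob"]`.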
269+
270+
Once your batch job is complete, you can download the `error_blob` and `output_blob` via the Azure Blob Storage interface in the Azure portal or you can download programmatically:
271+
272+
> [!NOTE]
273+
> There's a known issue where an `error_blob` URL path is generated in the response even when no errors occurred. When this happens, the `errors.jsonl` path is invalid and the referenced file doesn't exist.
274+
```cmd
pip install azure-identity azure-storage-blob
```
278+
279+
Keep in mind that granting the Azure OpenAI resource programmatic access to your Azure Blob Storage account doesn't grant access to the user account that runs the following script. To download the results, you might also need to assign a role to that user account. For downloading the file, `Storage Blob Data Reader` access is sufficient.
280+
```python
# Import required libraries
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# Define storage account and container information
storage_account_name = "docstest002" # replace with your storage account name
container_name = "batch-output"

# Define the blob paths to download
blob_paths = [
    "{REPLACE-WITH-YOUR-GUID}/results.jsonl",
]

credential = DefaultAzureCredential()
account_url = f"https://{storage_account_name}.blob.core.windows.net"
blob_service_client = BlobServiceClient(account_url=account_url, credential=credential)
container_client = blob_service_client.get_container_client(container_name)

for blob_path in blob_paths:
    blob_client = container_client.get_blob_client(blob_path)

    file_name = blob_path.split("/")[-1]

    print(f"Downloading {file_name}...")
    with open(file_name, "wb") as file:
        download_stream = blob_client.download_blob()
        file.write(download_stream.readall())

    print(f"Downloaded {file_name} successfully!")
```
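
After `results.jsonl` is downloaded, each line can be matched back to its request through `custom_id`. The sketch below is illustrative only. It assumes each line is a JSON object with a `custom_id` and a `response.body` in the chat completions shape, which this article doesn't show, so check it against your own file. The `extract_answers` helper and the sample lines are made up for the demonstration.

```python
import json


def extract_answers(jsonl_lines):
    """Map each custom_id to the assistant's reply text, or None if absent."""
    answers = {}
    for raw_line in jsonl_lines:
        raw_line = raw_line.strip()
        if not raw_line:
            continue  # skip blank lines
        record = json.loads(raw_line)
        # Tolerate missing or null fields instead of raising.
        body = (record.get("response") or {}).get("body") or {}
        choices = body.get("choices") or []
        answers[record.get("custom_id")] = (
            choices[0]["message"]["content"] if choices else None
        )
    return answers


# Made-up sample lines in the assumed shape, used only to exercise the parser.
sample_lines = [
    json.dumps({
        "custom_id": "task-0",
        "response": {
            "status_code": 200,
            "body": {
                "choices": [
                    {"message": {"role": "assistant", "content": "example reply"}}
                ]
            },
        },
        "error": None,
    }),
    json.dumps({"custom_id": "task-1", "response": None, "error": {"message": "example error"}}),
]

answers = extract_answers(sample_lines)
print(answers)
```

To run it on the downloaded file, pass an open file handle, for example `extract_answers(open("results.jsonl", encoding="utf-8"))`.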
312+
313+
## See also
314+
315+
For more information on Azure OpenAI Batch, see the [comprehensive batch guide](./batch.md).

articles/ai-services/openai/how-to/batch.md

Lines changed: 1 addition & 11 deletions
@@ -69,20 +69,13 @@ The following models support global batch:
6969
|`gpt-4o` | 2024-08-06 |text + image |
7070
|`gpt-4o-mini`| 2024-07-18 | text + image |
7171
|`gpt-4o` | 2024-05-13 |text + image |
72-
|`gpt-4` | turbo-2024-04-09 | text |
73-
|`gpt-4` | 0613 | text |
74-
| `gpt-35-turbo` | 0125 | text |
75-
| `gpt-35-turbo` | 1106 | text |
76-
| `gpt-35-turbo` | 0613 | text |
77-
78-
Refer to the [models page](../concepts/models.md) for the most up-to-date information on regions/models where global batch is currently supported.
7972

8073
### API support
8174

8275
| | API Version |
8376
|---|---|
8477
|**Latest GA API release:**| `2024-10-21`|
85-
|**Latest Preview API release:**| `2025-03-01-preview`|
78+
|**Latest Supported Preview API release:**| `2025-04-01-preview`|
8679

8780
> [!NOTE]
8881
> While Global Batch supports older API versions, some models require newer preview API versions. For example, `o3-mini` isn't supported with `2024-10-21` since it was released after this date. To access the newer models with global batch use the latest preview API version.
@@ -94,9 +87,6 @@ The following aren't currently supported:
9487
- Integration with the Assistants API.
9588
- Integration with Azure OpenAI On Your Data feature.
9689

97-
> [!NOTE]
98-
> Structured outputs is now supported with Global Batch.
99-
10090
### Batch deployment
10191

10292
> [!NOTE]

articles/ai-services/openai/includes/batch/batch-python.md

Lines changed: 2 additions & 2 deletions
@@ -79,7 +79,7 @@ For this article we'll create a file named `test.jsonl` and will copy the conten
7979

8080
## Upload batch file
8181

82-
Once your input file is prepared, you first need to upload the file to then be able to kick off a batch job. File upload can be done both programmatically or via the Studio. This example uses environment variables in place of the key and endpoint values. If you're unfamiliar with using environment variables with Python refer to one of our [quickstarts](../../chatgpt-quickstart.md) where the process of setting up the environment variables in explained step-by-step.
82+
Once your input file is prepared, you first need to upload the file to then be able to initiate a batch job. File upload can be done both programmatically or via the Azure AI Foundry portal. This example demonstrates uploading a file directly to your Azure OpenAI resource. Alternatively, you can [configure Azure Blob Storage for Azure OpenAI Batch](../../how-to/batch-blob-storage.md).
8383

8484
# [Python (Microsoft Entra ID)](#tab/python-secure)
8585

@@ -95,7 +95,7 @@ token_provider = get_bearer_token_provider(
9595
client = AzureOpenAI(
9696
azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
9797
azure_ad_token_provider=token_provider,
98-
api_version="2025-03-01-preview"
98+
api_version="2025-04-01-preview"
9999
)
100100

101101
# Upload a file with a purpose of "batch"

articles/ai-services/openai/includes/batch/batch-rest.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ For this article we'll create a file named `test.jsonl` and will copy the conten
6969

7070
## Upload batch file
7171

72-
Once your input file is prepared, you first need to upload the file to then be able to kick off a batch job. File upload can be done both programmatically or via the Studio. This example uses environment variables in place of the key and endpoint values. If you're unfamiliar with using environment variables with Python refer to one of our [quickstarts](../../chatgpt-quickstart.md) where the process of setting up the environment variables in explained step-by-step.
72+
Once your input file is prepared, you first need to upload the file to then be able to initiate a batch job. File upload can be done both programmatically or via the Azure AI Foundry portal. This example demonstrates uploading a file directly to your Azure OpenAI resource. Alternatively, you can [configure Azure Blob Storage for Azure OpenAI Batch](../../how-to/batch-blob-storage.md).
7373

7474
[!INCLUDE [Azure key vault](~/reusable-content/ce-skilling/azure/includes/ai-services/security/azure-key-vault.md)]
7575

articles/ai-services/openai/includes/batch/batch-studio.md

Lines changed: 3 additions & 1 deletion
@@ -69,7 +69,8 @@ For this article, we'll create a file named `test.jsonl` and will copy the conte
6969

7070
## Upload batch file
7171

72-
Once your input file is prepared, you first need to upload the file to then be able to kick off a batch job. File upload can be done both programmatically or via the Studio.
72+
Once your input file is prepared, you first need to upload the file to then be able to initiate a batch job. File upload can be done both programmatically or via the Azure AI Foundry portal. This example demonstrates uploading a file directly to your Azure OpenAI resource. Alternatively, you can [configure Azure Blob Storage for Azure OpenAI Batch](../../how-to/batch-blob-storage.md).
73+
7374

7475
1. Sign in to [Azure AI Foundry portal](https://ai.azure.com).
7576
2. Select the Azure OpenAI resource where you have a global batch model deployment available.
@@ -81,6 +82,7 @@ Once your input file is prepared, you first need to upload the file to then be a
8182

8283
:::image type="content" source="../../media/how-to/global-batch/upload-file.png" alt-text="Screenshot that shows upload file experience." lightbox="../../media/how-to/global-batch/upload-file.png":::
8384

85+
8486
## Create batch job
8587

8688
Select **Create** to start your batch job.