
Commit b64ad45

feat(ai-gateway): Add how-to and samples for Azure AI /batches and /files processing (#3281)
1 parent 167607d commit b64ad45

File tree

10 files changed, +598 -28 lines changed


app/_how-tos/azure-batches.md

Lines changed: 337 additions & 0 deletions
@@ -0,0 +1,337 @@
---
title: Send batch requests to Azure OpenAI LLMs
content_type: how_to
related_resources:
  - text: AI Gateway
    url: /ai-gateway/
  - text: AI Proxy
    url: /plugins/ai-proxy/

description: Reduce costs by using llm/v1/files and llm/v1/batches route_types to send asynchronous batched requests to Azure OpenAI.

products:
  - gateway
  - ai-gateway

works_on:
  - on-prem
  - konnect

min_version:
  gateway: '3.11'

plugins:
  - ai-proxy

entities:
  - service
  - route
  - plugin

tags:
  - ai
  - azure

tldr:
  q: How can I run many Azure OpenAI LLM requests at once?
  a: |
    Package your prompts into a JSONL file and upload it to the `/files` endpoint. Then launch a batch job with `/batches` to process everything asynchronously, and download the output from `/files` once the run completes.

tools:
  - deck

prereqs:
  inline:
    - title: Azure OpenAI
      icon_url: /assets/icons/azure.svg
      content: |
        This tutorial uses the Azure OpenAI Service. Configure it as follows:

        1. [Create an Azure account](https://azure.microsoft.com/en-us/get-started/azure-portal).
        2. In the Azure Portal, click **Create a resource**.
        3. Search for **Azure OpenAI** and select **Azure OpenAI Service**.
        4. Configure your Azure resource.
        5. Export your instance name:
           ```bash
           export DECK_AZURE_INSTANCE_NAME='YOUR_AZURE_RESOURCE_NAME'
           ```
        6. Deploy your model in [Azure AI Foundry](https://ai.azure.com/):
           1. Go to **My assets → Models and deployments → Deploy model**.

              {:.warning}
              > Use a `globalbatch` or `datazonebatch` deployment type for batch operations, since standard deployments (`GlobalStandard`) can't process batch files.

           2. Export the API key and deployment ID:
              ```bash
              export DECK_AZURE_OPENAI_API_KEY='YOUR_AZURE_OPENAI_MODEL_API_KEY'
              export DECK_AZURE_DEPLOYMENT_ID='YOUR_AZURE_OPENAI_DEPLOYMENT_NAME'
              ```
    - title: Batch .jsonl file
      content: |
        To complete this tutorial, create a `batch.jsonl` file to generate asynchronous batched LLM responses. We use `/v1/chat/completions` because it handles chat-based generation requests, instructing the LLM to produce conversational completions in batch mode.

        Run the following command to create the file:

        ```bash
        cat <<EOF > batch.jsonl
        {"custom_id": "prod1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a compelling product description for a solar-powered smart garden light."}], "max_tokens": 60}}
        {"custom_id": "prod2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a product description for an energy-efficient smart thermostat for home use."}], "max_tokens": 60}}
        {"custom_id": "prod3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write an engaging product description for a biodegradable bamboo kitchen utensil set."}], "max_tokens": 60}}
        {"custom_id": "prod4", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a detailed product description for a water-saving smart shower head."}], "max_tokens": 60}}
        {"custom_id": "prod5", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a concise product description for a compact indoor air purifier that uses natural filters."}], "max_tokens": 60}}
        EOF
        ```
  entities:
    services:
      - files-service
      - batches-service
    routes:
      - files-route
      - batches-route

cleanup:
  inline:
    - title: Clean up Konnect environment
      include_content: cleanup/platform/konnect
      icon_url: /assets/icons/gateway.svg
    - title: Destroy the {{site.base_gateway}} container
      include_content: cleanup/products/gateway
      icon_url: /assets/icons/gateway.svg

automated_tests: false
---

## Configure the AI Proxy plugin for the /files route

Let's create an AI Proxy plugin for the `llm/v1/files` route type. It handles the upload and retrieval of JSONL files containing batch input and output data, ensuring that input data is correctly staged for batch processing and that the results can be downloaded once the batch job completes.

{% entity_examples %}
entities:
  plugins:
    - name: ai-proxy
      service: files-service
      config:
        model_name_header: false
        route_type: llm/v1/files
        auth:
          header_name: Authorization
          header_value: Bearer ${azure_key}
        model:
          provider: azure
          options:
            azure_api_version: "2025-01-01-preview"
            azure_instance: ${azure_instance}
            azure_deployment_id: ${azure_deployment}
variables:
  azure_key:
    value: "$AZURE_OPENAI_API_KEY"
  azure_instance:
    value: "$AZURE_INSTANCE_NAME"
  azure_deployment:
    value: "$AZURE_DEPLOYMENT_ID"
{% endentity_examples %}
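
If you want to confirm the plugin was created, you can query the {{site.base_gateway}} Admin API. This is a minimal sketch, assuming the Admin API is listening on the default `localhost:8001` and `jq` is installed:

```bash
# List configured plugins and show which service each instance is attached to
curl -s http://localhost:8001/plugins | jq '.data[] | {name: .name, service: .service}'
```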
132+
133+
## Configure AI Proxy plugins for /batches route
134+
135+
Next, create an AI Proxy plugin for the `llm/v1/batches` route. This plugin manages the submission, monitoring, and retrieval of asynchronous batch jobs. It communicates with Azure OpenAI's batch deployment to process multiple LLM requests in a batch.
136+
137+
{% entity_examples %}
138+
entities:
139+
plugins:
140+
- name: ai-proxy
141+
service: batches-service
142+
config:
143+
model_name_header: false
144+
route_type: llm/v1/batches
145+
auth:
146+
header_name: Authorization
147+
header_value: Bearer ${azure_key}
148+
model:
149+
provider: azure
150+
options:
151+
azure_api_version: "2025-01-01-preview"
152+
azure_instance: ${azure_instance}
153+
azure_deployment_id: ${azure_deployment}
154+
variables:
155+
azure_key:
156+
value: "$AZURE_OPENAI_API_KEY"
157+
azure_instance:
158+
value: "$AZURE_INSTANCE_NAME"
159+
azure_deployment:
160+
value: "$AZURE_DEPLOYMENT_ID"
161+
{% endentity_examples %}
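
If you manage your gateway configuration declaratively with decK, you can apply both plugins in one sync. A sketch, assuming your entities are exported to a `kong.yaml` file:

```bash
# Push the declarative configuration (Services, Routes, and plugins) to the gateway
deck gateway sync kong.yaml
```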

## Upload a .jsonl file for batching

Now, let's use the following command to upload our [batching file](/#batch-jsonl-file) to the `/files` route:

<!-- vale off -->
{% validation request-check %}
url: "/files"
status_code: 200
form_data:
  purpose: "batch"
  file: "@batch.jsonl"
extract_body:
  - name: 'id'
    variable: FILE_ID
{% endvalidation %}
<!-- vale on -->
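
For reference, the rendered upload request is roughly equivalent to this curl call (a sketch, assuming {{site.base_gateway}} is proxying requests on `localhost:8000`):

```bash
# Upload the JSONL batch input as multipart form data
curl -s -X POST http://localhost:8000/files \
  -F purpose=batch \
  -F file=@batch.jsonl
```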

Once processed, you will see a JSON response like this:

```json
{
  "status": "processed",
  "bytes": 1648,
  "purpose": "batch",
  "filename": "batch.jsonl",
  "id": "file-da4364d8fd714dd9b29706b91236ab02",
  "created_at": 1761817541,
  "object": "file"
}
```

Now, let's export the file ID:

```bash
export FILE_ID=YOUR_FILE_ID
```
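
If you have `jq` installed, you can capture the ID in one step instead of copying it by hand. A sketch (note that this runs the upload again):

```bash
# Upload the file and extract the returned file ID in one command
export FILE_ID=$(curl -s -X POST http://localhost:8000/files \
  -F purpose=batch \
  -F file=@batch.jsonl | jq -r '.id')
```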

## Create a batching request

Now, we can send a `POST` request to the `/batches` route to create a batch using our uploaded file:

{:.info}
> The completion window must be set to `24h`, as it's the only value currently supported by the [OpenAI `/batches` API](https://platform.openai.com/docs/api-reference/batch/create).
>
> In this example, we use the `/v1/chat/completions` endpoint for batching because we're sending multiple structured chat-style prompts in OpenAI's chat completions format to be processed in bulk.

<!-- vale off -->
{% validation request-check %}
url: '/batches'
status_code: 201
body:
  input_file_id: $FILE_ID
  endpoint: "/v1/chat/completions"
  completion_window: "24h"
extract_body:
  - name: 'id'
    variable: BATCH_ID
{% endvalidation %}
<!-- vale on -->
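
The rendered request corresponds roughly to the following curl call (a sketch; `--json` requires curl 7.82 or later):

```bash
# Create the batch job from the uploaded input file
curl -s -X POST http://localhost:8000/batches \
  --json "{
    \"input_file_id\": \"$FILE_ID\",
    \"endpoint\": \"/v1/chat/completions\",
    \"completion_window\": \"24h\"
  }"
```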
222+
223+
You will receive a response similar to:
224+
225+
```json
226+
{
227+
"cancelled_at": null,
228+
"cancelling_at": null,
229+
"completed_at": null,
230+
"completion_window": "24h",
231+
"created_at": 1761817562,
232+
"error_file_id": "",
233+
"expired_at": null,
234+
"expires_at": 1761903959,
235+
"failed_at": null,
236+
"finalizing_at": null,
237+
"id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
238+
"in_progress_at": null,
239+
"input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
240+
"errors": null,
241+
"metadata": null,
242+
"object": "batch",
243+
"output_file_id": "",
244+
"request_counts": {
245+
"total": 0,
246+
"completed": 0,
247+
"failed": 0
248+
},
249+
"status": "validating",
250+
"endpoint": ""
251+
}
252+
```
253+
{:.no-copy-code}
254+
255+
256+
Copy the batch ID from this response to check the batch status and export it as an environment variable by running the following command in your terminal:
257+
258+
```bash
259+
export BATCH_ID=YOUR_BATCH_ID
260+
```
261+
262+
## Check batching status
263+
264+
Wait for a moment for the batching request to be completed, then check the status of your batch by sending the following request:
265+
266+
<!-- vale off -->
267+
{% validation request-check %}
268+
url: /batches/$BATCH_ID
269+
extract_body:
270+
- name: 'output_file_id'
271+
variable: OUTPUT_FILE_ID
272+
{% endvalidation %}
273+
<!-- vale on -->
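
If you'd rather poll until the job finishes, a small loop like the following works. This is a sketch that assumes the proxy is on `localhost:8000` and `jq` is installed:

```bash
# Poll every 10 seconds until the batch reports "completed"
while [ "$(curl -s http://localhost:8000/batches/$BATCH_ID | jq -r '.status')" != "completed" ]; do
  echo "Batch still running..."
  sleep 10
done
echo "Batch completed."
```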
274+
275+
A completed batch response looks like this:
276+
277+
```json
278+
{
279+
"cancelled_at": null,
280+
"cancelling_at": null,
281+
"completed_at": 1761817685,
282+
"completion_window": "24h",
283+
"created_at": 1761817562,
284+
"error_file_id": null,
285+
"expired_at": null,
286+
"expires_at": 1761903959,
287+
"failed_at": null,
288+
"finalizing_at": 1761817662,
289+
"id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
290+
"in_progress_at": null,
291+
"input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
292+
"errors": null,
293+
"metadata": null,
294+
"object": "batch",
295+
"output_file_id": "file-93d91f55-0418-abcd-1234-81f4bb334951",
296+
"request_counts": {
297+
"total": 5,
298+
"completed": 5,
299+
"failed": 0
300+
},
301+
"status": "completed",
302+
"endpoint": "/v1/chat/completions"
303+
}
304+
```
305+
{:.no-copy-code}
306+
307+
You can notice The `"request_counts"` object shows that all five requests in the batch were successfully completed (`"completed": 5`, `"failed": 0`).
308+
309+
310+
Now, you can copy the `output_file_id` to retrieve your batched responses and export it as environment variable:
311+
312+
```bash
313+
export OUTPUT_FILE_ID=YOUR_OUTPUT_FILE_ID
314+
```
315+
316+
The output file ID will only be available once the batch request has completed. If the status is `"in_progress"`, it won’t be set yet.
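
Alternatively, once the batch shows `"status": "completed"`, you can pull the ID straight from the status response with `jq` (a sketch):

```bash
# Extract the output file ID from the batch status response
export OUTPUT_FILE_ID=$(curl -s http://localhost:8000/batches/$BATCH_ID | jq -r '.output_file_id')
```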

## Retrieve batched responses

Now, we can download the batched responses from the `/files` endpoint by appending `/content` to the file ID URL. For details, see the [OpenAI API documentation](https://platform.openai.com/docs/api-reference/files/retrieve-contents).

```bash
curl http://localhost:8000/files/$OUTPUT_FILE_ID/content > batched-response.jsonl
```

This command saves the batched responses to the `batched-response.jsonl` file.

The batched response file contains one JSON object per line, each representing the response to a single batched request. Here's an example of the content of `batched-response.jsonl`, which contains the individual completion results for each request we submitted in the batch input file:

```json
{"custom_id": "prod4", "response": {"body": {"id": "chatcmpl-AB12CD34EF56GH78IJ90KL12MN", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoFlow Smart Shower Head: Revolutionize Your Daily Routine While Saving Water**\n\nExperience the perfect blend of luxury, sustainability, and smart technology with the **EcoFlow Smart Shower Head** — a cutting-edge solution for modern households looking to conserve water without compromising on comfort. Designed to elevate your shower experience", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-111aaa22-bb33-cc44-dd55-ee66ff778899", "status_code": 200}, "error": null}
{"custom_id": "prod3", "response": {"body": {"id": "chatcmpl-ZX98YW76VU54TS32RQ10PO98LK", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Eco-Friendly Elegance: Biodegradable Bamboo Kitchen Utensil Set**\n\nElevate your cooking experience while making a positive impact on the planet with our **Biodegradable Bamboo Kitchen Utensil Set**. Crafted from 100% natural, sustainably sourced bamboo, this set combines durability, functionality", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-222bbb33-cc44-dd55-ee66-ff7788990011", "status_code": 200}, "error": null}
{"custom_id": "prod1", "response": {"body": {"id": "chatcmpl-MN34OP56QR78ST90UV12WX34YZ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Illuminate Your Garden with Brilliance: The Solar-Powered Smart Garden Light** \n\nTransform your outdoor space into a haven of sustainable beauty with the **Solar-Powered Smart Garden Light**—a perfect blend of modern innovation and eco-friendly design. Powered entirely by the sun, this smart light delivers effortless", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-333ccc44-dd55-ee66-ff77-889900112233", "status_code": 200}, "error": null}
{"custom_id": "prod5", "response": {"body": {"id": "chatcmpl-AQ12WS34ED56RF78TG90HY12UJ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "Breathe easy with our compact indoor air purifier, designed to deliver fresh and clean air using natural filters. This eco-friendly purifier quietly removes allergens, dust, and odors without synthetic materials, making it perfect for any small space. Stylish, efficient, and sustainable—experience pure air, naturally.", "refusal": null, "annotations": []}, "finish_reason": "stop", "logprobs": null}], "usage": {"completion_tokens": 59, "prompt_tokens": 33, "total_tokens": 92}, "system_fingerprint": "fp_random1234"},"request_id": "req-444ddd55-ee66-ff77-8899-001122334455", "status_code": 200}, "error": null}
{"custom_id": "prod2", "response": {"body": {"id": "chatcmpl-PO98LK76JI54HG32FE10DC98VB", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoSmart Pro Wi-Fi Thermostat: Energy Efficiency Meets Smart Technology** \n\nUpgrade your home’s comfort and save energy with the EcoSmart Pro Wi-Fi Thermostat. Designed for modern living, this sleek and intuitive thermostat lets you take control of your heating and cooling while minimizing energy waste. Whether you're", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-555eee66-ff77-8899-0011-223344556677", "status_code": 200}, "error": null}
```
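
To read just the generated text out of the results, you can map over the JSONL with `jq` (a sketch):

```bash
# Print each custom_id alongside its generated completion
jq -r '"\(.custom_id): \(.response.body.choices[0].message.content)"' batched-response.jsonl
```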

Lines changed: 2 additions & 2 deletions
@@ -1,11 +1,11 @@
 {% if page.works_on contains 'konnect' %}
 <div data-deployment-topology="konnect" markdown="1" {% unless config.skip == true %} data-test-step="{{ config.data_validate_konnect | escape }}" {% endunless %}>
-{% include how-tos/validations/request-check/snippet.md url=config.konnect_url headers=config.headers body=config.body body_cmd=config.body_cmd method=config.method user=config.user sleep=config.sleep display_headers=config.display_headers cookie_jar=config.cookie_jar cookie=config.cookie message=config.message mtls=config.mtls count=config.count insecure=config.insecure expected_headers=config.expected_headers %}
+{% include how-tos/validations/request-check/snippet.md url=config.konnect_url headers=config.headers form_data=config.form_data body=config.body body_cmd=config.body_cmd method=config.method user=config.user sleep=config.sleep display_headers=config.display_headers cookie_jar=config.cookie_jar cookie=config.cookie message=config.message mtls=config.mtls count=config.count insecure=config.insecure expected_headers=config.expected_headers %}
 </div>
 {% endif %}

 {% if page.works_on contains 'on-prem' %}
 <div data-deployment-topology="on-prem" markdown="1" {% unless config.skip == true %} data-test-step="{{ config.data_validate_on_prem | escape }}" {% endunless %}>
-{% include how-tos/validations/request-check/snippet.md url=config.on_prem_url headers=config.headers body=config.body body_cmd=config.body_cmd method=config.method user=config.user sleep=config.sleep display_headers=config.display_headers cookie_jar=config.cookie_jar cookie=config.cookie message=config.message mtls=config.mtls count=config.count insecure=config.insecure expected_headers=config.expected_headers %}
+{% include how-tos/validations/request-check/snippet.md url=config.on_prem_url headers=config.headers form_data=config.form_data body=config.body body_cmd=config.body_cmd method=config.method user=config.user sleep=config.sleep display_headers=config.display_headers cookie_jar=config.cookie_jar cookie=config.cookie message=config.message mtls=config.mtls count=config.count insecure=config.insecure expected_headers=config.expected_headers %}
 </div>
 {% endif %}

app/_includes/how-tos/validations/request-check/snippet.md

Lines changed: 3 additions & 3 deletions
@@ -9,12 +9,12 @@
 ```bash
 {% if include.capture -%}
 {{include.capture}}=$({% endif %}{% if include.sleep %}sleep {{include.sleep}} && {% endif %}{% if count > 1%}for _ in {1..{{count}}}; do
-{% endif %} curl {% if include.insecure %}-k {% endif %}{% if include.display_headers %}-i {% endif %}{% if include.method %}-X {{include.method}} {% endif %}{% if include.mtls%}-k --key key.pem --cert cert.pem {% endif %}"{% if is_https %}https://{% endif %}{{ include.url }}" \
---no-progress-meter --fail-with-body {% if include.headers %} \{%- endif -%}{% for header in include.headers %}
+{% endif %}curl {% if include.insecure %}-k {% endif %}{% if include.display_headers %}-i {% endif %}{% if include.method %}-X {{include.method}} {% endif %}{% if include.mtls%}-k --key key.pem --cert cert.pem {% endif %}"{% if is_https %}https://{% endif %}{{ include.url }}"{% if include.headers %} \{%- endif -%}{% for header in include.headers %}
 -H "{{header}}" {%- unless forloop.last -%} \{% endunless %}{%- endfor %}{% if include.user %} \
 -u {{include.user}}{%- endif %}{% if include.cookie_jar %} \
 --cookie-jar {{include.cookie_jar}}{%- endif %}{% if include.cookie %} \
---cookie {{include.cookie}}{%- endif %}{% if include.body %} \
+--cookie {{include.cookie}}{%- endif %}{% if include.form_data %} \{% for data in include.form_data %}
+-F {{data[0]}}="{{data[1]}}" {% unless forloop.last -%} \{% endunless %}{%- endfor %}{% endif %}{% if include.body %} \
 --json '{{ include.body | json_prettify: 1 | escape_env_variables | indent: 4 | strip }}'{% elsif include.body_cmd %} \
 --json "{{ include.body_cmd }}"{% endif %}{% if include.jq %} | jq -r "{{ include.jq | strip }}"{% endif %}{% if include.capture -%}
 ){% endif -%}
