If you prefer to work directly with the REST API instead of a [Cloudflare Worker](/workers-ai/features/batch-api/get-started/), the following steps show you how.
`src/content/docs/workers-ai/features/batch-api/get-started.mdx`
If you want to skip the steps and get started quickly, click the button below:

[Deploy to Cloudflare](https://deploy.workers.cloudflare.com/?url=https://github.com/craigsdennis/batch-please-workers-ai)
This will create a repository in your GitHub account and deploy a ready-to-use Worker that demonstrates how to use Cloudflare's Asynchronous Batch API. The template includes preconfigured AI bindings, and examples for sending and retrieving batch requests with and without external references. Once deployed, you can visit the live Worker and start experimenting with the Batch API immediately.
## 1. Prerequisites and setup
## 4. How to use the Batch API
### Sending a Batch request
Send your initial batch inference request by composing a JSON payload containing an array of individual inference requests.
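As a sketch, assembling that payload might look like this. Note that `BatchRequest` and `buildBatchPayload` are illustrative helpers invented for this example, not part of the Workers AI API; the optional `external_reference` field follows the template's mention of correlating requests with external references:

```typescript
// Illustrative sketch: assembling a batch payload of individual inference requests.
// `BatchRequest` and `buildBatchPayload` are hypothetical helpers, not API surface.
interface BatchRequest {
  prompt: string;
  external_reference?: string; // optional ID to correlate results with inputs
}

function buildBatchPayload(prompts: string[]): { requests: BatchRequest[] } {
  return {
    requests: prompts.map((prompt, i) => ({
      prompt,
      external_reference: `prompt-${i}`,
    })),
  };
}

const payload = buildBatchPayload([
  "What is the capital of France?",
  "Summarize the plot of Hamlet.",
]);
```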
#### Expected Response
After sending your batch request, you will receive a response similar to:
```json output
{
  "status": "queued",
  "request_id": "000-000-000",
  "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
}
```
- **`request_id`**: A unique identifier for the batch request.
- **`model`**: The model used for the batch inference.
### Polling the Batch Request Status
Once your batch request is queued, use the `request_id` to poll for its status. While the request is still in the queue or being processed, the API returns a status of `queued` or `running`.
```typescript title=example
// Polling the status of the batch request using the request_id
const status = await env.AI.run("@cf/meta/llama-3.3-70b-instruct-fp8-fast", {
  request_id: "000-000-000",
});
```
### Retrieving the Batch Inference results
When the inference is complete, the API returns a final HTTP status code of `200` along with an array of responses. Each response object corresponds to an individual input prompt, identified by an `id` that maps to the index of the prompt in your original request.
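Because each `id` maps back to the index of the original prompt, results can be re-associated with their inputs along these lines. The exact result shape depends on the model, so the `BatchResult` type here is a simplified assumption:

```typescript
// Simplified sketch: pairing each batch response with its original prompt by `id`.
// `BatchResult` is an assumed shape; real responses vary by model.
interface BatchResult {
  id: number;
  result: { response: string };
}

function pairResults(
  prompts: string[],
  results: BatchResult[],
): Array<{ prompt: string; response: string }> {
  return results.map((r) => ({
    prompt: prompts[r.id], // `id` is the index of the prompt in the original request
    response: r.result.response,
  }));
}
```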
- **`success`**: A Boolean flag indicating if the request was processed successfully.
- **`usage`**: Contains token usage details for the batch request.
## 5. Implementing the Batch API in your Worker
Below is a sample TypeScript Worker that receives a batch of inference requests, sends them to a batch-enabled AI model, and returns the results.
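A minimal sketch of such a Worker might look like the following. The `queueRequest` option and model name follow the examples in this guide, and the hand-written `Env` interface is a simplification; in a real project the AI binding type is generated by Wrangler:

```typescript
// Minimal sketch of a batch-enabled Worker. `Env` is simplified for illustration;
// in a real project the AI binding type comes from your generated Worker types.
interface Env {
  AI: {
    run(
      model: string,
      input: unknown,
      options?: { queueRequest?: boolean },
    ): Promise<unknown>;
  };
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("Expected a POST request", { status: 405 });
    }
    // The payload carries an array of individual inference requests.
    const { requests } = (await request.json()) as { requests: unknown[] };

    // Queue the whole batch in one call; the response includes a request_id to poll.
    const queued = await env.AI.run(
      "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
      { requests },
      { queueRequest: true },
    );

    return new Response(JSON.stringify(queued), {
      headers: { "Content-Type": "application/json" },
    });
  },
};

export default worker;
```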
- **Receiving the Batch request:** The Worker expects a `POST` request with a `JSON` payload containing an array called `requests`. Each prompt is an individual inference request.
- **Returning the results:** Once processed, the AI API returns the batch responses. These responses include an array where each object has an `id` (matching the prompt index) and the corresponding inference result.
## 6. Deployment
After completing your changes, deploy your Worker with the following command:
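Assuming the project is managed with Wrangler (as in the template above), the standard deploy command is:

```sh
npx wrangler deploy
```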
This guide will walk you through the concepts behind asynchronous batch processing and explain why it matters.
## What is Asynchronous Batch?
Asynchronous batch processing lets you send a collection (batch) of inference requests in a single call. Instead of expecting immediate responses for every request, the system queues them for processing and returns the results later.
When you send a batch request, the API immediately acknowledges receipt with a status like `queued` and provides a unique `request_id`. This ID is later used to poll for the final responses once the processing is complete.
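The send-then-poll flow described above can be sketched generically. The `checkStatus` callback below is a stand-in assumption for whichever call you use to fetch the batch status (AI binding or REST API), not a real API:

```typescript
// Generic sketch of the poll-until-complete flow. `checkStatus` is a stand-in
// for the actual status call; it is not part of the Batch API itself.
type BatchStatus =
  | { status: "queued" | "running" }
  | { status: "complete"; responses: unknown[] };

async function waitForBatch(
  requestId: string,
  checkStatus: (id: string) => Promise<BatchStatus>,
  intervalMs = 2000,
): Promise<unknown[]> {
  for (;;) {
    const s = await checkStatus(requestId);
    if (s.status === "complete") {
      return s.responses;
    }
    // Still queued or running: wait before polling again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```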
You can use the Batch API either by creating and deploying a Cloudflare Worker that leverages the [Batch API with the AI binding](/workers-ai/features/batch-api/get-started/), by using the [REST API](/workers-ai/features/batch-api/batch-api-rest-api/) directly, or by starting from a template.