Commit c1ff1e2

minor fixes
1 parent 870d2fa

File tree

1 file changed (+14, -7)

src/content/docs/workers-ai/features/async-batch-api.mdx

Lines changed: 14 additions & 7 deletions
@@ -65,11 +65,17 @@ binding = "AI"
 
 Your binding is [available in your Worker code](/workers/reference/migrate-to-module-workers/#bindings-in-es-modules-format) on [`env.AI`](/workers/runtime-apis/handlers/fetch/).
 
-## 4. How to use the Batch API
+## 5. How to use the Batch API
 
 ### 1. Sending a Batch request
 
-Send your initial batch inference request by composing a JSON payload containing an array of individual inference requests. Ensure that the total payload is under 25 MB.
+Send your initial batch inference request by composing a JSON payload containing an array of individual inference requests.
+
+:::note[Note]
+
+Ensure that the total payload is under 25 MB.
+
+:::
 
 ```javascript title=Example code
 // Input: JSON with an array of individual request JSONs
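The rest of the doc's example code falls outside this hunk. As a hedged sketch of the shape described above, a batch payload might be composed along these lines; the `requests` field name and the per-request fields are assumptions based on the surrounding prose, not a copy of the doc's example:

```typescript
// Hedged sketch: composing a batch payload as an array of individual
// inference requests. Field names here are assumptions for illustration.
const batchPayload = {
  requests: [
    { prompt: "Summarize the benefits of asynchronous batch inference." },
    { prompt: "Translate 'good morning' into French." },
    { prompt: "Classify the sentiment of: 'The service was fantastic.'" },
  ],
};

// Per the note above, the serialized payload must stay under 25 MB.
const payloadBytes = new TextEncoder().encode(JSON.stringify(batchPayload)).length;
if (payloadBytes > 25 * 1024 * 1024) {
  throw new Error(`Batch payload is too large: ${payloadBytes} bytes`);
}
```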
@@ -183,7 +189,7 @@ When the inference is complete, the API returns a final HTTP status code of `200
 - **`success`**: A Boolean flag indicating if the request was processed successfully.
 - **`usage`**: Contains token usage details for the batch request.
 
-## 5. Implementing the Batch API in your Worker
+## 6. Implementing the Batch API in your Worker
 
 Below is a sample TypeScript Worker that receives a batch of inference requests, sends them to a batch-enabled AI model, and returns the results.
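The sample Worker itself is not part of this hunk. As a rough sketch of the pattern described above, a batch-forwarding Worker could look roughly like the following; the `requests` field, the `queueRequest: true` option on `env.AI.run()`, and the model name are assumptions drawn from elsewhere on this page rather than from the doc's sample code:

```typescript
// Hedged sketch of a Worker that forwards a batch of inference requests
// to a batch-enabled model and returns the response. Details are assumptions.
export interface Env {
  AI: Ai; // the Workers AI binding configured as `binding = "AI"`
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Expect a JSON body of the form { requests: [ ...individual requests ] }.
    const { requests } = (await request.json()) as { requests: object[] };

    // Forward the whole array to a batch-enabled model. `queueRequest: true`
    // mirrors the `?queueRequest=true` query parameter used by the REST API below.
    const batchResponse = await env.AI.run(
      "@cf/meta/ray-llama-3.3-70b-instruct-fp8-fast",
      { requests },
      { queueRequest: true },
    );

    // The response carries fields such as `success` and `usage`
    // (token usage for the batch), as listed above.
    return Response.json(batchResponse);
  },
};
```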

@@ -263,9 +269,10 @@ If you prefer to work directly with the REST API instead of a Cloudflare Worker,
 Make a POST request to the following endpoint:
 
 ```bash
-POST https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/ray-llama-3.3-70b-instruct-fp8-fast?queueRequest=true
-Authorization: <token>
-Content-Type: application/json
+curl --request POST \
+  --url "https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/ray-llama-3.3-70b-instruct-fp8-fast?queueRequest=true" \
+  --header "Authorization: <token>" \
+  --header "Content-Type: application/json"
 ```
 
 #### Request Payload Example
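The payload example itself falls outside this hunk. For reference, an equivalent call can be made with `fetch`; the body shape below is an assumption based on the earlier description of a batch request, not a copy of the doc's example:

```typescript
// Hedged sketch: calling the queued (async batch) REST endpoint with fetch.
// <account-id> and <token> are placeholders, as in the curl example above.
const accountId = "<account-id>";
const apiToken = "<token>";

const response = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/@cf/meta/ray-llama-3.3-70b-instruct-fp8-fast?queueRequest=true`,
  {
    method: "POST",
    headers: {
      "Authorization": apiToken,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      // Assumed shape: an array of individual inference requests.
      requests: [
        { prompt: "Summarize this support ticket in one sentence." },
        { prompt: "List three follow-up questions for the customer." },
      ],
    }),
  },
);

console.log(await response.json());
```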
@@ -303,7 +310,7 @@ Content-Type: application/json
 }
 ```
 
-## 6. Deployment
+## 7. Deployment
 
 After completing your changes, deploy your Worker with the following command:
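The deploy command itself falls outside this hunk; for a Worker project this step is typically `npx wrangler deploy` run from the project root, though the exact command the doc shows is not visible in this diff.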
