Commit b85bae9
edits

1 parent 5ea7532 commit b85bae9

3 files changed: +34 −37 lines

src/content/docs/workers-ai/features/batch-api/batch-api-rest-api.mdx

Lines changed: 24 additions & 24 deletions

````diff
@@ -7,51 +7,51 @@ sidebar:
 
 import { Render, PackageManagers, WranglerConfig, CURL } from "~/components";
 
-If you prefer to work directly with the REST API instead of a Cloudflare Worker, below are the steps on how to do it:
+If you prefer to work directly with the REST API instead of a [Cloudflare Worker](/workers-ai/features/batch-api/get-started/), follow the steps below:
 
 ## 1. Sending a Batch Request
 
 Make a POST request to the following endpoint:
 
 <CURL
-  url="https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
+  url="https://api.cloudflare.com/client/v4/accounts/&lt;account-id&gt;/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
   method="POST"
   headers={{
     Authorization: "<token>",
     "Content-Type": "application/json",
   }}
+  json={{
+    requests: [
+      {
+        prompt: "Tell me a story",
+        external_reference: "reference2",
+      },
+      {
+        prompt: "Tell me a joke",
+        external_reference: "reference1",
+      },
+    ],
+  }}
+  code={{
+    mark: "external_reference",
+  }}
 />
 
-```json output
-{
-  "requests": [
-    {
-      "prompt": "Tell me a story",
-      "external_reference": "reference2"
-    },
-    {
-      "prompt": "Tell me a joke",
-      "external_reference": "reference1"
-    }
-  ]
-}
-```
-
 ## 2. Retrieving the Batch Response
 
 After receiving a `request_id` from your initial POST, you can poll for or retrieve the results with another POST request:
 
 <CURL
-  url="https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast"
+  url="https://api.cloudflare.com/client/v4/accounts/&lt;account-id&gt;/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
   method="POST"
   headers={{
     Authorization: "<token>",
     "Content-Type": "application/json",
   }}
+  json={{
+    request_id: "<uuid>",
+  }}
+  code={{
+    mark: "request_id",
+  }}
 />
-
-```json output
-{
-  "request_id": "<uuid>"
-}
-```
````
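Taken together, the two REST calls in this file amount to a queue-then-poll flow. Below is a minimal TypeScript sketch of how a client might construct those requests; the `buildQueueRequest`/`buildPollRequest` helper names are illustrative, `ACCOUNT_ID` and `API_TOKEN` are placeholders, and the `Bearer` prefix on the `Authorization` header is an assumption about the token type, not something the diff specifies:

```typescript
// Illustrative sketch (not part of the documented API) of the two REST
// calls described above. ACCOUNT_ID and API_TOKEN are placeholders.
const ACCOUNT_ID = "<account-id>";
const API_TOKEN = "<token>";
const MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";
const BASE = `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/${MODEL}`;

interface BatchPrompt {
  prompt: string;
  external_reference?: string; // echoed back so responses can be matched to inputs
}

// Step 1: queue a batch of prompts (?queueRequest=true marks the call as asynchronous).
function buildQueueRequest(requests: BatchPrompt[]) {
  return {
    url: `${BASE}?queueRequest=true`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${API_TOKEN}`, "Content-Type": "application/json" },
      body: JSON.stringify({ requests }),
    },
  };
}

// Step 2: poll for results with the request_id returned by step 1.
function buildPollRequest(request_id: string) {
  return {
    url: `${BASE}?queueRequest=true`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${API_TOKEN}`, "Content-Type": "application/json" },
      body: JSON.stringify({ request_id }),
    },
  };
}
```

Each helper returns the two arguments you would pass to `fetch(url, init)`.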

src/content/docs/workers-ai/features/batch-api/get-started.mdx

Lines changed: 8 additions & 11 deletions

````diff
@@ -11,7 +11,7 @@ If you want to skip the steps and get started quickly, click the button below:
 
 [![Deploy to Workers](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/craigsdennis/batch-please-workers-ai)
 
-This will create a repository in your GitHub account and deploy a ready-to-use Worker that demonstrates how to use Cloudflare's Async Batch API. The template includes preconfigured AI bindings, and examples for sending and retrieving batch requests with and without external references. Once deployed, you can visit the live Worker and start experimenting with the Batch API immediately.
+This will create a repository in your GitHub account and deploy a ready-to-use Worker that demonstrates how to use Cloudflare's Asynchronous Batch API. The template includes preconfigured AI bindings and examples for sending and retrieving batch requests with and without external references. Once deployed, you can visit the live Worker and start experimenting with the Batch API immediately.
 
 ## 1. Prerequisites and setup
 
@@ -65,7 +65,7 @@ Your binding is [available in your Worker code](/workers/reference/migrate-to-mo
 
 ## 4. How to use the Batch API
 
-### 1. Sending a Batch request
+### Sending a Batch request
 
 Send your initial batch inference request by composing a JSON payload containing an array of individual inference requests.
 
@@ -107,11 +107,9 @@ const resp = env.AI.run(
 );
 ```
 
-#### Expected Response
-
 After sending your batch request, you will receive a response similar to:
 
-```json
+```json output
 {
   "status": "queued",
   "request_id": "000-000-000",
@@ -123,11 +121,11 @@ After sending your batch request, you will receive a response similar to:
 - **`request_id`**: A unique identifier for the batch request.
 - **`model`**: The model used for the batch inference.
 
-### 2. Polling the Batch Request Status
+### Polling the Batch Request Status
 
 Once your batch request is queued, use the `request_id` to poll for its status. During processing, the API returns a status of `queued` or `running`, indicating that the request is still in the queue or being processed.
 
-```javascript title=example
+```typescript title=example
 // Polling the status of the batch request using the request_id
 const status = env.AI.run("@cf/meta/llama-3.3-70b-instruct-fp8-fast", {
   request_id: "000-000-000",
@@ -141,7 +139,7 @@ const status = env.AI.run("@cf/meta/llama-3.3-70b-instruct-fp8-fast", {
 }
 ```
 
-### 3. Retrieving the Batch Inference results
+### Retrieving the Batch Inference results
 
 When the inference is complete, the API returns a final HTTP status code of `200` along with an array of responses. Each response object corresponds to an individual input prompt, identified by an `id` that maps to the index of the prompt in your original request.
 
@@ -190,7 +188,7 @@ When the inference is complete, the API returns a final HTTP status code of `200
 - **`success`**: A Boolean flag indicating if the request was processed successfully.
 - **`usage`**: Contains token usage details for the batch request.
 
-## 6. Implementing the Batch API in your Worker
+## 5. Implementing the Batch API in your Worker
 
 Below is a sample TypeScript Worker that receives a batch of inference requests, sends them to a batch-enabled AI model, and returns the results.
 
@@ -250,7 +248,6 @@ export default {
 };
 ```
 
-
 - **Receiving the Batch request:**
   The Worker expects a `POST` request with a `JSON` payload containing an array called `requests`. Each prompt is an individual inference request.
 
@@ -260,7 +257,7 @@ export default {
 - **Returning the results:**
   Once processed, the AI API returns the batch responses. These responses include an array where each object has an `id` (matching the prompt index) and the corresponding inference result.
 
-## 7. Deployment
+## 6. Deployment
 
 After completing your changes, deploy your Worker with the following command:
````
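The queued/running lifecycle described in this file lends itself to a small retry helper. The sketch below is not code from the docs: `pollBatch` is a hypothetical wrapper that accepts any async status check (inside a Worker that check would be the `env.AI.run(model, { request_id })` call) and loops until the status leaves `queued`/`running`:

```typescript
// Hypothetical polling helper, assuming the "queued"/"running"/complete
// lifecycle described in the documentation.
type BatchStatus = { status: string; responses?: unknown[] };

async function pollBatch(
  check: () => Promise<BatchStatus>, // in a Worker: () => env.AI.run(model, { request_id })
  delayMs = 1000,
  maxAttempts = 10,
): Promise<BatchStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await check();
    // Anything other than "queued"/"running" is treated as terminal.
    if (result.status !== "queued" && result.status !== "running") return result;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("batch did not complete within the polling budget");
}
```

The delay and attempt cap are arbitrary illustrative defaults; a production Worker would likely use backoff tuned to its workload.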

src/content/docs/workers-ai/features/batch-api/index.mdx

Lines changed: 2 additions & 2 deletions

````diff
@@ -7,14 +7,14 @@ sidebar:
 
 import { Render, PackageManagers, WranglerConfig, CURL } from "~/components";
 
-This guide will walk you through the concepts behind asynchronous batch processing, explain why it matters, and show you how to create and deploy a Cloudflare Worker that leverages the [Batch API with the AI binding](/workers-ai/features/batch-api/get-started/), working with [REST API](/workers-ai/features/batch-api/batch-api-rest-api/) instead of a Cloudflare Worker and through the a template.
-
 ## What is Asynchronous Batch?
 
 Asynchronous batch processing lets you send a collection (batch) of inference requests in a single call. Instead of expecting immediate responses for every request, the system queues them for processing and returns the results later.
 
 When you send a batch request, the API immediately acknowledges receipt with a status like `queued` and provides a unique `request_id`. This ID is later used to poll for the final responses once the processing is complete.
 
+You can use the Batch API by either creating and deploying a Cloudflare Worker that leverages the [Batch API with the AI binding](/workers-ai/features/batch-api/get-started/), using the [REST API](/workers-ai/features/batch-api/batch-api-rest-api/) directly, or starting from a template.
+
 :::note[Note]
 
 Ensure that the total payload is under 10 MB.
````
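The 10 MB limit in that note can be checked client-side before queueing. This is an illustrative guard, not part of the documented API; only the limit value comes from the docs:

```typescript
// Illustrative pre-flight check against the 10 MB batch payload limit
// noted in the documentation. The helper itself is an assumption.
const MAX_BATCH_BYTES = 10 * 1024 * 1024;

function batchPayloadFits(payload: { requests: { prompt: string }[] }): boolean {
  // Measure the serialized size in bytes, since that is what goes over the wire.
  const bytes = new TextEncoder().encode(JSON.stringify(payload)).length;
  return bytes <= MAX_BATCH_BYTES;
}
```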
