
Commit f5ca4bc

Small updates
1 parent 1c8f8ce commit f5ca4bc


2 files changed: +6 -2 lines changed


src/content/docs/workers-ai/features/batch-api/batch-api-rest-api.mdx

Lines changed: 4 additions & 2 deletions

@@ -14,7 +14,7 @@ If you prefer to work directly with the REST API instead of a [Cloudflare Worker
 Make a POST request to the following endpoint:
 
 <CURL
-	url="https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/ray-llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
+	url="https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
 	method="POST"
 	headers={{
 		Authorization: "<token>",
@@ -37,12 +37,14 @@ Make a POST request to the following endpoint:
 	}}
 />
 
+You can pass `external_reference` as a unique ID per-prompt that will be returned in the response.
+
 ## 2. Retrieving the Batch Response
 
 After receiving a `request_id` from your initial POST, you can poll for or retrieve the results with another POST request:
 
 <CURL
-	url="https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/ray-llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
+	url="https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast?queueRequest=true"
 	method="POST"
 	headers={{
 		Authorization: "<token>",
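The submission step in the diff above ("make a POST request to the endpoint with `queueRequest=true`") can be sketched outside the `<CURL>` component. This is a rough illustration only: the account ID, token, request payload shape (`"requests"`), and response field names (`"result"`, `"request_id"`) are assumptions, not confirmed by this commit.

```python
# Hypothetical sketch of submitting a batch inference request to the
# corrected endpoint, using only the Python standard library.
import json
import urllib.request

MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast"


def batch_url(account_id: str, model: str = MODEL) -> str:
    # queueRequest=true asks Workers AI to queue the batch for asynchronous
    # processing instead of answering synchronously.
    return (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/{model}?queueRequest=true"
    )


def submit_batch(account_id: str, token: str, prompts: list) -> str:
    req = urllib.request.Request(
        batch_url(account_id),
        data=json.dumps({"requests": prompts}).encode(),  # payload shape assumed
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["result"]["request_id"]  # response field names assumed
```

The returned `request_id` is what the "Retrieving the Batch Response" step above polls with.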

src/content/docs/workers-ai/features/batch-api/index.mdx

Lines changed: 2 additions & 0 deletions

@@ -11,6 +11,8 @@ import { Render, PackageManagers, WranglerConfig, CURL } from "~/components";
 
 Asynchronous batch processing lets you send a collection (batch) of inference requests in a single call. Instead of expecting immediate responses for every request, the system queues them for processing and returns the results later.
 
+Batch processing is useful for large workloads such as summarization or embeddings when there is no human interaction. Using the Batch API guarantees that your requests are eventually fulfilled, rather than erroring out if Cloudflare does not have enough capacity at a given time.
+
 When you send a batch request, the API immediately acknowledges receipt with a status like `queued` and provides a unique `request_id`. This ID is later used to poll for the final responses once the processing is complete.
 
 You can use the Batch API by either creating and deploying a Cloudflare Worker that leverages the [Batch API with the AI binding](/workers-ai/features/batch-api/get-started/), using the [REST API](/workers-ai/features/batch-api/batch-api-rest-api/) directly, or by starting from a [template](https://github.com/craigsdennis/batch-please-workers-ai).
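The `queued`/`request_id` flow this file describes (submit, get acknowledged as `queued`, poll until processing completes) can be sketched as a polling loop. The status value `queued` comes from the text; the payload shape, response field names, and reuse of the submission endpoint for polling are assumptions for illustration.

```python
# Hypothetical polling loop for a queued batch request, stdlib only.
import json
import time
import urllib.request


def poll_payload(request_id: str) -> dict:
    # Body sent when asking for the results of an earlier batch submission.
    return {"request_id": request_id}


def poll_batch(url: str, token: str, request_id: str,
               interval: float = 5.0, attempts: int = 12) -> dict:
    data = json.dumps(poll_payload(request_id)).encode()
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    for _ in range(attempts):
        req = urllib.request.Request(url, data=data, headers=headers,
                                     method="POST")
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        # Keep waiting while the batch is still queued
        # (field names "result"/"status" are assumed).
        if body.get("result", {}).get("status") != "queued":
            return body
        time.sleep(interval)
    raise TimeoutError(f"batch {request_id} still queued after {attempts} polls")
```

The loop returns as soon as the reported status is anything other than `queued`; a production version would also distinguish failure states from completed results.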
