Batch API Support for Async Workloads (up to 50% cost savings) #1597
SebConejo started this conversation in Feature request
Problem
Major LLM providers (OpenAI, Anthropic, Google) offer Batch APIs, a separate endpoint where you upload a file of requests (JSONL) and the provider processes them asynchronously, typically within 24 hours. In exchange for relaxed latency, batch requests are ~50% cheaper than standard API calls.
Today, Manifest routes requests in real time, one at a time. There is no way to leverage batch endpoints through the router. Users who need batch processing are forced to bypass Manifest entirely and go directly to providers, losing the benefits of intelligent routing and cost optimization.
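For reference, the provider batch endpoints consume a JSONL file where each line is a self-contained request. A minimal sketch of building one in OpenAI's documented batch input format (the model name and prompts are placeholders):

```python
import json

# Each JSONL line pairs a custom_id (used to match results back to
# requests later) with a standard chat-completions request body.
prompts = ["Classify: 'great product'", "Classify: 'arrived broken'"]

lines = []
for i, prompt in enumerate(prompts):
    lines.append(json.dumps({
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

The resulting file is uploaded once; the provider processes every line asynchronously and returns a matching output file.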
What Batch API Looks Like
Unlike standard request/response, batch is a multi-step workflow: upload a JSONL file of requests, create a batch job referencing that file, poll the job until it reaches a terminal status (typically within 24 hours), then download the output file and match results to requests by ID.
Typical use cases: evaluations, bulk classification, dataset labeling, content generation at scale, synthetic data, anything that doesn't need an instant response.
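The submit/poll/retrieve lifecycle can be sketched against a stand-in client (`FakeBatchClient` below is a stub, not a real SDK; a real integration would call the provider's batch endpoints and poll far less aggressively):

```python
import time

class FakeBatchClient:
    """Stand-in for a provider SDK: completes a batch after a few polls."""
    def __init__(self):
        self._polls = 0

    def create_batch(self, input_file_id):
        return {"id": "batch_123", "status": "in_progress"}

    def retrieve_batch(self, batch_id):
        self._polls += 1
        status = "completed" if self._polls >= 3 else "in_progress"
        return {"id": batch_id, "status": status, "output_file_id": "file_out"}

def run_batch(client, input_file_id, poll_interval=0.01):
    # 1) submit the job, 2) poll until a terminal status,
    # 3) return the handle to the results file for download.
    batch = client.create_batch(input_file_id)
    while batch["status"] not in ("completed", "failed", "expired"):
        time.sleep(poll_interval)  # real code would back off (minutes, not ms)
        batch = client.retrieve_batch(batch["id"])
    return batch

result = run_batch(FakeBatchClient(), "file_in")
print(result["status"])  # completed
```

This lifecycle is exactly what a router could own on the user's behalf, so callers submit once and collect results when ready.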
Why It Matters
Proposal
Add batch mode support to Manifest, enabling the router to accept asynchronous batch workloads, submit them to provider batch endpoints, and manage the poll/retrieve lifecycle on the user's behalf.
This would make Manifest useful for both real-time and async workloads, covering a much larger share of LLM API spend.
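Purely illustrative, since the discussion doesn't specify an interface: one way a router could extend its cost optimization to batch is by selecting a backend against discounted batch pricing rather than real-time pricing. All provider names and prices below are made-up placeholders:

```python
# Hypothetical cost table: provider -> price per 1M input tokens.
# An assumed ~50% batch discount is applied uniformly; real discounts
# and prices vary by provider and model.
REALTIME_PRICES = {"provider_a": 2.50, "provider_b": 3.00, "provider_c": 1.25}
BATCH_DISCOUNT = 0.5

def cheapest_batch_provider(prices, discount=BATCH_DISCOUNT):
    # Same selection logic a router already runs for real-time traffic,
    # but evaluated against discounted batch pricing.
    return min(prices, key=lambda p: prices[p] * discount)

print(cheapest_batch_provider(REALTIME_PRICES))  # provider_c
```

In practice the router would also weigh per-provider batch limits, completion windows, and model availability, not just price.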
👍 React if this would be useful for your workflow.