Conversation


@vgaggia vgaggia commented Oct 15, 2025

Description

Adds support for Anthropic's Batch API with a new settings toggle, enabling 50% cost savings on API requests through asynchronous batch processing.

Changes

  • Added anthropicUseBatchApi setting toggle in API Configuration
  • Implemented batch processing via client.messages.batches API
  • Added silent polling with 5s intervals (max 10 minutes timeout)
  • Displays user-friendly message about 50% cost savings when enabled
  • Maintains full compatibility with:
    • Prompt caching
    • Extended thinking
    • 1M context beta flag
    • All supported Claude models

Implementation Details

  • Uses proper batch request structure with custom_id tracking
  • Handles beta headers correctly (including 1M context)
  • Polls batch status until processing_status === 'ended'
  • Retrieves results via client.messages.batches.results()
  • Supports both text and thinking content blocks (see the lifecycle sketch below)
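
For orientation, here is a minimal sketch of the lifecycle described above (create, poll, retrieve), using the SDK calls this PR names; the model id, token limit, and prompt are placeholder assumptions, not values from the PR:

import Anthropic from "@anthropic-ai/sdk"

async function runBatchExample() {
	const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY })

	// 1. Create the batch with a tracked custom_id.
	const batch = await client.messages.batches.create({
		requests: [
			{
				custom_id: `req_${Date.now()}`,
				params: {
					model: "claude-sonnet-4-5", // placeholder model id
					max_tokens: 8192, // placeholder limit
					messages: [{ role: "user", content: "Hello" }],
				},
			},
		],
	})

	// 2. Poll until processing_status === "ended" (the PR uses 5s intervals with a 10-minute cap).
	let status = await client.messages.batches.retrieve(batch.id)
	while (status.processing_status !== "ended") {
		await new Promise((resolve) => setTimeout(resolve, 5000))
		status = await client.messages.batches.retrieve(batch.id)
	}

	// 3. Stream the results and read text/thinking content blocks.
	for await (const entry of client.messages.batches.results(batch.id)) {
		if (entry.result.type === "succeeded") {
			for (const block of entry.result.message.content) {
				if (block.type === "text") console.log(block.text)
			}
		}
	}
}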

Testing

  • TypeScript build passes
  • Linting passes
  • Works in development mode
  • VSIX builds successfully

Documentation

Official Anthropic Batch API docs: https://docs.anthropic.com/en/api/creating-message-batches

Closes #8667


Important

Adds support for Anthropic's Batch API with a new setting for 50% cost savings through async batch processing.

  • Behavior:
    • Adds anthropicUseBatchApi setting in provider-settings.ts for 50% cost savings via async batch processing.
    • Implements batch processing in AnthropicHandler in anthropic.ts using client.messages.batches API.
    • Polls batch status every 5s, with a 10-minute timeout.
    • Displays message about cost savings when batch API is enabled.
    • Maintains compatibility with prompt caching, extended thinking, 1M context beta, and all Claude models.
  • Implementation:
    • Uses createBatchMessage() in anthropic.ts to handle batch job lifecycle.
    • Adjusts cost calculation in getModel() and createMessage() to apply 50% discount.
  • UI:
    • Adds checkbox for anthropicUseBatchApi in Anthropic.tsx.
    • Updates i18n strings in settings.json for new batch API settings.

This description was created by Ellipsis for ccabc68.

@vgaggia vgaggia requested review from cte, jr and mrubens as code owners October 15, 2025 15:00
@dosubot dosubot bot added the size:L and enhancement labels Oct 15, 2025
// Add 1M context beta flag if enabled for Claude Sonnet 4 and 4.5
if (
	(modelId === "claude-sonnet-4-20250514" || modelId === "claude-sonnet-4-5") &&
	this.options.anthropicBeta1MContext
) {
	betas.push("context-1m-2025-08-07")
}

The prompt caching beta header is missing when creating batch requests with prompt caching support. When supportsPromptCaching(modelId) returns true, the "prompt-caching-2024-07-31" beta should be added to the betas array (similar to line 128 in the streaming path). Without this header, prompt caching won't work correctly in batch mode even though cache breakpoints are being added to the messages.

Suggested change
// Add 1M context beta flag if enabled for Claude Sonnet 4 and 4.5
if (
	(modelId === "claude-sonnet-4-20250514" || modelId === "claude-sonnet-4-5") &&
	this.options.anthropicBeta1MContext
) {
	betas.push("context-1m-2025-08-07")
}
// Add 1M context beta flag if enabled for Claude Sonnet 4 and 4.5
if (
	(modelId === "claude-sonnet-4-20250514" || modelId === "claude-sonnet-4-5") &&
	this.options.anthropicBeta1MContext
) {
	betas.push("context-1m-2025-08-07")
}
// Add prompt caching beta if model supports it
if (this.supportsPromptCaching(modelId)) {
	betas.push("prompt-caching-2024-07-31")
}

},
],
},
batchOptions as any,

The as any type assertion bypasses TypeScript's type checking and could hide type mismatches. The Anthropic SDK's types should be used directly without casting. If the types don't match, either the SDK types need updating or the code needs adjustment to match the actual API contract.
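
As an illustration, one way to drop the cast is to derive the parameter type from the SDK method itself rather than guessing at exported type names; a minimal sketch, assuming the client shape used elsewhere in this PR (the model id and limits are placeholders):

import Anthropic from "@anthropic-ai/sdk"

// Derive the batch-create params type from the SDK method signature,
// so the object is fully type-checked without `as any`.
type BatchCreateParams = Parameters<Anthropic["messages"]["batches"]["create"]>[0]

function buildBatchParams(model: string, prompt: string): BatchCreateParams {
	return {
		requests: [
			{
				custom_id: `req_${Date.now()}`,
				params: {
					model, // placeholder: caller supplies the model id
					max_tokens: 8192, // placeholder limit
					messages: [{ role: "user", content: prompt }],
				},
			},
		],
	}
}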

Comment on lines 378 to 396
while (Date.now() - startTime < BATCH_MAX_POLL_TIME_MS) {
	const status = await this.client.messages.batches.retrieve(batch.id)

	if (status.processing_status === "ended") {
		completedBatch = status
		break
	}

	// Wait before next poll
	await new Promise((resolve) => setTimeout(resolve, BATCH_POLL_INTERVAL_MS))
}

The batch status handling is incomplete. According to Anthropic's Batch API documentation, processing_status can be "in_progress", "canceling", "ended", "expired", or "canceled". Currently, only "ended" is checked, which means if a batch expires or is canceled, the code will wait until the timeout and throw a generic timeout error instead of providing a specific error message about the actual failure state.

Suggested change
while (Date.now() - startTime < BATCH_MAX_POLL_TIME_MS) {
	const status = await this.client.messages.batches.retrieve(batch.id)
	if (status.processing_status === "ended") {
		completedBatch = status
		break
	}
	// Wait before next poll
	await new Promise((resolve) => setTimeout(resolve, BATCH_POLL_INTERVAL_MS))
}
while (Date.now() - startTime < BATCH_MAX_POLL_TIME_MS) {
	const status = await this.client.messages.batches.retrieve(batch.id)
	if (status.processing_status === "ended") {
		completedBatch = status
		break
	}
	// Handle failure states
	if (status.processing_status === "expired") {
		throw new Error("Batch request expired before completing")
	}
	if (status.processing_status === "canceled") {
		throw new Error("Batch request was canceled")
	}
	// Wait before next poll
	await new Promise((resolve) => setTimeout(resolve, BATCH_POLL_INTERVAL_MS))
}
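
(One nuance worth noting: "canceling" is a transitional state rather than a terminal one, so the loop should keep polling through it; a later commit on this branch adjusted the handling to fail only on actual terminal error states rather than on any unknown status.)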

Comment on lines +318 to +316
/**
* Creates a message using the Batch API for 50% cost savings.
* This method handles the async batch job lifecycle: create, poll, and retrieve results.
*/
private async *createBatchMessage(

The new Batch API functionality lacks test coverage. Given the complexity of the batch processing lifecycle (polling, timeout handling, result retrieval) and the potential for errors at each stage, this code should have comprehensive tests. Consider adding tests that cover:

  • Successful batch processing with prompt caching enabled
  • Successful batch processing without prompt caching
  • Batch timeout scenarios
  • Batch expiration/cancellation scenarios
  • Error handling in batch results
  • Verification that the 50% cost discount is applied correctly
  • Verification that beta headers are included when needed

The existing anthropic.spec.ts provides a good pattern to follow for mocking the SDK's batch API methods.
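
For illustration, a minimal vitest-style sketch of how the SDK's batch methods could be mocked (the test runner, mock names, and result shapes here are assumptions, not the repo's actual fixtures):

import { it, expect, vi } from "vitest"

// Hypothetical mocks mirroring the batch lifecycle: create, retrieve, results.
const mockBatches = {
	create: vi.fn().mockResolvedValue({ id: "batch_123", processing_status: "in_progress" }),
	retrieve: vi.fn().mockResolvedValue({ id: "batch_123", processing_status: "ended" }),
	results: vi.fn().mockImplementation(async function* () {
		yield {
			custom_id: "req_1",
			result: {
				type: "succeeded",
				message: {
					content: [{ type: "text", text: "hi" }],
					usage: { input_tokens: 10, output_tokens: 5 },
				},
			},
		}
	}),
}

it("surfaces text from succeeded batch results", async () => {
	const texts: string[] = []
	for await (const entry of mockBatches.results()) {
		if (entry.result.type === "succeeded") {
			for (const block of entry.result.message.content) {
				if (block.type === "text") texts.push(block.text)
			}
		}
	}
	expect(texts).toEqual(["hi"])
})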

@hannesrudolph hannesrudolph added the Issue/PR - Triage label Oct 15, 2025
Comment on lines 444 to 447
} else if (result.result.type === "errored") {
	const errorType = result.result.error.type
	throw new Error(`Batch request failed: ${errorType}`)
}

The error handling only includes the error type but omits the error message, making debugging difficult. Anthropic's batch API error responses include both type and message fields. Including the full error context would help users understand what went wrong with their batch request.

Suggested change
} else if (result.result.type === "errored") {
	const errorType = result.result.error.type
	throw new Error(`Batch request failed: ${errorType}`)
}
} else if (result.result.type === "errored") {
	const error = result.result.error
	throw new Error(`Batch request failed: ${error.type}${error.message ? ` - ${error.message}` : ""}`)
}

Author

This makes sense, I think I will add this.

Comment on lines 407 to 449
	// Process results
	for await (const result of results) {
		if (result.result.type === "succeeded") {
			const message = result.result.message

			// Yield content blocks
			for (const content of message.content) {
				if (content.type === "text") {
					yield { type: "text", text: content.text }
				} else if (content.type === "thinking") {
					yield { type: "reasoning", text: content.thinking }
				}
			}

			// Yield usage information
			const usage = message.usage
			yield {
				type: "usage",
				inputTokens: usage.input_tokens || 0,
				outputTokens: usage.output_tokens || 0,
				cacheWriteTokens: usage.cache_creation_input_tokens || undefined,
				cacheReadTokens: usage.cache_read_input_tokens || undefined,
			}

			// Calculate and yield cost
			yield {
				type: "usage",
				inputTokens: 0,
				outputTokens: 0,
				totalCost: calculateApiCostAnthropic(
					this.getModel().info,
					usage.input_tokens || 0,
					usage.output_tokens || 0,
					usage.cache_creation_input_tokens || 0,
					usage.cache_read_input_tokens || 0,
				),
			}
		} else if (result.result.type === "errored") {
			const errorType = result.result.error.type
			throw new Error(`Batch request failed: ${errorType}`)
		}
	}
}

The batch results processing doesn't validate that at least one successful result was returned. If the batch completes but all results are errors (or the results iterator is empty), the user would see the "Using Batch API" notification but receive no response text or usage information. This could happen if the batch request was malformed or if there were API-level issues. Consider tracking whether any successful result was processed and throwing an appropriate error if not.
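
A minimal sketch of the suggested guard, with a hypothetical result shape mirroring the snippet above:

// Hypothetical wrapper: pass results through while remembering whether any
// request succeeded, and fail loudly if none did.
type BatchResultEntry = { result: { type: "succeeded" | "errored" } }

async function* requireAnySuccess<T extends BatchResultEntry>(results: AsyncIterable<T>): AsyncGenerator<T> {
	let sawSuccess = false
	for await (const entry of results) {
		if (entry.result.type === "succeeded") {
			sawSuccess = true
		}
		yield entry
	}
	if (!sawSuccess) {
		throw new Error("Batch completed but returned no successful results")
	}
}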

Author

I don't think we need to do this.

{
	requests: [
		{
			custom_id: `req_${Date.now()}`,

The custom_id uses only Date.now() which could theoretically cause collisions if multiple batch requests are initiated in the same millisecond (e.g., in high-concurrency scenarios or automated testing). While unlikely in typical usage, a more robust approach would include additional entropy to guarantee uniqueness.

Suggested change
custom_id: `req_${Date.now()}`,
custom_id: `req_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`,
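
As a side note, on runtimes where it is available, Node's crypto.randomUUID() would guarantee uniqueness more directly and avoids the deprecated String.prototype.substr used above.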

Comment on lines 272 to 275
// Apply 50% discount for Batch API (applies after 1M context pricing if both enabled)
if (this.options.anthropicUseBatchApi) {
	info = {
		...info,
		inputPrice: typeof info.inputPrice === "number" ? info.inputPrice * 0.5 : undefined,
		outputPrice: typeof info.outputPrice === "number" ? info.outputPrice * 0.5 : undefined,
		cacheWritesPrice: typeof info.cacheWritesPrice === "number" ? info.cacheWritesPrice * 0.5 : undefined,
		cacheReadsPrice: typeof info.cacheReadsPrice === "number" ? info.cacheReadsPrice * 0.5 : undefined,
	}
}

The 50% batch API discount is applied in both getModel() (line 273-281) and in useSelectedModel.ts (line 398-409). While these serve different purposes (backend cost calculation vs UI pricing display), this duplication could lead to maintenance issues if the discount logic needs to change. Consider extracting the discount calculation into a shared utility function to ensure consistency and reduce duplication.
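
A later commit on this branch reports extracting exactly such a helper (applyBatchApiDiscount() in src/shared/cost.ts); a minimal sketch of what it could look like, with the ModelInfo fields inferred from the getModel() snippet above:

// Sketch only: the real ModelInfo type lives in the repo's shared types.
interface ModelInfo {
	inputPrice?: number
	outputPrice?: number
	cacheWritesPrice?: number
	cacheReadsPrice?: number
}

const BATCH_API_DISCOUNT = 0.5

// Apply the Batch API discount uniformly to every defined price field.
export function applyBatchApiDiscount(info: ModelInfo): ModelInfo {
	const discount = (price?: number) => (typeof price === "number" ? price * BATCH_API_DISCOUNT : undefined)
	return {
		...info,
		inputPrice: discount(info.inputPrice),
		outputPrice: discount(info.outputPrice),
		cacheWritesPrice: discount(info.cacheWritesPrice),
		cacheReadsPrice: discount(info.cacheReadsPrice),
	}
}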


// Batch API polling configuration
const BATCH_POLL_INTERVAL_MS = 5000 // Poll every 5 seconds
const BATCH_MAX_POLL_TIME_MS = 600000 // Max 10 minutes polling

The timeout constant is set to 10 minutes (600,000ms), but the PR description states "max 5 minutes timeout". This discrepancy between code and documentation could confuse users about the actual timeout behavior. Either update the constant to match the documented 5 minutes (300000) or update the PR description to reflect the 10-minute timeout.

Author

Changed this in the PR comment, although I don't think it's that big of an issue really.

vgaggia added a commit to vgaggia/Roo-Code that referenced this pull request Oct 16, 2025
Addresses roomote bot feedback on PR RooCodeInc#8672:

- Enhanced error handling to include full error details via JSON.stringify()
  for better debugging when batch requests fail
- Extracted batch API discount calculation into shared applyBatchApiDiscount()
  utility function in src/shared/cost.ts to eliminate code duplication
  between backend (anthropic.ts) and frontend (useSelectedModel.ts)
- Added documentation comment explaining custom_id generation approach
  and future considerations for multiple requests per batch

Adds toggle to enable Anthropic's Batch API for async message processing with 50% cost reduction. Includes:
- Backend implementation with proper prompt caching support
- Beta header support for 1M context compatibility
- UI pricing display updates to show 50% discount
- Settings UI toggle with translations

Add translations for anthropicBatchApiLabel and anthropicBatchApiDescription
across all 17 supported languages (ca, de, es, fr, hi, id, it, ja, ko, nl, pl,
pt-BR, ru, tr, vi, zh-CN, zh-TW)

- Enhanced error messages with JSON.stringify() for full error context
- Extracted applyBatchApiDiscount() utility to shared cost.ts
- Added documentation for custom_id approach
- Fixed batch status handling to only fail on actual error states (errored/expired/canceled)
  instead of failing on any unknown transitional state
- Rebased onto main to resolve Claude Haiku 4.5 model conflict
@vgaggia vgaggia force-pushed the feat/anthropic-batch-api branch from 44e8d23 to 2e6e434 Compare October 16, 2025 09:06

Labels

  • enhancement: New feature or request
  • Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Projects

Status: Triage

Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Add Anthropic Batch API support for 50% cost savings

2 participants