feat: add Anthropic Batch API support for 50% cost savings #8672
Conversation
```typescript
// Add 1M context beta flag if enabled for Claude Sonnet 4 and 4.5
if (
	(modelId === "claude-sonnet-4-20250514" || modelId === "claude-sonnet-4-5") &&
	this.options.anthropicBeta1MContext
) {
	betas.push("context-1m-2025-08-07")
}
```
The prompt caching beta header is missing when creating batch requests with prompt caching support. When `supportsPromptCaching(modelId)` returns true, the `"prompt-caching-2024-07-31"` beta should be added to the betas array (similar to line 128 in the streaming path). Without this header, prompt caching won't work correctly in batch mode even though cache breakpoints are being added to the messages.
Suggested change:

```diff
  // Add 1M context beta flag if enabled for Claude Sonnet 4 and 4.5
  if (
  	(modelId === "claude-sonnet-4-20250514" || modelId === "claude-sonnet-4-5") &&
  	this.options.anthropicBeta1MContext
  ) {
  	betas.push("context-1m-2025-08-07")
  }
+ // Add prompt caching beta if model supports it
+ if (this.supportsPromptCaching(modelId)) {
+ 	betas.push("prompt-caching-2024-07-31")
+ }
```
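For context on why the betas array matters in batch mode: the flags have to be forwarded as an `anthropic-beta` header on the batch create call. A hedged sketch of that forwarding, with the request-options shape assumed from the `batchOptions` argument visible later in this review:

```typescript
// Sketch only: forward accumulated beta flags when creating the batch.
// The second argument and its `headers` field are assumptions based on
// the `batchOptions` parameter seen elsewhere in this diff; the
// comma-separated `anthropic-beta` header is the documented format.
const batch = await this.client.messages.batches.create(
	{
		requests: [/* ...one entry per message request... */],
	},
	betas.length > 0 ? { headers: { "anthropic-beta": betas.join(",") } } : undefined,
)
```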
src/api/providers/anthropic.ts (Outdated)
```typescript
			},
		],
	},
	batchOptions as any,
```
The `as any` type assertion bypasses TypeScript's type checking and could hide type mismatches. The Anthropic SDK's types should be used directly without casting. If the types don't match, either the SDK types need updating or the code needs adjustment to match the actual API contract.
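One way to drop the cast without fighting the SDK's types is to describe only the fields actually sent; a hedged sketch (the local type is an assumption, not the SDK's own export):

```typescript
// Describe the shape we actually send instead of erasing it with `as any`.
// If the SDK's request-options type disagrees at the call site, the
// compiler will say exactly where.
type BatchRequestOptions = {
	headers?: Record<string, string>
}

const batchOptions: BatchRequestOptions = {
	headers: { "anthropic-beta": betas.join(",") },
}
```

Alternatively, `satisfies` against the SDK's exported params type would keep the inferred type while still surfacing mismatches.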
```typescript
while (Date.now() - startTime < BATCH_MAX_POLL_TIME_MS) {
	const status = await this.client.messages.batches.retrieve(batch.id)

	if (status.processing_status === "ended") {
		completedBatch = status
		break
	}

	// Wait before next poll
	await new Promise((resolve) => setTimeout(resolve, BATCH_POLL_INTERVAL_MS))
}
```
The batch status handling is incomplete. According to Anthropic's Batch API documentation, `processing_status` can be `"in_progress"`, `"canceling"`, `"ended"`, `"expired"`, or `"canceled"`. Currently, only `"ended"` is checked, which means if a batch expires or is canceled, the code will wait until the timeout and throw a generic timeout error instead of providing a specific error message about the actual failure state.
Suggested change:

```diff
  while (Date.now() - startTime < BATCH_MAX_POLL_TIME_MS) {
  	const status = await this.client.messages.batches.retrieve(batch.id)

  	if (status.processing_status === "ended") {
  		completedBatch = status
  		break
  	}

+ 	// Handle failure states
+ 	if (status.processing_status === "expired") {
+ 		throw new Error("Batch request expired before completing")
+ 	}
+ 	if (status.processing_status === "canceled") {
+ 		throw new Error("Batch request was canceled")
+ 	}
+
  	// Wait before next poll
  	await new Promise((resolve) => setTimeout(resolve, BATCH_POLL_INTERVAL_MS))
  }
```
```typescript
/**
 * Creates a message using the Batch API for 50% cost savings.
 * This method handles the async batch job lifecycle: create, poll, and retrieve results.
 */
private async *createBatchMessage(
```
The new Batch API functionality lacks test coverage. Given the complexity of the batch processing lifecycle (polling, timeout handling, result retrieval) and the potential for errors at each stage, this code should have comprehensive tests. Consider adding tests that cover:
- Successful batch processing with prompt caching enabled
- Successful batch processing without prompt caching
- Batch timeout scenarios
- Batch expiration/cancellation scenarios
- Error handling in batch results
- Verification that the 50% cost discount is applied correctly
- Verification that beta headers are included when needed
The existing `anthropic.spec.ts` provides a good pattern to follow for mocking the SDK's batch API methods; a sketch of one such test follows.
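A minimal sketch of the expiration scenario, assuming vitest, the module-mock style of `anthropic.spec.ts`, and that the expired-state handling suggested above is adopted; the handler import path, constructor options, and message shape are illustrative:

```typescript
import { describe, it, expect, vi } from "vitest"
import { AnthropicHandler } from "../anthropic" // path assumed

// vi.hoisted makes the mocks visible inside the hoisted vi.mock factory.
const { mockCreate, mockRetrieve } = vi.hoisted(() => ({
	mockCreate: vi.fn(),
	mockRetrieve: vi.fn(),
}))

vi.mock("@anthropic-ai/sdk", () => ({
	default: vi.fn().mockImplementation(() => ({
		messages: {
			batches: { create: mockCreate, retrieve: mockRetrieve },
		},
	})),
}))

describe("createBatchMessage", () => {
	it("surfaces an expiration error instead of polling until timeout", async () => {
		mockCreate.mockResolvedValue({ id: "batch_123" })
		mockRetrieve.mockResolvedValue({ processing_status: "expired" })

		const handler = new AnthropicHandler({ apiKey: "test-key", anthropicUseBatchApi: true })
		const stream = handler.createMessage("system prompt", [{ role: "user", content: "hi" }])

		await expect(async () => {
			for await (const _chunk of stream) {
				// Drain the generator; the rejection surfaces here.
			}
		}).rejects.toThrow(/expired/)
	})
})
```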
```typescript
} else if (result.result.type === "errored") {
	const errorType = result.result.error.type
	throw new Error(`Batch request failed: ${errorType}`)
}
```
The error handling only includes the error type but omits the error message, making debugging difficult. Anthropic's batch API error responses include both `type` and `message` fields. Including the full error context would help users understand what went wrong with their batch request.
Suggested change:

```diff
  } else if (result.result.type === "errored") {
- 	const errorType = result.result.error.type
- 	throw new Error(`Batch request failed: ${errorType}`)
+ 	const error = result.result.error
+ 	throw new Error(`Batch request failed: ${error.type}${error.message ? ` - ${error.message}` : ""}`)
  }
```
This makes sense, I think I will add this.
```typescript
// Process results
for await (const result of results) {
	if (result.result.type === "succeeded") {
		const message = result.result.message

		// Yield content blocks
		for (const content of message.content) {
			if (content.type === "text") {
				yield { type: "text", text: content.text }
			} else if (content.type === "thinking") {
				yield { type: "reasoning", text: content.thinking }
			}
		}

		// Yield usage information
		const usage = message.usage
		yield {
			type: "usage",
			inputTokens: usage.input_tokens || 0,
			outputTokens: usage.output_tokens || 0,
			cacheWriteTokens: usage.cache_creation_input_tokens || undefined,
			cacheReadTokens: usage.cache_read_input_tokens || undefined,
		}

		// Calculate and yield cost
		yield {
			type: "usage",
			inputTokens: 0,
			outputTokens: 0,
			totalCost: calculateApiCostAnthropic(
				this.getModel().info,
				usage.input_tokens || 0,
				usage.output_tokens || 0,
				usage.cache_creation_input_tokens || 0,
				usage.cache_read_input_tokens || 0,
			),
		}
	} else if (result.result.type === "errored") {
		const errorType = result.result.error.type
		throw new Error(`Batch request failed: ${errorType}`)
	}
}
```
The batch results processing doesn't validate that at least one successful result was returned. If the batch completes but all results are errors (or the results iterator is empty), the user would see the "Using Batch API" notification but receive no response text or usage information. This could happen if the batch request was malformed or if there were API-level issues. Consider tracking whether any successful result was processed and throwing an appropriate error if not.
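A hedged sketch of that tracking, layered onto the result loop above (the flag name is illustrative):

```typescript
// Track whether any result succeeded; fail loudly if the batch produced none,
// rather than silently yielding nothing after the "Using Batch API" notice.
let sawSuccess = false

for await (const result of results) {
	if (result.result.type === "succeeded") {
		sawSuccess = true
		// ...yield content, usage, and cost as in the loop above...
	} else if (result.result.type === "errored") {
		const errorType = result.result.error.type
		throw new Error(`Batch request failed: ${errorType}`)
	}
}

if (!sawSuccess) {
	throw new Error("Batch ended but returned no successful results")
}
```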
I don't think we need to do this.
```typescript
{
	requests: [
		{
			custom_id: `req_${Date.now()}`,
```
The custom_id uses only `Date.now()`, which could theoretically cause collisions if multiple batch requests are initiated in the same millisecond (e.g., in high-concurrency scenarios or automated testing). While unlikely in typical usage, a more robust approach would include additional entropy to guarantee uniqueness.
Suggested change:

```diff
- custom_id: `req_${Date.now()}`,
+ custom_id: `req_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`,
```
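If the runtime guarantees Node 16.7+ or a modern browser, `crypto.randomUUID()` would be a simpler entropy source; this is an alternative sketch, not what the suggestion above uses:

```typescript
// Assumes globalThis.crypto.randomUUID is available (Node 16.7+ / modern browsers).
custom_id: `req_${crypto.randomUUID()}`,
```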
```typescript
// Apply 50% discount for Batch API (applies after 1M context pricing if both enabled)
if (this.options.anthropicUseBatchApi) {
	info = {
		...info,
		inputPrice: typeof info.inputPrice === "number" ? info.inputPrice * 0.5 : undefined,
		outputPrice: typeof info.outputPrice === "number" ? info.outputPrice * 0.5 : undefined,
		cacheWritesPrice: typeof info.cacheWritesPrice === "number" ? info.cacheWritesPrice * 0.5 : undefined,
		cacheReadsPrice: typeof info.cacheReadsPrice === "number" ? info.cacheReadsPrice * 0.5 : undefined,
	}
}
```
The 50% batch API discount is applied in both `getModel()` (lines 273-281) and in `useSelectedModel.ts` (lines 398-409). While these serve different purposes (backend cost calculation vs UI pricing display), this duplication could lead to maintenance issues if the discount logic needs to change. Consider extracting the discount calculation into a shared utility function to ensure consistency and reduce duplication.
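The follow-up commits mention extracting exactly such a helper, `applyBatchApiDiscount()` in `src/shared/cost.ts`. A minimal sketch of what it could look like, assuming the pricing fields seen in the snippet above (the interface name is illustrative):

```typescript
// Sketch only: the real implementation lives in src/shared/cost.ts per the
// follow-up commit; field names mirror the getModel() snippet above.
interface BatchDiscountablePricing {
	inputPrice?: number
	outputPrice?: number
	cacheWritesPrice?: number
	cacheReadsPrice?: number
}

export function applyBatchApiDiscount<T extends BatchDiscountablePricing>(info: T): T {
	const halve = (price?: number) => (typeof price === "number" ? price * 0.5 : undefined)
	return {
		...info,
		inputPrice: halve(info.inputPrice),
		outputPrice: halve(info.outputPrice),
		cacheWritesPrice: halve(info.cacheWritesPrice),
		cacheReadsPrice: halve(info.cacheReadsPrice),
	}
}
```

Both `anthropic.ts` and `useSelectedModel.ts` could then call this one function, keeping backend cost calculation and UI pricing display in lockstep.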
```typescript
// Batch API polling configuration
const BATCH_POLL_INTERVAL_MS = 5000 // Poll every 5 seconds
const BATCH_MAX_POLL_TIME_MS = 600000 // Max 10 minutes polling
```
The timeout constant is set to 10 minutes (600,000 ms), but the PR description states "max 5 minutes timeout". This discrepancy between code and documentation could confuse users about the actual timeout behavior. Either update the constant to match the documented 5 minutes (`300000`) or update the PR description to reflect the 10-minute timeout.
Changed this in the PR comment, although I don't think it's that big of an issue really.
Addresses roomote bot feedback on PR RooCodeInc#8672:
- Enhanced error handling to include full error details via JSON.stringify() for better debugging when batch requests fail
- Extracted batch API discount calculation into shared applyBatchApiDiscount() utility function in src/shared/cost.ts to eliminate code duplication between backend (anthropic.ts) and frontend (useSelectedModel.ts)
- Added documentation comment explaining custom_id generation approach and future considerations for multiple requests per batch
Adds toggle to enable Anthropic's Batch API for async message processing with 50% cost reduction. Includes:
- Backend implementation with proper prompt caching support
- Beta header support for 1M context compatibility
- UI pricing display updates to show 50% discount
- Settings UI toggle with translations
…ype cast, handle batch failure states
Add translations for anthropicBatchApiLabel and anthropicBatchApiDescription across all 17 supported languages (ca, de, es, fr, hi, id, it, ja, ko, nl, pl, pt-BR, ru, tr, vi, zh-CN, zh-TW)
- Enhanced error messages with JSON.stringify() for full error context
- Extracted applyBatchApiDiscount() utility to shared cost.ts
- Added documentation for custom_id approach
- Fixed batch status handling to only fail on actual error states (errored/expired/canceled) instead of failing on any unknown transitional state
- Rebased onto main to resolve Claude Haiku 4.5 model conflict
Force-pushed from 44e8d23 to 2e6e434.
Description
Adds support for Anthropic's Batch API with a new settings toggle, enabling 50% cost savings on API requests through asynchronous batch processing.
Changes
Implementation Details
Testing
Documentation
Official Anthropic Batch API docs: https://docs.anthropic.com/en/api/creating-message-batches
Closes #8667
Important

Adds support for Anthropic's Batch API with a new setting for 50% cost savings through async batch processing.

- Adds `anthropicUseBatchApi` setting in `provider-settings.ts` for 50% cost savings via async batch processing.
- Implements batch processing in `AnthropicHandler` in `anthropic.ts` using the `client.messages.batches` API.
- Adds `createBatchMessage()` in `anthropic.ts` to handle the batch job lifecycle.
- Updates `getModel()` and `createMessage()` to apply the 50% discount.
- Adds a toggle for `anthropicUseBatchApi` in `Anthropic.tsx`.
- Updates `settings.json` for the new batch API settings.