
fix: normalize prompt_tokens to include cache tokens across Anthropic providers#1568

Open
guoyangzhen wants to merge 1 commit into Portkey-AI:main from guoyangzhen:fix/anthropic-prompt-tokens-normalization

Conversation

@guoyangzhen

Problem

When Portkey normalizes Anthropic model responses to OpenAI schema, prompt_tokens has inconsistent semantics depending on which provider is used:

| Provider | `prompt_tokens` includes cache tokens? |
| --- | --- |
| anthropic (direct) | ❌ No |
| vertex-ai (Anthropic) | ❌ No |
| bedrock (Anthropic) | ✅ Yes |

The OpenAI convention (which Portkey normalizes to) is that prompt_tokens includes cached tokens, with the breakdown available in prompt_tokens_details.cached_tokens.

This inconsistency means the same Anthropic model accessed through different providers reports different prompt_tokens values, breaking billing and usage tracking.
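As a concrete illustration of the discrepancy (the numbers are hypothetical, but the field names match the Anthropic Messages API usage object):

```typescript
// Hypothetical Anthropic usage payload for one request.
const usage = {
  input_tokens: 20,
  cache_creation_input_tokens: 10,
  cache_read_input_tokens: 100,
};

// Before this fix, the direct Anthropic and Vertex AI paths reported
// only the non-cached portion of the prompt:
const promptTokensDirect = usage.input_tokens; // 20

// Bedrock (and the OpenAI convention) count the full prompt:
const promptTokensBedrock =
  usage.input_tokens +
  usage.cache_creation_input_tokens +
  usage.cache_read_input_tokens; // 130
```

The same request thus bills as 20 prompt tokens on one path and 130 on another.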

Fixes #1564

Solution

Normalize prompt_tokens to always include cache_read_input_tokens + cache_creation_input_tokens, matching Bedrock's (correct) behavior and the OpenAI convention.

Changes

  1. Anthropic direct (non-stream) — src/providers/anthropic/chatComplete.ts:613
    • prompt_tokens now includes cache creation and cache read tokens
  2. Anthropic direct (streaming) — same file, message_start handler
    • Same fix applied to the streaming chunk
  3. Vertex AI Anthropic (non-stream) — src/providers/google-vertex-ai/chatComplete.ts:899
    • Same fix
  4. Vertex AI Anthropic (streaming) — same file
    • Same fix

Before

```ts
prompt_tokens: input_tokens,  // excludes cache tokens
```

After

```ts
prompt_tokens: input_tokens + cache_creation_input_tokens + cache_read_input_tokens,
```

Testing

Verified the change matches Bedrock's existing behavior at src/providers/bedrock/chatComplete.ts:551-554:

```ts
prompt_tokens:
  response.usage.inputTokens +
  cacheReadInputTokens +
  cacheWriteInputTokens,
```
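A minimal cross-provider consistency check (usage shapes simplified to the fields involved; note Bedrock's camelCase field names versus Anthropic's snake_case):

```typescript
// Simplified usage payloads for the same hypothetical request.
const anthropicUsage = {
  input_tokens: 20,
  cache_read_input_tokens: 100,
  cache_creation_input_tokens: 10,
};
const bedrockUsage = {
  inputTokens: 20,
  cacheReadInputTokens: 100,
  cacheWriteInputTokens: 10,
};

// With the fix, both paths sum base input plus both cache buckets.
const fromAnthropic =
  anthropicUsage.input_tokens +
  anthropicUsage.cache_read_input_tokens +
  anthropicUsage.cache_creation_input_tokens;

const fromBedrock =
  bedrockUsage.inputTokens +
  bedrockUsage.cacheReadInputTokens +
  bedrockUsage.cacheWriteInputTokens;
```

After normalization, `fromAnthropic` and `fromBedrock` agree, so billing no longer depends on which provider route was used.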

… providers

The OpenAI convention (which Portkey normalizes to) is that prompt_tokens
includes cached tokens, with the breakdown available in
prompt_tokens_details.cached_tokens.

Previously, Bedrock correctly included cache tokens in prompt_tokens,
but the Anthropic direct API and Vertex AI Anthropic paths did not.
This caused inconsistent billing and usage tracking when the same
Anthropic model was accessed through different providers.

Changes:
- Anthropic direct (non-stream): prompt_tokens now includes cache tokens
- Anthropic direct (streaming): same fix in message_start handler
- Vertex AI Anthropic (non-stream): same fix
- Vertex AI Anthropic (streaming): same fix

Fixes Portkey-AI#1564


Development

Successfully merging this pull request may close these issues.

bug: Inconsistent prompt_tokens normalization across Anthropic providers (direct API vs Vertex AI vs Bedrock)
