fix(anthropic): normalize prompt_tokens to include cache tokens across all providers#1567
BillionClaw wants to merge 1 commit into Portkey-AI:main
prompt_tokens normalization was inconsistent across Anthropic providers. For the same model with caching enabled, Anthropic direct and Vertex AI set prompt_tokens = input_tokens (excluding cache tokens), while Bedrock included them. OpenAI convention is that prompt_tokens includes cached tokens with breakdown in prompt_tokens_details.cached_tokens.
Updated prompt_tokens in both non-streaming and streaming response transforms for Anthropic direct and Vertex AI to include cache_creation_input_tokens and cache_read_input_tokens, matching Bedrock. Also fixed the streaming totalTokens calculation to avoid double-counting now that prompt_tokens includes cache tokens.
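A minimal sketch of the normalization described above. The interface and function names here are illustrative, not the gateway's actual transform code; the field names follow Anthropic's `usage` object and OpenAI's usage schema as described in this PR:

```typescript
// Shape of Anthropic's usage object (cache fields are optional when caching is off).
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Shape of OpenAI's usage object with the cached-token breakdown.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens: number };
}

// Hypothetical helper: convert Anthropic usage to the OpenAI convention,
// where prompt_tokens includes cache creation and cache read tokens.
function toOpenAIUsage(usage: AnthropicUsage): OpenAIUsage {
  const cacheCreation = usage.cache_creation_input_tokens ?? 0;
  const cacheRead = usage.cache_read_input_tokens ?? 0;
  const promptTokens = usage.input_tokens + cacheCreation + cacheRead;
  return {
    prompt_tokens: promptTokens,
    completion_tokens: usage.output_tokens,
    // prompt_tokens already counts the cache tokens, so total_tokens
    // must not add them a second time (the double-counting fix).
    total_tokens: promptTokens + usage.output_tokens,
    prompt_tokens_details: { cached_tokens: cacheRead },
  };
}
```

With this shape, a response with `input_tokens: 10`, `cache_creation_input_tokens: 3`, and `cache_read_input_tokens: 7` reports `prompt_tokens: 20` on all three providers instead of 20 on Bedrock but 10 on Anthropic direct and Vertex AI.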
Fixes #1564