prompt tokens fix #5619

Open

chitalian wants to merge 1 commit into main from justin/anthropic-fix-price

Conversation

@chitalian
Contributor

Ticket

Link to the ticket(s) this pull request addresses.

Component/Service

What part of Helicone does this affect?

  • Web (Frontend)
  • Jawn (Backend)
  • Worker (Proxy)
  • Bifrost (Marketing)
  • AI Gateway
  • Packages
  • Infrastructure/Docker
  • Documentation

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Performance improvement
  • Refactoring

Deployment Notes

  • No special deployment steps required
  • Database migrations need to run
  • Environment variable changes required
  • Coordination with other teams needed

Screenshots / Demos

Before After

Extra Notes

Any additional context, considerations, or notes for reviewers.

Context

Why are you making this change?

@vercel

vercel bot commented Mar 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| helicone | Error | | Mar 3, 2026 1:47am |
| helicone-bifrost | Ready | Preview, Comment | Mar 3, 2026 1:47am |
| helicone-eu | Error | | Mar 3, 2026 1:47am |


@greptile-apps greptile-apps bot left a comment


Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@supabase

supabase bot commented Mar 3, 2026

This pull request has been ignored for the connected project bolqqmqbrciybnypvklh because there are no changes detected in the supabase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@supabase

supabase bot commented Mar 3, 2026

This pull request has been ignored for the connected project lmahfbbnchpworytrrqk because there are no changes detected in .supabase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.



@claude
Contributor

claude bot commented Mar 3, 2026

Claude finished @chitalian's task (View job)


PR Review Complete ✅

Todo List:

  • Read and analyze the changed files
  • Review test file for coverage and correctness
  • Analyze transformer logic for bugs and issues
  • Check for performance and security concerns
  • Provide detailed feedback with code examples
  • Assign confidence score and summary

Score: 6/10 - Has a critical logic bug in token calculation that needs fixing

Critical Issues

1. Critical Bug: Inconsistent Total Token Calculation 🚨

There's a discrepancy between the test expectations and the implementation logic that reveals a potential bug:

In the streaming transformer (toOpenai.ts:324-328):

total_tokens:
  finalInputTokens +
  finalOutputTokens +
  (cachedTokens ?? 0) +
  (cacheWriteTokens ?? 0),

But the test expects (anthropic-stream-cache-tokens.test.ts:41):

expect(usageChunk.usage.total_tokens).toBe(4 + 51713 + 813);

Problem: The implementation adds cache tokens to total_tokens but keeps prompt_tokens as just input_tokens (4). However, for accurate token accounting, prompt_tokens should typically include cache-related input tokens, and total_tokens should be prompt_tokens + completion_tokens.

2. Logic Inconsistency Between Streaming and Non-Streaming

Non-streaming version (response/toOpenai.ts:61):

total_tokens: anthropicUsage.input_tokens + anthropicUsage.output_tokens + cachedTokens + cacheWriteTokens,

Both implementations:

  • Add cache tokens to total_tokens
  • Keep prompt_tokens as just input_tokens
  • This creates inconsistency where total_tokens ≠ prompt_tokens + completion_tokens
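
The broken invariant is easy to see with a standalone sketch. The `AnthropicUsage` shape and `toUsage` helper below are illustrative stand-ins, not the actual transformer code; the point is that folding cache tokens into prompt_tokens, as recommended above, restores total_tokens = prompt_tokens + completion_tokens:

```typescript
// Illustrative model of the recommended accounting, not the real transformer.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Folds cache tokens into prompt_tokens so the standard OpenAI
// invariant total_tokens === prompt_tokens + completion_tokens holds.
function toUsage(u: AnthropicUsage) {
  const cached = u.cache_read_input_tokens ?? 0;
  const cacheWrite = u.cache_creation_input_tokens ?? 0;
  const prompt_tokens = u.input_tokens + cached + cacheWrite;
  const completion_tokens = u.output_tokens;
  return {
    prompt_tokens,
    completion_tokens,
    total_tokens: prompt_tokens + completion_tokens,
  };
}

// Values from the test fixture: input 4, cache read 51713, output 813.
const usage = toUsage({
  input_tokens: 4,
  output_tokens: 813,
  cache_read_input_tokens: 51713,
});
console.log(usage.prompt_tokens); // 51717
console.log(usage.total_tokens);  // 52530
console.log(usage.total_tokens === usage.prompt_tokens + usage.completion_tokens); // true
```

With the current implementation, by contrast, prompt_tokens would stay at 4 while total_tokens is 52530, and the sum check above would print false.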

Specific Recommendations

Fix 1: Correct Token Accounting Logic

Current logic issue in toOpenai.ts:322-328:

// ❌ Current - inconsistent accounting
this.finalUsage = {
  prompt_tokens: finalInputTokens, // Only input_tokens (4)
  completion_tokens: finalOutputTokens,
  total_tokens: finalInputTokens + finalOutputTokens + (cachedTokens ?? 0) + (cacheWriteTokens ?? 0), // Includes cache
};

Recommended fix:

// ✅ Fixed - consistent accounting
this.finalUsage = {
  prompt_tokens: finalInputTokens + (cachedTokens ?? 0) + (cacheWriteTokens ?? 0), // Include cache in prompt
  completion_tokens: finalOutputTokens,
  total_tokens: finalInputTokens + finalOutputTokens + (cachedTokens ?? 0) + (cacheWriteTokens ?? 0),
};

Apply the same fix to non-streaming version (response/toOpenai.ts:59-61):

// ✅ Fixed
usage: {
  prompt_tokens: anthropicUsage.input_tokens + cachedTokens + cacheWriteTokens,
  completion_tokens: anthropicUsage.output_tokens,
  total_tokens: anthropicUsage.input_tokens + anthropicUsage.output_tokens + cachedTokens + cacheWriteTokens,
},

Fix 2: Update Test Expectations

In anthropic-stream-cache-tokens.test.ts:38-41:

// ✅ Update to match corrected logic
expect(usageChunk.usage.prompt_tokens).toBe(4 + 51713); // Include cache in prompt_tokens
expect(usageChunk.usage.total_tokens).toBe(4 + 51713 + 813);

Minor Issues

1. Inconsistent Cache Detail Mapping

In streaming (toOpenai.ts:336-343):

cache_write_details: {
  write_5m_tokens:
    this.cacheCreationDetails?.ephemeral_5m_input_tokens ??
    cacheWriteTokens ?? // ❌ Fallback to wrong value
    0,
},

Should be:

write_5m_tokens:
  this.cacheCreationDetails?.ephemeral_5m_input_tokens ?? 0, // ✅ Only use specific value
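
The effect of the fallback can be demonstrated in isolation (the `write5mTokens` helper below is hypothetical, not the transformer's actual code): when the per-TTL breakdown is absent, the ?? chain silently attributes the entire cache-write total to the 5-minute bucket.

```typescript
// Illustrative reproduction of the fallback chain, not the real transformer.
function write5mTokens(
  details: { ephemeral_5m_input_tokens?: number } | undefined,
  cacheWriteTokens: number | undefined
): { buggy: number; fixed: number } {
  // ❌ Buggy: falls back to the TOTAL cache-write count when the
  // per-TTL breakdown is missing, misattributing it to the 5m bucket.
  const buggy = details?.ephemeral_5m_input_tokens ?? cacheWriteTokens ?? 0;
  // ✅ Fixed: only report the TTL-specific value when it is known.
  const fixed = details?.ephemeral_5m_input_tokens ?? 0;
  return { buggy, fixed };
}

console.log(write5mTokens(undefined, 1000)); // { buggy: 1000, fixed: 0 }
console.log(write5mTokens({ ephemeral_5m_input_tokens: 400 }, 1000)); // { buggy: 400, fixed: 400 }
```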

2. Test Coverage Gap

The tests don't verify the edge case where both cache_creation_input_tokens and cache_read_input_tokens are present simultaneously.

Recommended additional test:

it("should handle both cache creation and read tokens", () => {
  // Test with both cache_creation_input_tokens: 1000 AND cache_read_input_tokens: 2000
  // to ensure proper handling of mixed cache scenarios
});
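
A self-contained sketch of that mixed-cache expectation (the mapping is inlined here as a hypothetical `mapUsage` function; in the real test it would go through the streaming transformer, and all token values are illustrative):

```typescript
// Hypothetical inline mapping for the mixed-cache scenario.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

function mapUsage(u: AnthropicUsage) {
  const cached = u.cache_read_input_tokens ?? 0;
  const cacheWrite = u.cache_creation_input_tokens ?? 0;
  return {
    prompt_tokens: u.input_tokens + cached + cacheWrite,
    completion_tokens: u.output_tokens,
    total_tokens: u.input_tokens + cached + cacheWrite + u.output_tokens,
  };
}

// Both cache creation AND cache read present at once.
const usage = mapUsage({
  input_tokens: 10,
  output_tokens: 500,
  cache_creation_input_tokens: 1000,
  cache_read_input_tokens: 2000,
});
console.log(usage.prompt_tokens); // 3010
console.log(usage.total_tokens);  // 3510
```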

Code Quality Observations

Strengths

  • Comprehensive test coverage for different streaming scenarios
  • Good handling of edge cases (no usage data, tool calls, thinking blocks)
  • Clear separation of concerns between streaming and non-streaming logic

Areas for Improvement

  • Token accounting logic needs standardization across both implementations
  • Error handling could be more robust (silent fallbacks may hide issues)

Summary

This PR addresses an important token calculation issue for Anthropic's caching features, but has a critical bug where the token accounting is mathematically inconsistent. The total_tokens includes cache tokens while prompt_tokens doesn't, breaking the standard relationship of total_tokens = prompt_tokens + completion_tokens.

Must fix before merging:

  1. Update token calculation logic in both streaming and non-streaming transformers
  2. Update test expectations to match corrected logic
  3. Ensure consistent behavior across both implementations

