
fix: fix anthropic response conversion token mappings#5652

Merged

chitalian merged 4 commits into main from hammad/fix/anthropic-token-count on Apr 4, 2026
Conversation

@H2Shami (Collaborator) commented Apr 3, 2026

Ticket

Link to the ticket(s) this pull request addresses.

Component/Service

What part of Helicone does this affect?

  • Web (Frontend)
  • Jawn (Backend)
  • Worker (Proxy)
  • Bifrost (Marketing)
  • AI Gateway
  • Packages
  • Infrastructure/Docker
  • Documentation

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Performance improvement
  • Refactoring

Deployment Notes

  • No special deployment steps required
  • Database migrations need to run
  • Environment variable changes required
  • Coordination with other teams needed

Screenshots / Demos

Before After

Extra Notes

Any additional context, considerations, or notes for reviewers.

Context

Why are you making this change?


Copilot AI review requested due to automatic review settings April 3, 2026 23:52
@mintlify

mintlify bot commented Apr 3, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
helicone 🟢 Ready View Preview Apr 3, 2026, 11:53 PM

@vercel

vercel bot commented Apr 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
helicone Error Error Apr 4, 2026 0:20am
helicone-eu Error Error Apr 4, 2026 0:20am
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
helicone-bifrost Skipped Skipped Apr 4, 2026 0:20am

Contributor

greptile-apps bot left a comment


Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@supabase

supabase bot commented Apr 3, 2026

This pull request has been ignored for the connected project bolqqmqbrciybnypvklh because there are no changes detected in supbase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@supabase

supabase bot commented Apr 3, 2026

This pull request has been ignored for the connected project lmahfbbnchpworytrrqk because there are no changes detected in .supabase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@claude
Contributor

claude bot commented Apr 3, 2026

Claude finished @H2Shami's task


PR Review: Anthropic Token Mapping Fix

Reviewing the token counting fix for Anthropic responses. Here's my analysis plan:

  • Review core token mapping logic changes
  • Assess debug logging additions for security/performance concerns
  • Evaluate type changes in generated files
  • Check test coverage and updates
  • Provide actionable recommendations with confidence score

Core Token Mapping Analysis ✅

Looking at the main fix in packages/llm-mapper/transform/providers/anthropic/streamedResponse/toOpenai.ts:316:

The Fix: Changed from inputTokens to inputTokens + (cachedTokens ?? 0) for prompt_tokens calculation. This ensures that when converting Anthropic responses to OpenAI format, cached tokens are included in the prompt_tokens count to match OpenAI conventions.

Logic: The change addresses the fundamental difference between Anthropic and OpenAI token counting:

  • Anthropic: input_tokens excludes cached tokens, cache tokens reported separately
  • OpenAI: prompt_tokens includes all tokens processed for the prompt

Guard Rails Added: Multiple protective checks in usage processors to prevent negative token counts when cachedTokens > promptTokens, indicating data is already in non-cached format.

Test Coverage: Tests verify both cache read and cache write scenarios work correctly, including edge cases with no caching.
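
The convention difference described above can be sketched as a small converter (illustrative names only; the repo's actual mapper in toOpenai.ts differs in detail):

```typescript
// Sketch of the Anthropic -> OpenAI usage mapping. Interface and function
// names here are illustrative, not the repo's actual API.
interface AnthropicUsage {
  input_tokens: number; // excludes cached tokens (Anthropic convention)
  output_tokens: number;
  cache_read_input_tokens?: number; // cache reads reported separately
  cache_creation_input_tokens?: number;
}

interface OpenAIUsage {
  prompt_tokens: number; // includes all tokens processed for the prompt
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens: number };
}

function toOpenAIUsage(u: AnthropicUsage): OpenAIUsage {
  const cached = u.cache_read_input_tokens ?? 0;
  // The fix: fold cache-read tokens back into prompt_tokens.
  const promptTokens = u.input_tokens + cached;
  return {
    prompt_tokens: promptTokens,
    completion_tokens: u.output_tokens,
    total_tokens: promptTokens + u.output_tokens,
    prompt_tokens_details: { cached_tokens: cached },
  };
}
```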

Critical Security/Performance Issues ⚠️

Problem: Raw request/response body logging added to production code

  • valhalla/jawn/src/lib/handlers/ResponseBodyHandler.ts:297-298 logs full request/response bodies
  • valhalla/jawn/src/lib/handlers/ResponseBodyHandler.ts:77 logs full request log objects

Security Risk: These logs can expose:

  • User prompts and AI responses (potentially containing PII)
  • API keys and authentication tokens
  • Large request/response payloads causing performance degradation

Recommendation: Remove debug logging or gate behind disabled-by-default debug flag with redaction.
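
One way the recommendation could look, sketched with hypothetical names (DEBUG_BODY_LOGGING and redactAndTruncate are not existing Helicone code):

```typescript
// Debug logging disabled by default; enabled only via an env flag, with
// redaction and truncation so full bodies and secrets never reach logs.
const DEBUG_BODY_LOGGING = process.env.DEBUG_BODY_LOGGING === "true";

function redactAndTruncate(body: string, maxLen = 256): string {
  return body
    .replace(/"api[_-]?key"\s*:\s*"[^"]*"/gi, '"api_key":"[REDACTED]"')
    .replace(/Bearer\s+[A-Za-z0-9._~+/=-]+/g, "Bearer [REDACTED]")
    .slice(0, maxLen); // cap log volume regardless of payload size
}

function debugLogBody(tag: string, body: string): void {
  if (!DEBUG_BODY_LOGGING) return; // no-op in production by default
  console.log(`[${tag}]`, redactAndTruncate(body));
}
```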

Type Definition Issues 🔧

Problem: Generated Json type narrowed from JsonValue to JsonObject

  • Affects multiple files: web/lib/clients/jawnTypes/{public,private}.ts, bifrost/lib/clients/jawnTypes/{public,private}.ts
  • Now excludes valid JSON primitives (strings, numbers, arrays, null)

Impact: Breaking change for any API fields that can contain arbitrary JSON values

Recommendation: Restore Json to recursive JsonValue type and regenerate from OpenAPI spec rather than hand-editing.
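
The suggested recursive shape, sketched for clarity (illustrative; the actual generated file's layout may differ):

```typescript
// A recursive JsonValue admits every valid JSON root, not just objects.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

type JsonObject = { [key: string]: JsonValue };

// All of these type-check under JsonValue; under the narrowed Json: JsonObject,
// only the last one would.
const roots: JsonValue[] = ["text", 42, true, null, [1, 2, 3], { nested: { ok: true } }];
const obj: JsonObject = { meta: roots };
```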

API Surface Changes 📝

Issue: MCP types and swagger.json remove multiple /v1/public/* endpoints

  • Removes waitlist, stats, alert-banner endpoints
  • Appears unrelated to token mapping fix

Recommendation: Verify if endpoint removal is intentional or regenerate from correct OpenAPI source.

Code Quality 👍

Strengths:

  • Well-structured guard rails preventing negative token calculations
  • Comprehensive test coverage for edge cases
  • Clear separation of concerns between different token counting conventions

Minor Issue: Typo in log tag: "RAW_RESONSE_BODY" should be "RAW_RESPONSE_BODY"


Score: 4/10 - Core fix is solid but critical security/logging issues prevent merge

Critical Fixes Needed:

  • Remove raw body logging from production code (security risk)
  • Fix Json type definition regression (breaking change)
  • Verify unintended API endpoint removals

Suggestions Summary:

  • Gate debug logging behind feature flag with redaction
  • Regenerate types from OpenAPI spec instead of hand-editing
  • Fix typo in log message tag
  • Consider adding integration tests for end-to-end token counting scenarios

Contributor

Copilot AI left a comment


Pull request overview

This PR corrects token/usage accounting when converting and processing Anthropic (and OpenAI Responses) usage data, especially around cache read/write tokens, so that downstream cost and usage calculations align with OpenAI-style conventions.

Changes:

  • Adjust effective prompt token calculations to handle cases where cached tokens should not be subtracted (Anthropic-style conventions) across stream/body/usage processors.
  • Extend Responses usage types/mappers to carry cache write token fields (cache_write_tokens, cache_write_details) through conversions.
  • Update generated OpenAPI artifacts (jawnTypes, MCP public types, swagger) and add extensive debug logging in worker/Jawn.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
worker/src/lib/dbLogger/DBLoggable.ts Adds raw/normalized response debug logging around AI Gateway normalization and usage extraction.
valhalla/jawn/src/lib/shared/bodyProcessors/openAIStreamProcessor.ts Updates effective prompt token logic for Responses + chat-completions streaming usage.
valhalla/jawn/src/lib/shared/bodyProcessors/genericBodyProcessor.ts Updates effective prompt token calculation heuristic for generic JSON usage payloads.
valhalla/jawn/src/lib/handlers/ResponseBodyHandler.ts Adds debug logging for request info and raw request/response bodies; minor formatting changes.
packages/llm-mapper/transform/types/responses.ts Extends ResponsesUsage to include cache write token fields/details.
packages/llm-mapper/transform/providers/responses/streamedResponse/toResponses.ts Maps OpenAI-style prompt_tokens_details.cache_write_* into Responses input_tokens_details.
packages/llm-mapper/transform/providers/responses/openai/response/toResponses.ts Same mapping for non-stream OpenAI response -> Responses conversion.
packages/llm-mapper/transform/providers/anthropic/streamedResponse/toOpenai.ts Changes Anthropic stream conversion so prompt_tokens includes cached read tokens (OpenAI convention).
packages/cost/usage/openAIUsageProcessor.ts Adjusts effective prompt token calculation to avoid subtracting cached tokens when cached > prompt_tokens.
packages/cost/usage/mapModelUsageToOpenAI.ts Minor formatting-only change (blank line).
packages/tests/llm-mapper/anthropic-stream-cache-tokens.test.ts Updates expectation so prompt_tokens includes cache read tokens after Anthropic stream conversion.
web/lib/clients/jawnTypes/public.ts Alters generated JSON schema typing (introduces JsonValue/JsonObject, sets Json to JsonObject).
web/lib/clients/jawnTypes/private.ts Same generated JSON schema typing change as public types.
bifrost/lib/clients/jawnTypes/public.ts Same generated JSON schema typing change as web public types.
bifrost/lib/clients/jawnTypes/private.ts Same generated JSON schema typing change as web private types.
helicone-mcp/src/types/public.ts Generated MCP public types remove multiple /v1/public/* endpoints/schemas.
docs/swagger.json Generated swagger removes multiple /v1/public/* endpoints/schemas and restructures parts of the spec.
Comments suppressed due to low confidence (3)

web/lib/clients/jawnTypes/public.ts:1616

  • components.schemas.Json is being narrowed to JsonObject, which excludes valid JSON root values (arrays, strings, numbers, booleans, null). This will make generated client types incorrect for fields that can contain arbitrary JSON (e.g., metadata/settings) and may break consumers. Consider defining Json as a recursive JsonValue (or unknown) and regenerate via the OpenAPI generator rather than hand-editing this file.
      parentPromptVersionId: string;
    };
Json: JsonObject;
    ExperimentV2PromptVersion: {
      created_at: string | null;
      experiment_id: string | null;
      helicone_template: components["schemas"]["Json"] | null;
      id: string;

bifrost/lib/clients/jawnTypes/public.ts:1616

  • components.schemas.Json is being changed to JsonObject, which excludes valid JSON values like arrays and primitives. This makes the generated client types incorrect for arbitrary JSON fields and can break downstream usage. Please restore Json to a recursive JsonValue (or unknown) and regenerate rather than editing the generated output.
      parentPromptVersionId: string;
    };
Json: JsonObject;
    ExperimentV2PromptVersion: {
      created_at: string | null;
      experiment_id: string | null;
      helicone_template: components["schemas"]["Json"] | null;
      id: string;

helicone-mcp/src/types/public.ts:395

  • This generated MCP OpenAPI types file removes multiple /v1/public/* endpoints (waitlist, stats, alert-banner) compared to the previous version, which is a potentially breaking API surface change and appears unrelated to the PR's stated focus on Anthropic token mappings. Please confirm this removal is intentional and that downstream consumers have been updated; otherwise regenerate from the correct/updated swagger source.
    delete: operations["DeleteWebhook"];
  };
  "/v1/webhooks/{webhookId}/test": {
    post: operations["TestWebhook"];
  };
  "/v1/vault/add": {
    post: operations["AddKey"];
  };
  "/v1/vault/keys": {


Comment on lines +807 to +832
      // DEBUG: Log raw response before any processing
      try {
        const rawText = rawResponseBody.join("");
        // For streaming, just grab the last few lines (where usage lives)
        const lines = rawText.split("\n").filter((l: string) => l.trim());
        const lastLines = lines.slice(-5);
        console.log("[RAW_RESPONSE] isStream:", this.request.isStream, "provider:", this.request.provider);
        console.log("[RAW_RESPONSE] last lines:", lastLines.map((l: string) => l.substring(0, 500)).join("\n"));
        // Try to extract usage from the raw response
        try {
          const parsed = JSON.parse(rawText);
          if (parsed?.usage) {
            console.log("[RAW_RESPONSE] usage:", JSON.stringify(parsed.usage));
          }
        } catch {
          // Streaming - look for usage in last data lines
          for (const line of lastLines) {
            if (line.includes('"usage"') && line.startsWith("data:")) {
              try {
                const chunk = JSON.parse(line.replace("data: ", ""));
                if (chunk?.usage) console.log("[RAW_RESPONSE] stream usage chunk:", JSON.stringify(chunk.usage));
              } catch {}
            }
          }
        }
      } catch (e) {

Copilot AI Apr 3, 2026


The added raw-response debug logging joins and prints response body content (including streaming chunks) to stdout. This can leak sensitive data (prompts/completions, PII, keys) and adds significant CPU/memory overhead on every request. Please remove this, or gate it behind a disabled-by-default debug flag with strict redaction/size limits.

Suggested change: delete the entire debug logging block quoted above (lines +807 to +832).

Comment on lines +907 to +925
      console.log("[NORMALIZE] params:", JSON.stringify({ isStream: normParams.isStream, provider: normParams.provider, responseFormat: normParams.responseFormat, bodyMapping: normParams.bodyMapping }));
      openAIResponse = await normalizeAIGatewayResponse(normParams);
      // Log the normalized usage
      try {
        if (normParams.isStream) {
          // For streams, find the last chunk with usage
          const normLines = openAIResponse.split("\n").filter((l: string) => l.startsWith("data: ") && l.includes('"usage"'));
          const lastUsageLine = normLines[normLines.length - 1];
          if (lastUsageLine) {
            const chunk = JSON.parse(lastUsageLine.replace("data: ", ""));
            console.log("[NORMALIZED] stream usage:", JSON.stringify(chunk.usage || chunk.response?.usage));
          }
        } else {
          const normParsed = JSON.parse(openAIResponse);
          console.log("[NORMALIZED] usage:", JSON.stringify(normParsed.usage));
        }
      } catch (e) {
        console.log("[NORMALIZED] failed to log:", e);
      }

Copilot AI Apr 3, 2026


This normalization debug logging prints normalization parameters and attempts to parse/log usage from the normalized response. Besides potential sensitive-data exposure, openAIResponse.split(...) + JSON.parse in the hot path can be expensive for large streams. Please remove these logs or guard them behind a debug flag and ensure output is redacted/truncated.

Suggested change: drop all of the logging quoted above and keep only openAIResponse = await normalizeAIGatewayResponse(normParams);

});

try {
console.log("Processing response body for request:", context.message.log.request);

Copilot AI Apr 3, 2026


The handler now logs the entire request log object for every response. Request logs can contain sensitive identifiers/headers and this also increases log volume. Please remove this or gate it behind a debug flag with redaction.

Suggested change
console.log("Processing response body for request:", context.message.log.request);

Comment on lines 295 to 299
let responseBody = context.rawLog.rawResponseBody;
const requestBody = context.rawLog.rawRequestBody;
console.log("[RAW_RESONSE_BODY]", responseBody);
console.log("[RAW_REQUEST_BODY]", requestBody);


Copilot AI Apr 3, 2026


Logging raw response and request bodies here can leak sensitive data (prompt/response content, API keys) and can be extremely large, impacting performance and log costs. Also, the log tag has a typo ("RAW_RESONSE_BODY"). Please remove/gate these logs behind a debug flag with redaction and fix the tag spelling if kept.

Comment on lines 1098 to 1106
gbCost: number;
/** Format: double */
requestsCost: number;
};
};
Json: (string | number | boolean | {
[key: string]: components["schemas"]["Json"];
} | components["schemas"]["Json"][]) | null;
Json: JsonObject;
"ResultSuccess__40_Database-at-public_91_Tables_93_-at-organization_91_Row_93_-and-_role-string__41_-Array_": {
data: (({
tier: string | null;

Copilot AI Apr 3, 2026


components.schemas.Json is now JsonObject, which disallows non-object JSON roots (arrays/primitives/null). If these schemas are meant to represent arbitrary JSON blobs (as the previous recursive union did), this change makes the generated types inaccurate and can break callers. Please restore Json to a recursive JsonValue (or unknown) and address via generator config rather than manual edits.

Comment on lines 1098 to 1106
gbCost: number;
/** Format: double */
requestsCost: number;
};
};
Json: (string | number | boolean | {
[key: string]: components["schemas"]["Json"];
} | components["schemas"]["Json"][]) | null;
Json: JsonObject;
"ResultSuccess__40_Database-at-public_91_Tables_93_-at-organization_91_Row_93_-and-_role-string__41_-Array_": {
data: (({
tier: string | null;

Copilot AI Apr 3, 2026


components.schemas.Json is now typed as JsonObject, which disallows non-object JSON roots (arrays/primitives/null). If the API schema intends this to represent arbitrary JSON, this is a breaking type regression for generated clients. Please restore the previous recursive JSON union (or use unknown) and regenerate via the OpenAPI tooling.

Comment on lines 20110 to 20145
"/v1/webhooks/{webhookId}/test": {
"post": {
"operationId": "TestWebhook",
"responses": {
"200": {
"description": "Ok",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Result__success-boolean--message-string_.string_"
}
}
}
}
},
"tags": [
"Webhooks"
],
"security": [
{
"api_key": []
}
],
"parameters": [
{
"in": "path",
"name": "webhookId",
"required": true,
"schema": {
"type": "string"
}
}
]
}
},
"/v1/public/waitlist/feature": {
"post": {
"operationId": "AddToWaitlist",
"responses": {
"200": {
"description": "Ok",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Result__success-boolean--position_63_-number--alreadyOnList_63_-boolean--sharedPlatforms_63_-string-Array_.string_"
}
}
}
}
},
"tags": [
"Waitlist"
],
"security": [],
"parameters": [],
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"properties": {
"feature": {
"type": "string"
},
"email": {
"type": "string"
}
},
"required": [
"feature",
"email"
],
"type": "object"
}
}
}
}
}
},
"/v1/public/waitlist/feature/status": {
"get": {
"operationId": "IsOnWaitlist",
"responses": {
"200": {
"description": "Ok",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Result__isOnWaitlist-boolean_.string_"
}
}
}
}
},
"tags": [
"Waitlist"
],
"security": [],
"parameters": [
{
"in": "query",
"name": "email",
"required": true,
"schema": {
"type": "string"
}
},
{
"in": "query",
"name": "feature",
"required": true,
"schema": {
"type": "string"
}
}
]
}
},
"/v1/public/waitlist/feature/count": {
"get": {
"operationId": "GetWaitlistCount",
"responses": {
"200": {
"description": "Ok",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Result__count-number_.string_"
}
}
}
}
},
"tags": [
"Waitlist"
],
"security": [],
"parameters": [
{
"in": "query",
"name": "feature",
"required": true,
"schema": {
"type": "string"
}
}
]
}
},
"/v1/public/waitlist/feature/share": {
"post": {
"operationId": "TrackShare",
"responses": {
"200": {
"description": "Ok",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Result__success-boolean--newPosition_63_-number--message-string_.string_"
}
}
}
}
},
"tags": [
"Waitlist"
],
"security": [],
"parameters": [],
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"properties": {
"platform": {
"type": "string",
"enum": [
"twitter",
"linkedin"
]
},
"feature": {
"type": "string"
},
"email": {
"type": "string"
}
},
"required": [
"platform",
"feature",
"email"
],
"type": "object"
"responses": {
"200": {
"description": "Ok",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Result__success-boolean--message-string_.string_"
}
}
}
}
}
},
"tags": [
"Webhooks"
],
"security": [
{
"api_key": []
}
],
"parameters": [
{
"in": "path",
"name": "webhookId",
"required": true,
"schema": {
"type": "string"
}
}
]
}
},
"/v1/vault/add": {

Copilot AI Apr 3, 2026


docs/swagger.json no longer contains several /v1/public/* endpoints (e.g., waitlist/stats/alert-banner). This is a potentially breaking documentation/spec change and seems out of scope for a token-mapping bug fix PR. Please confirm the underlying API actually removed these routes (and update callers), or regenerate swagger from the intended source so docs/types stay consistent.

Comment on lines +166 to +171
// If cached > prompt_tokens, the data follows Anthropic convention where
// prompt_tokens is already the non-cached input count. Don't subtract.
// Otherwise assume OpenAI convention where prompt_tokens includes cached.
const effectivePromptTokens = cachedTokens > promptTokens
? Math.max(0, promptTokens - promptAudioTokens)
: Math.max(0, promptTokens - cachedTokens - promptAudioTokens);
Collaborator Author


In theory this shouldn't execute, but it's a guard rail so that the issue we just solved won't happen again in the future.

const inputTokens = event.usage.input_tokens ?? this.inputTokens;
this.finalUsage = {
prompt_tokens: event.usage.input_tokens ?? this.inputTokens,
prompt_tokens: inputTokens + (cachedTokens ?? 0),
Collaborator Author


this is the fix
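
A quick illustrative check of why the mapping matters, with made-up numbers:

```typescript
// Made-up numbers: a request with 2000 cache-read tokens and 100 fresh input
// tokens. Anthropic reports input_tokens excluding cache reads, while OpenAI's
// prompt_tokens is expected to include them.
const inputTokens = 100;   // Anthropic input_tokens (non-cached portion)
const cachedTokens = 2000; // cache-read tokens, reported separately

const oldPromptTokens = inputTokens;                // before the fix: 100 (undercounts)
const newPromptTokens = inputTokens + cachedTokens; // after the fix: 2100
```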

Comment on lines +81 to +89
// If cached > prompt_tokens, the data follows Anthropic convention where
// prompt_tokens is already the non-cached input count. Don't subtract.
const usage = response.usage;
const effectivePromptTokens = usage?.prompt_tokens !== undefined
? Math.max(0, (usage.prompt_tokens ?? 0) - (usage.prompt_tokens_details?.cached_tokens ?? 0) - (usage.prompt_tokens_details?.audio_tokens ?? 0))
: Math.max(0, (usage.input_tokens ?? 0) - (usage.input_tokens_details?.cached_tokens ?? 0));
const gPromptToks = usage?.prompt_tokens ?? usage?.input_tokens ?? 0;
const gCachedToks = usage?.prompt_tokens_details?.cached_tokens ?? usage?.input_tokens_details?.cached_tokens ?? 0;
const gAudioToks = usage?.prompt_tokens_details?.audio_tokens ?? 0;
const effectivePromptTokens = gCachedToks > gPromptToks
? Math.max(0, gPromptToks - gAudioToks)
: Math.max(0, gPromptToks - gCachedToks - gAudioToks);
Collaborator Author


another guard rail
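
The guard-rail heuristic quoted above, restated as a standalone sketch (illustrative; the processors inline this rather than calling a helper):

```typescript
// Heuristic from the guard rails: if cachedToks > promptToks the payload is
// Anthropic-style (prompt_tokens already excludes cached), so don't subtract;
// otherwise it is OpenAI-style and cached tokens are subtracted out.
function effectivePromptTokens(
  promptToks: number,
  cachedToks: number,
  audioToks = 0
): number {
  return cachedToks > promptToks
    ? Math.max(0, promptToks - audioToks)
    : Math.max(0, promptToks - cachedToks - audioToks);
}
```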

Comment on lines +105 to +109
const rInputToks = usageData.input_tokens ?? 0;
const rCachedToks = usageData.input_tokens_details?.cached_tokens ?? 0;
const effectivePromptTokens = rCachedToks > rInputToks
? rInputToks
: Math.max(0, rInputToks - rCachedToks);
Collaborator Author


guard rail

Comment on lines +154 to +162
// If cached > prompt_tokens, the data follows Anthropic convention where
// prompt_tokens is already the non-cached input count. Don't subtract.
const promptToks = usageData.prompt_tokens ?? usageData.input_tokens ?? 0;
const cachedToks = usageData.prompt_tokens_details?.cached_tokens
?? usageData.input_tokens_details?.cached_tokens ?? 0;
const audioToks = usageData.prompt_tokens_details?.audio_tokens ?? 0;
const effectivePromptTokens = cachedToks > promptToks
? Math.max(0, promptToks - audioToks)
: Math.max(0, promptToks - cachedToks - audioToks);
Collaborator Author


guard rail

chitalian merged commit aa7ea34 into main Apr 4, 2026
11 of 17 checks passed
chitalian deleted the hammad/fix/anthropic-token-count branch April 4, 2026 01:23