fix(google): avoid double-counting cached tokens in input_token_details.text#10440

Open
guoyangzhen wants to merge 1 commit into langchain-ai:main from guoyangzhen:fix/google-cache-token-double-count
Conversation

@guoyangzhen

Problem

When using @langchain/google with Gemini models that support implicit caching, input_token_details.text includes all text tokens (including cached ones), while input_token_details.cache_read also counts those same cached tokens. Because the two buckets overlap, LangSmith double-counts cached tokens when calculating costs.

LangSmith calculates cost as:

input_cost = (text × regular_price) + (cache_read × discounted_price)

Since text includes cached tokens, those tokens are charged at both the regular rate and the discounted cache rate.

Root Cause

In convertGeminiGenerateContentResponseToUsageMetadata, Gemini's promptTokensDetails[TEXT] modality count equals the total text tokens (including cached). The code sets both text and cache_read without adjusting for the overlap.

Fix

Subtract cache_read from the TEXT modality count so the two values no longer overlap, matching how LangSmith calculates cost.

Fixes #10339

…ls.text

Gemini's promptTokensDetails[TEXT] includes cached tokens, but LangSmith
calculates cost as (text * regular_price) + (cache_read * discounted_price).
This caused cached tokens to be billed twice.

Subtract cache_read from text modality count to avoid overlap.

Fixes langchain-ai#10339
@changeset-bot
changeset-bot bot commented Mar 17, 2026

⚠️ No Changeset found

Latest commit: e0b7aaa

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Development

Successfully merging this pull request may close these issues.

@langchain/google: input_token_details.text double-counts cached tokens, inflating LangSmith cost estimates
