fix(usage): map OpenAI cached_tokens to _cache_read_input_tokens #20878
Conversation
Greptile Overview

Greptile Summary: This PR updates `Usage.__init__` so that OpenAI's `prompt_tokens_details.cached_tokens` is mapped to the internal `_cache_read_input_tokens` attribute. Unit tests were added to cover the OpenAI cached tokens mapping behavior and a few edge cases around missing/zero values.

Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/types/utils.py | Adds an OpenAI-specific fallback mapping in Usage.__init__ to populate _cache_read_input_tokens from prompt_tokens_details.cached_tokens when not already set by other provider mappings. |
| tests/test_litellm/types/test_types_utils.py | Adds tests for OpenAI cached token mapping and related edge cases; one non-overwrite test currently can’t detect regressions because it uses identical values. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Provider as OpenAI/Azure response
    participant UsageInit as Usage.__init__
    participant PromptDetails as PromptTokensDetailsWrapper
    participant UI as Admin UI
    Provider->>UsageInit: Construct Usage(**usage_dict)
    UsageInit->>PromptDetails: Wrap prompt_tokens_details dict
    UsageInit->>UsageInit: Apply provider mappings (Anthropic/DeepSeek)
    UsageInit->>UsageInit: If _cache_read_input_tokens == 0 and cached_tokens > 0
    UsageInit->>UsageInit: Set _cache_read_input_tokens = cached_tokens
    UsageInit->>UI: UI reads cache read tokens from Usage._cache_read_input_tokens
```
2 files reviewed, 2 comments
```python
## OPENAI MAPPING - populate _cache_read_input_tokens from prompt_tokens_details.cached_tokens ##
if (
    self._cache_read_input_tokens == 0
    and _prompt_tokens_details is not None
    and _prompt_tokens_details.cached_tokens is not None
    and _prompt_tokens_details.cached_tokens > 0
):
    self._cache_read_input_tokens = _prompt_tokens_details.cached_tokens
```
Cache tokens lost when None
Usage.__init__ only sets _cache_read_input_tokens from prompt_tokens_details.cached_tokens when it’s > 0 (litellm/types/utils.py:1559-1566). If OpenAI returns a valid cached_tokens value of 0 (or the UI expects to reflect that the field was present), this mapping won’t run and you can’t distinguish “absent” vs “present but zero”. Since the PR intent is to map OpenAI’s field, consider setting _cache_read_input_tokens when cached_tokens is not None (and still keep the “don’t overwrite provider-set value” guard).
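If it helps, here is a minimal standalone sketch of that rule (the helper function and its name are hypothetical and only illustrate the guard; the real change would live inside `Usage.__init__`):

```python
from typing import Optional


def map_openai_cached_tokens(
    current_cache_read: int, cached_tokens: Optional[int]
) -> int:
    """Illustrative rule: map whenever cached_tokens is present (not None),
    but never overwrite a value another provider mapping already set."""
    if current_cache_read == 0 and cached_tokens is not None:
        return cached_tokens
    return current_cache_read


assert map_openai_cached_tokens(0, 750) == 750    # OpenAI cached_tokens mapped
assert map_openai_cached_tokens(0, 0) == 0        # "present but zero" still runs the mapping
assert map_openai_cached_tokens(0, None) == 0     # missing field leaves the default
assert map_openai_cached_tokens(500, 300) == 500  # provider-set value is preserved
```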
```python
# Anthropic passes cache_read_input_tokens explicitly in **params
usage = Usage(
    prompt_tokens=1000,
    completion_tokens=50,
    total_tokens=1050,
    prompt_tokens_details={"cached_tokens": 500},
    cache_read_input_tokens=500,
)

# Should use the explicit Anthropic value, not overwrite it
assert usage._cache_read_input_tokens == 500
assert usage.prompt_tokens_details.cached_tokens == 500
```
Non-overwrite test is weak
In test_usage_anthropic_cache_read_not_overwritten_by_prompt_details the two inputs that could conflict are identical, so the test can’t detect an overwrite regression (tests/test_litellm/types/test_types_utils.py:200-206). Change the test to pass different values for the two fields and assert the explicit cache-read value is preserved.
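For example (the test keeps its existing name; the 300/500 split is there only so an overwrite would actually be detectable):

```python
from litellm.types.utils import Usage


def test_usage_anthropic_cache_read_not_overwritten_by_prompt_details():
    # Deliberately different values: explicit cache_read_input_tokens (500)
    # vs prompt_tokens_details.cached_tokens (300), so a regression where the
    # OpenAI fallback overwrites the explicit value would fail the assertion.
    usage = Usage(
        prompt_tokens=1000,
        completion_tokens=50,
        total_tokens=1050,
        prompt_tokens_details={"cached_tokens": 300},
        cache_read_input_tokens=500,
    )

    assert usage._cache_read_input_tokens == 500
```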
@greptile can you review the last commit?
Greptile Overview

Greptile Summary: This PR updates the `Usage` class so that OpenAI's `prompt_tokens_details.cached_tokens` is mapped to `_cache_read_input_tokens`. The change lives in `Usage.__init__` in `litellm/types/utils.py`.

Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/types/utils.py | Adds a fallback mapping in Usage.__init__ to set _cache_read_input_tokens from prompt_tokens_details.cached_tokens when not already set. |
| tests/test_litellm/types/test_types_utils.py | Adds tests for OpenAI cached_tokens mapping, but includes a failing assertion in the Anthropic non-overwrite test due to how prompt_tokens_details is applied at the end of Usage.__init__. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Provider as ProviderResponse
    participant UsageInit as UsageInit
    participant Ptd as PromptTokensDetails
    Provider->>UsageInit: construct with usage dict
    UsageInit->>UsageInit: parse prompt token details
    alt explicit cache-read provided
        UsageInit->>UsageInit: set private cache-read tokens
        UsageInit->>Ptd: set cached_tokens
    else provider cache-hit provided
        UsageInit->>UsageInit: set private cache-read tokens
    else OpenAI cached_tokens present
        UsageInit->>UsageInit: if private cache-read is zero
        UsageInit->>UsageInit: set private cache-read from cached_tokens
    end
    UsageInit->>UsageInit: apply remaining params to model
    UsageInit-->>Provider: return Usage object
```
2 files reviewed, 1 comment
```python
# Should use the explicit Anthropic value (500), not the prompt_tokens_details value (300)
assert usage._cache_read_input_tokens == 500
```
Incorrect cached_tokens assertion
With prompt_tokens_details passed in, that value is applied after the internal mappings, so prompt_tokens_details.cached_tokens ends up 300 here (while _cache_read_input_tokens remains 500).
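In other words, for the `usage` object built in the test above, the assertions consistent with that ordering would be (a sketch of the expected values, not a full test):

```python
# The explicit Anthropic-style value wins for the private attribute...
assert usage._cache_read_input_tokens == 500
# ...while prompt_tokens_details, applied last in Usage.__init__, keeps the raw 300.
assert usage.prompt_tokens_details.cached_tokens == 300
```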
@greptile please re-review the latest commits. Here's what was addressed:
Additionally, expanded test coverage with 3 new tests:
Greptile Overview

Greptile Summary: This PR updates `Usage.__init__` to populate `_cache_read_input_tokens` from OpenAI's `prompt_tokens_details.cached_tokens`. Overall, the mapping change is small and localized to `litellm/types/utils.py`.

Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/types/utils.py | Adds OpenAI fallback mapping to populate Usage._cache_read_input_tokens from prompt_tokens_details.cached_tokens when not already set by other provider mappings. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Provider as OpenAI/Azure
    participant LiteLLM as LiteLLM Usage.__init__
    participant UI as Admin UI
    Provider->>LiteLLM: usage dict (prompt_tokens_details.cached_tokens)
    LiteLLM->>LiteLLM: Build PromptTokensDetailsWrapper
    LiteLLM->>LiteLLM: Apply provider mappings (Anthropic/DeepSeek)
    alt _cache_read_input_tokens still 0 and cached_tokens != None
        LiteLLM->>LiteLLM: _cache_read_input_tokens = cached_tokens
    end
    LiteLLM->>UI: Usage object (includes _cache_read_input_tokens)
    UI->>UI: Display Cache Read Tokens
```
1 file reviewed, 1 comment
```python
def test_usage_openai_cached_tokens_zero_does_not_set_cache_read():
    """
    When OpenAI returns cached_tokens=0, _cache_read_input_tokens should stay 0.
    """
    from litellm.types.utils import Usage

    openai_usage = {
        "prompt_tokens": 100,
        "completion_tokens": 10,
        "total_tokens": 110,
        "prompt_tokens_details": {
            "audio_tokens": 0,
            "cached_tokens": 0,
        },
    }

    usage = Usage(**openai_usage)
    assert usage._cache_read_input_tokens == 0
```
Misleading zero-value test
The implementation maps the internal cache-read counter from prompt_tokens_details.cached_tokens whenever that field is present (checked via cached_tokens is not None), even if the value is 0 (litellm/types/utils.py:1559-1565). This test’s name/docstring reads like zero should be treated as “do not map”, which contradicts the actual behavior and the broader goal of treating “present but zero” differently from “missing”. Please update the test name/docstring and/or assertions to reflect the intended contract: mapping occurs when the field is present; lack of mapping should be reserved for missing details or null cached_tokens.
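For instance, a version aligned with that contract might look like this (the test name and docstring are illustrative, not the exact wording the author should use):

```python
from litellm.types.utils import Usage


def test_usage_openai_cached_tokens_zero_is_still_mapped():
    """
    cached_tokens=0 is "present but zero": the `is not None` guard lets the
    OpenAI mapping run, and the resulting _cache_read_input_tokens is 0.
    """
    openai_usage = {
        "prompt_tokens": 100,
        "completion_tokens": 10,
        "total_tokens": 110,
        "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 0},
    }

    usage = Usage(**openai_usage)
    assert usage._cache_read_input_tokens == 0
```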
Relevant issues
Fixes #19684
Pre-Submission checklist
- Added testing in the `tests/litellm/` directory (adding at least 1 test is a hard requirement - see details)
- My PR passes all unit tests on `make test-unit`

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🐛 Bug Fix
Changes
OpenAI models return cached token counts in `usage.prompt_tokens_details.cached_tokens`, but LiteLLM was not mapping this value to the internal `_cache_read_input_tokens` private attribute on the `Usage` class. This caused the admin UI to display "Cache Read Tokens: 0" for OpenAI/Azure requests even when prompt caching was active (visible in response metadata). Cost calculation was unaffected since it reads directly from `prompt_tokens_details.cached_tokens`.

Root cause: Anthropic and Bedrock explicitly pass `cache_read_input_tokens` when constructing `Usage`, which sets the private attr. OpenAI's usage dict only contains `prompt_tokens_details.cached_tokens`, and no mapping existed from that field to `_cache_read_input_tokens`.

Fix: In `Usage.__init__`, after all provider-specific mappings (Anthropic, DeepSeek), populate `_cache_read_input_tokens` from `prompt_tokens_details.cached_tokens` if it hasn't already been set. This is safe because the `== 0` guard prevents overwriting values set by other providers.

Files changed:
- `litellm/types/utils.py`: Added fallback mapping in `Usage.__init__`
- `tests/test_litellm/types/test_types_utils.py`: Added 4 unit tests covering OpenAI cached tokens mapping, zero value, Anthropic non-overwrite, and no prompt_tokens_details
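For reviewers who want to sanity-check the mapping locally, a minimal sketch (assumes a checkout of this branch with the new fallback in `Usage.__init__`; the token counts are sample values):

```python
from litellm.types.utils import Usage

# Usage dict shaped like an OpenAI/Azure chat completions response with prompt caching active
openai_usage = {
    "prompt_tokens": 1000,
    "completion_tokens": 50,
    "total_tokens": 1050,
    "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 750},
}

usage = Usage(**openai_usage)

# Before this PR the private attr stayed 0; with the fallback it mirrors cached_tokens
assert usage._cache_read_input_tokens == 750
assert usage.prompt_tokens_details.cached_tokens == 750
```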