fix(provider/xai): handle inconsistent cached token reporting #12485
Merged
Conversation
gr2m approved these changes on Feb 12, 2026
dancer added a commit that referenced this pull request on Feb 12, 2026:
fix(provider/xai): handle inconsistent cached token reporting (#12518)

## background

backport of #12485 to `release-v5.0`

xAI's token reporting is inconsistent across models. most models report `prompt_tokens`/`input_tokens` inclusive of cached tokens (like OpenAI), but some models (e.g. `grok-4-1-fast-non-reasoning`) report them exclusive of cached tokens, where `cached_tokens > prompt_tokens`

## summary

- add `convertXaiChatUsage` and `convertXaiResponsesUsage` converter functions
- detect which reporting style xAI is using based on whether `cached_tokens <= prompt_tokens`
- when inclusive (normal): use prompt tokens as-is
- when exclusive (anomalous): add cached tokens to prompt for total input tokens
- applies to both chat completions and responses APIs
- adapted for v5 `LanguageModelV2Usage` flat format (vs v6 structured format); a sketch of this conversion follows this commit message

## verification

<details>
<summary>tests</summary>

```
✓ src/convert-xai-chat-usage.test.ts (6 tests) 6ms
✓ src/responses/convert-xai-responses-usage.test.ts (6 tests) 6ms

Test Files  2 passed (2)
     Tests  12 passed (12)
```

</details>

## checklist

- [x] tests have been added / updated (for bug fixes / features)
- [ ] documentation has been added / updated (for bug fixes / features)
- [x] a _patch_ changeset for relevant packages has been added (run `pnpm changeset` in root)
- [x] i have reviewed this pull request (self-review)

## related issues

backport of #12485

---------

Co-authored-by: josh <josh@afterima.ge>
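For context, here is a minimal sketch of the conversion the backport describes, adapted to the v5 flat usage shape. The raw field names (`prompt_tokens`, `completion_tokens`, `prompt_tokens_details.cached_tokens`) are assumed to follow the OpenAI-compatible wire format, and the flat result fields (`inputTokens`, `outputTokens`, `totalTokens`, `cachedInputTokens`) are assumptions about `LanguageModelV2Usage`; the actual converter in the PR may differ in shape and naming.

```ts
// Sketch only: raw field names assume the OpenAI-compatible wire format;
// the real convertXaiChatUsage may differ.
interface XaiChatUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Assumed v5 LanguageModelV2Usage-style flat result.
interface FlatUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cachedInputTokens: number;
}

function convertXaiChatUsage(usage: XaiChatUsage): FlatUsage {
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0;

  // Detection heuristic from the PR: if cached_tokens <= prompt_tokens,
  // prompt_tokens already includes cached tokens (OpenAI style) and is
  // used as-is; otherwise it excludes them, so cached tokens are added
  // back to recover the total input token count.
  const inputTokens =
    cached <= usage.prompt_tokens
      ? usage.prompt_tokens
      : usage.prompt_tokens + cached;

  return {
    inputTokens,
    outputTokens: usage.completion_tokens,
    totalTokens: inputTokens + usage.completion_tokens,
    cachedInputTokens: cached,
  };
}
```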
gr2m pushed a commit that referenced this pull request on Feb 16, 2026:
## background

xAI's token reporting is inconsistent across models. most models report `prompt_tokens`/`input_tokens` inclusive of cached tokens (like OpenAI), but some models (e.g. `grok-4-1-fast-non-reasoning`) report them exclusive of cached tokens, where `cached_tokens > prompt_tokens`

## summary

- detect which reporting style xAI is using based on whether `cached_tokens <= prompt_tokens` (sketched after this description)
- when inclusive (normal): subtract cached from prompt to get noCache (OpenAI pattern)
- when exclusive (anomalous): prompt tokens already represent noCache, add cached for total (Anthropic pattern)
- applies to both chat completions and responses APIs
- add unit tests for the non-inclusive reporting edge case
- add responses usage test file

## verification

<details>
<summary>gateway bug case (cached > prompt)</summary>

```
before: total=4142, noCache=-186, cacheRead=4328
after:  total=8470, noCache=4142, cacheRead=4328
```

</details>

<details>
<summary>normal case (cached <= prompt)</summary>

```
raw: input_tokens: 12, cached_tokens: 3
sdk: noCache: 9, cacheRead: 3, total: 12
```

</details>

## checklist

- [x] tests have been added / updated (for bug fixes / features)
- [ ] documentation has been added / updated (for bug fixes / features)
- [x] a _patch_ changeset for relevant packages has been added (run `pnpm changeset` in root)
- [x] i have reviewed this pull request (self-review)
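To make the branch logic concrete, here is a hedged sketch using the `total`/`noCache`/`cacheRead` names from the verification output above; the actual v6 structured usage type and converter signature may differ.

```ts
// Assumed structured shape, named after the verification output above.
interface InputTokenUsage {
  total: number; // all input tokens, cached + uncached
  noCache: number; // input tokens not served from the cache
  cacheRead: number; // input tokens read from the cache
}

function convertXaiInputUsage(
  promptTokens: number,
  cachedTokens: number,
): InputTokenUsage {
  if (cachedTokens <= promptTokens) {
    // inclusive style (OpenAI pattern): prompt already contains cached,
    // so the uncached count is the difference.
    return {
      total: promptTokens,
      noCache: promptTokens - cachedTokens,
      cacheRead: cachedTokens,
    };
  }
  // exclusive style (Anthropic pattern): prompt excludes cached,
  // so the total is the sum of both.
  return {
    total: promptTokens + cachedTokens,
    noCache: promptTokens,
    cacheRead: cachedTokens,
  };
}

// gateway bug case from the verification above (cached > prompt):
// convertXaiInputUsage(4142, 4328) -> { total: 8470, noCache: 4142, cacheRead: 4328 }
// normal case (cached <= prompt):
// convertXaiInputUsage(12, 3) -> { total: 12, noCache: 9, cacheRead: 3 }
```

Before the fix, the inclusive formula was applied unconditionally, which produced the negative `noCache=-186` shown in the gateway bug case (4142 − 4328).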