fix(provider/xai): handle inconsistent cached token reporting#12485

Merged
dancer merged 7 commits into main from josh/fix-xai-cached-token-reporting
Feb 12, 2026
Conversation

@dancer
Collaborator

@dancer dancer commented Feb 12, 2026

## background

xAI's token reporting is inconsistent across models. most models report `prompt_tokens`/`input_tokens` inclusive of cached tokens (like OpenAI), but some models (e.g. `grok-4-1-fast-non-reasoning`) report them exclusive of cached tokens, where `cached_tokens > prompt_tokens`.

## summary

- detect which reporting style xAI is using based on whether `cached_tokens <= prompt_tokens`
- when inclusive (normal): subtract cached from prompt to get noCache (OpenAI pattern)
- when exclusive (anomalous): prompt tokens already represent noCache, add cached for total (Anthropic pattern)
- applies to both the chat completions and responses APIs
- add unit tests for the non-inclusive reporting edge case
- add a responses usage test file
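the detection heuristic above can be sketched as follows. this is an illustrative TypeScript sketch, not the SDK's actual converter code: the raw field names (`prompt_tokens`, `cached_tokens`) come from the PR description, while `normalizeXaiInputUsage` and the `NormalizedUsage` shape are hypothetical names for this example.

```typescript
// Raw input-token fields as reported by xAI (per the PR description).
interface RawUsage {
  prompt_tokens: number;
  cached_tokens: number;
}

// Normalized input-token buckets (illustrative shape).
interface NormalizedUsage {
  noCache: number;   // uncached input tokens
  cacheRead: number; // input tokens served from the prompt cache
  total: number;     // total input tokens
}

function normalizeXaiInputUsage(raw: RawUsage): NormalizedUsage {
  const { prompt_tokens, cached_tokens } = raw;
  if (cached_tokens <= prompt_tokens) {
    // Inclusive reporting (OpenAI pattern): prompt_tokens already
    // contains the cached tokens, so subtract to get the uncached part.
    return {
      noCache: prompt_tokens - cached_tokens,
      cacheRead: cached_tokens,
      total: prompt_tokens,
    };
  }
  // Exclusive reporting (anomalous, e.g. grok-4-1-fast-non-reasoning):
  // prompt_tokens is already the uncached count; add cached for the total.
  return {
    noCache: prompt_tokens,
    cacheRead: cached_tokens,
    total: prompt_tokens + cached_tokens,
  };
}
```

without the `cached_tokens <= prompt_tokens` branch, the exclusive case produces a negative `noCache`, which is the bug this PR fixes.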

## verification

**gateway bug case (cached > prompt)**

```
before: total=4142, noCache=-186, cacheRead=4328
after:  total=8470, noCache=4142, cacheRead=4328
```

**normal case (cached <= prompt)**

```
raw:   input_tokens: 12, cached_tokens: 3
sdk:   noCache: 9, cacheRead: 3, total: 12
```
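the before/after numbers are internally consistent and can be checked in a couple of lines (assuming the input total is `noCache + cacheRead`):

```typescript
// Quick arithmetic check of the verification numbers above.
const bugCase = { noCache: 4142, cacheRead: 4328 };
console.log(bugCase.noCache + bugCase.cacheRead); // 8470 — the fixed "after" total

const normalCase = { inputTokens: 12, cachedTokens: 3 };
console.log(normalCase.inputTokens - normalCase.cachedTokens); // 9 — the reported noCache value
```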

## checklist

- [x] tests have been added / updated (for bug fixes / features)
- [ ] documentation has been added / updated (for bug fixes / features)
- [x] a _patch_ changeset for relevant packages has been added (run `pnpm changeset` in root)
- [x] i have reviewed this pull request (self-review)

@vercel-ai-sdk bot added the `ai/provider` (related to a provider package; must be assigned together with at least one `provider/*` label) and `provider/community` labels Feb 12, 2026
@gr2m added the `provider/xai` (issues related to the @ai-sdk/xai provider) label and removed the `provider/community` label Feb 12, 2026
@dancer added the `backport` (admins only: backport to the prior version) label Feb 12, 2026
@dancer dancer merged commit 7ccb902 into main Feb 12, 2026
36 checks passed
@dancer dancer deleted the josh/fix-xai-cached-token-reporting branch February 12, 2026 22:51
vercel-ai-sdk bot pushed a commit that referenced this pull request Feb 12, 2026
@vercel-ai-sdk bot removed the `backport` label Feb 12, 2026
@vercel-ai-sdk
Contributor

vercel-ai-sdk bot commented Feb 12, 2026

⚠️ Backport to release-v5.0 created but has conflicts: #12518

dancer added a commit that referenced this pull request Feb 12, 2026
fix(provider/xai): handle inconsistent cached token reporting (#12518)

## background

backport of #12485 to `release-v5.0`

xAI's token reporting is inconsistent across models. most models report
`prompt_tokens`/`input_tokens` inclusive of cached tokens (like OpenAI),
but some models (e.g. `grok-4-1-fast-non-reasoning`) report them
exclusive of cached tokens, where `cached_tokens > prompt_tokens`

## summary

- add `convertXaiChatUsage` and `convertXaiResponsesUsage` converter
functions
- detect which reporting style xAI is using based on whether
`cached_tokens <= prompt_tokens`
- when inclusive (normal): use prompt tokens as-is
- when exclusive (anomalous): add cached tokens to prompt for total
input tokens
- applies to both chat completions and responses APIs
- adapted for v5 `LanguageModelV2Usage` flat format (vs v6 structured
format)
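a minimal sketch of how the same heuristic might map onto a flat usage shape. the field names below are assumptions for illustration, not the actual `LanguageModelV2Usage` definition:

```typescript
// Hypothetical flat usage shape (field names are assumptions, not the
// SDK's actual v5 type). With only flat counters, the exclusive case
// folds cached tokens into inputTokens instead of a separate noCache bucket.
interface FlatUsage {
  inputTokens: number;
  cachedInputTokens: number;
}

function toFlatInputUsage(prompt: number, cached: number): FlatUsage {
  if (cached <= prompt) {
    // Inclusive reporting: use prompt tokens as-is.
    return { inputTokens: prompt, cachedInputTokens: cached };
  }
  // Exclusive reporting: add cached tokens to prompt for total input tokens.
  return { inputTokens: prompt + cached, cachedInputTokens: cached };
}
```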

## verification

<details>
<summary>tests</summary>

```
 ✓ src/convert-xai-chat-usage.test.ts  (6 tests) 6ms
 ✓ src/responses/convert-xai-responses-usage.test.ts  (6 tests) 6ms

 Test Files  2 passed (2)
      Tests  12 passed (12)
```

</details>

## checklist

- [x] tests have been added / updated (for bug fixes / features)
- [ ] documentation has been added / updated (for bug fixes / features)
- [x] a _patch_ changeset for relevant packages has been added (run
`pnpm changeset` in root)
- [x] i have reviewed this pull request (self-review)

## related issues

backport of #12485

---------

Co-authored-by: josh <josh@afterima.ge>
gr2m pushed a commit that referenced this pull request Feb 16, 2026