fix(openai): non-compatible providers send correct max_tokens field by peteski22 · Pull Request #57 · mozilla-ai/any-llm-go

peteski22 · 2026-02-26T21:01:56Z

Summary

DeepSeek and Mistral delegate to CompatibleProvider.Completion(), which maps MaxTokens → max_completion_tokens on the wire. Both APIs actually expect max_tokens, so the token limit was silently dropped.
Adds ChatCompletionRequestTransform hook to CompatibleConfig (Strategy pattern) so providers can adjust the SDK request after convertParams builds it.
DeepSeek and Mistral supply transforms that swap max_completion_tokens → max_tokens and clear unsupported fields.
Moves Mistral's user/reasoning_effort stripping from preprocessParams to transformRequest where it operates on the actual wire request.
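
To make the hook concrete, here is a minimal sketch of the Strategy idea described in this summary. It does not use the real SDK types: `wireRequest`, `requestTransform`, `compatibleConfig`, `buildRequest`, `mistralLikeTransform`, and `encode` are hypothetical names invented for illustration, and the set of cleared fields follows the Mistral description above (user and reasoning_effort). The actual ChatCompletionRequestTransform signature in the PR may differ.

```go
package main

import "encoding/json"

// wireRequest is a hypothetical stand-in for the SDK's chat completion
// request; only the fields discussed in this PR are modelled.
type wireRequest struct {
	Model               string `json:"model"`
	MaxCompletionTokens *int   `json:"max_completion_tokens,omitempty"`
	MaxTokens           *int   `json:"max_tokens,omitempty"`
	User                string `json:"user,omitempty"`
	ReasoningEffort     string `json:"reasoning_effort,omitempty"`
}

// requestTransform mirrors the idea of ChatCompletionRequestTransform: a
// provider-supplied strategy that adjusts the request after it is built.
type requestTransform func(*wireRequest)

// compatibleConfig is a hypothetical, trimmed-down CompatibleConfig.
type compatibleConfig struct {
	Transform requestTransform // nil keeps the OpenAI default field names
}

// buildRequest plays the role of convertParams followed by the hook.
func buildRequest(cfg compatibleConfig, model string, maxTokens int, user string) wireRequest {
	req := wireRequest{Model: model, MaxCompletionTokens: &maxTokens, User: user}
	if cfg.Transform != nil {
		cfg.Transform(&req)
	}
	return req
}

// mistralLikeTransform moves the limit to max_tokens and clears the
// fields this PR says Mistral does not accept.
func mistralLikeTransform(req *wireRequest) {
	req.MaxTokens = req.MaxCompletionTokens
	req.MaxCompletionTokens = nil
	req.User = ""
	req.ReasoningEffort = ""
}

// encode returns the JSON body that would go on the wire.
func encode(req wireRequest) string {
	b, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	return string(b)
}
```

With an empty `compatibleConfig{}`, `encode(buildRequest(...))` keeps `max_completion_tokens` and `user`; with `Transform: mistralLikeTransform`, the same call emits `max_tokens` and drops `user`. The base path is untouched, which is the point of the opt-in design.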

Context

Inspired by any-llm#865 which addressed the same max_tokens vs max_completion_tokens issue in the Python port. The Go approach differs: rather than changing the base compatible provider, we keep max_completion_tokens as the correct default (per OpenAI spec) and let non-compatible providers opt into transforming their requests.

Test plan

Wire-level tests using httptest.NewServer capture actual JSON bodies and assert field names
OpenAI sends max_completion_tokens (not max_tokens)
DeepSeek sends max_tokens (not max_completion_tokens)
Mistral sends max_tokens (not max_completion_tokens)
Mistral strips user and reasoning_effort from wire requests
Full test suite passes with -race
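
The wire-level approach in this test plan can be sketched roughly as follows. This is not the PR's FakeCompletionServer helper, whose code is not shown here; `captureBody` and `hasOnly` are hypothetical helpers that only illustrate the technique of capturing the JSON body with httptest.NewServer and asserting on field names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"
)

// captured carries what the fake server decoded from the request body.
type captured struct {
	body map[string]any
	err  error
}

// captureBody starts a throwaway server, POSTs payload to it as JSON, and
// returns the JSON object the server actually received.
func captureBody(payload any) (map[string]any, error) {
	ch := make(chan captured, 1) // buffered so the handler never blocks
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var got map[string]any
		err := json.NewDecoder(r.Body).Decode(&got)
		ch <- captured{body: got, err: err}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	raw, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(srv.URL, "application/json", bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	res := <-ch // the handler has run by the time the response arrives
	return res.body, res.err
}

// hasOnly reports whether body contains the key want and omits forbid.
func hasOnly(body map[string]any, want, forbid string) bool {
	_, hasWant := body[want]
	_, hasForbid := body[forbid]
	return hasWant && !hasForbid
}
```

In a real test the payload would come from the provider under test rather than a hand-built map; asserting on key names in the decoded body is what catches a limit sent under the wrong field.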

…field

DeepSeek and Mistral are not fully OpenAI-compatible but delegated to CompatibleProvider.Completion(), which unconditionally mapped MaxTokens to max_completion_tokens on the wire. Both APIs expect max_tokens.

Add ChatCompletionRequestTransform hook to CompatibleConfig (Strategy pattern) that lets providers adjust the SDK request after convertParams builds it. DeepSeek and Mistral supply transforms that swap max_completion_tokens back to max_tokens and clear unsupported fields.

Move Mistral's user/reasoning_effort stripping from preprocessParams (CompletionParams level) to transformRequest (SDK request level) where it correctly prevents the fields from being serialized.

Add FakeCompletionServer test helper and wire-level tests that capture actual JSON request bodies to assert correct field names on the wire.

peteski22 mentioned this pull request Feb 26, 2026

fix: OpenAI-compatible providers inconsistently handle max_tokens vs max_completion_tokens mozilla-ai/any-llm#867

Closed

peteski22 force-pushed the fix/max_tokens branch from 0e9c2d9 to 4b59f4f on February 26, 2026 21:24

peteski22 force-pushed the fix/max_tokens branch from 4b59f4f to 6aa62c7 on February 26, 2026 21:38

peteski22 requested review from agpituk and javiermtorres February 26, 2026 21:39

javiermtorres approved these changes Feb 27, 2026

peteski22 wants to merge 1 commit into main from fix/max_tokens

2 participants
