fix: use actual max_completion_tokens from OpenRouter API (#5240)
- Update parseOpenRouterModel to always use actual max_completion_tokens from OpenRouter API
- Remove the artificial restriction under which only reasoning-budget and Anthropic models received their actual max tokens
- Fall back to 20% of context window when max_completion_tokens is null
- Update getModelMaxOutputTokens to use same fallback logic for consistency
- Update tests to reflect new behavior
- Fixes issue where reserved tokens showed ~209k instead of actual model limits (e.g. GPT-4o: 16,384)
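
The fallback rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `resolveMaxOutputTokens`, the `ModelLimits` shape, and the choice to round down are assumptions made for the example, and the 128,000-token context window is used only as an illustrative value.

```typescript
// Hypothetical shape for the two fields relevant here; the real
// OpenRouter model payload contains many more properties.
interface ModelLimits {
	contextWindow: number
	maxCompletionTokens: number | null
}

// Assumed fallback fraction, taken from the commit message ("20% of context window").
const FALLBACK_FRACTION = 0.2

// Hypothetical helper: prefer the limit reported by the API, and fall back
// to 20% of the context window only when the API reports none.
function resolveMaxOutputTokens(limits: ModelLimits): number {
	if (limits.maxCompletionTokens !== null) {
		return limits.maxCompletionTokens
	}
	// Rounding down is an assumption of this sketch.
	return Math.floor(limits.contextWindow * FALLBACK_FRACTION)
}

// A model that reports its limit keeps it.
const reported = resolveMaxOutputTokens({ contextWindow: 128_000, maxCompletionTokens: 16_384 })

// A model with a null limit gets 20% of its context window.
const fallback = resolveMaxOutputTokens({ contextWindow: 128_000, maxCompletionTokens: null })

console.log(reported, fallback)
```

Using one helper for both the parsing step and the later lookup is one way to keep the two code paths consistent, which is the stated goal of the `getModelMaxOutputTokens` change.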