fix: token limits & Invalid JSON Response Errors #1934
Merged
Fix Token Limits & Invalid JSON Response Errors
Issues Resolved
Root Causes Identified
Solutions Implemented
1. Accurate Token Limits & Context Sizes
Updated all providers with their actual context window capabilities:
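The per-model limit table this implies could be sketched as follows. This is an illustrative sketch only: the model names and figures are taken from the numbers quoted in this PR description (128k/16k for GPT, 200k for Claude, 1M–2M for Gemini, 128k for Llama), not from the actual provider files.

```typescript
// Hypothetical shape for per-model limits; field names are assumptions.
interface ModelLimits {
  contextWindow: number;       // total tokens the model can attend to
  maxCompletionTokens: number; // cap on tokens generated in one response
}

const MODEL_LIMITS: Record<string, ModelLimits> = {
  'gpt-4o': { contextWindow: 128_000, maxCompletionTokens: 16_384 },
  'claude-3-5-sonnet': { contextWindow: 200_000, maxCompletionTokens: 8_192 },
  'gemini-1.5-pro': { contextWindow: 2_000_000, maxCompletionTokens: 8_192 },
  'llama-3.1-70b': { contextWindow: 128_000, maxCompletionTokens: 8_000 },
};
```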
2. Dynamic Model Intelligence
3. Enhanced Error Handling
4. Performance Optimizations
Files Modified
- `app/lib/.server/llm/constants.ts` - Updated MAX_TOKENS from 8k to 32k
- `app/lib/modules/llm/providers/openai.ts` - GPT models with accurate 128k/16k limits
- `app/lib/modules/llm/providers/anthropic.ts` - Claude models with 200k context
- `app/lib/modules/llm/providers/google.ts` - Gemini models with 1M-2M context
- `app/lib/modules/llm/providers/groq.ts` - Llama models with 128k context
- `app/lib/modules/llm/providers/together.ts` - Updated model configurations
- `app/lib/modules/llm/providers/open-router.ts` - Enhanced context detection
- `app/lib/.server/llm/stream-text.ts` - Token validation and safety caps
- `app/routes/api.chat.ts` - Comprehensive error handling improvements

Verification
Impact & Benefits
For Users:
For Developers:
For System Performance:
Technical Details
Token Limit Strategy
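One way to read the strategy: start from the new 32k default (`constants.ts`), then clamp the completion budget so that prompt plus completion never exceeds the model's context window. A minimal sketch, assuming a hypothetical helper name and safety margin (not the repo's actual code):

```typescript
// MAX_TOKENS was raised from 8k to 32k in this PR.
const DEFAULT_MAX_TOKENS = 32_768;

// Clamp the requested completion tokens to what the context window
// can still hold after the prompt, leaving a small safety margin.
function safeCompletionTokens(
  promptTokens: number,
  contextWindow: number,
  requested: number = DEFAULT_MAX_TOKENS,
): number {
  const available = contextWindow - promptTokens;
  return Math.max(1, Math.min(requested, available - 100));
}
```

For example, a 1,000-token prompt against a 128k-context model keeps the full 32k budget, while a prompt that nearly fills the window is capped rather than triggering a provider-side token-limit error.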
Error Classification
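A hedged sketch of what error classification can look like here: map raw provider failures onto a few categories so `api.chat.ts` can return a structured message instead of an opaque "Invalid JSON" error. The category names and regexes below are assumptions for illustration, not the project's actual enum:

```typescript
// Hypothetical error categories; patterns match common provider messages.
type ChatErrorKind = 'token-limit' | 'invalid-json' | 'rate-limit' | 'unknown';

function classifyError(err: unknown): ChatErrorKind {
  const msg = err instanceof Error ? err.message : String(err);
  if (/context.?length|maximum.*tokens|token limit/i.test(msg)) return 'token-limit';
  if (/unexpected token|invalid json|JSON\.parse/i.test(msg)) return 'invalid-json';
  if (/rate.?limit|429/i.test(msg)) return 'rate-limit';
  return 'unknown';
}
```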
Dynamic Context Detection
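The idea can be sketched as follows: query a provider's model-listing endpoint and read whichever limit field it exposes, falling back to a conservative default. This is a sketch under assumptions; the helper names, base-URL parameter, and 8k fallback are illustrative, though the field names (`context_length`, `max_tokens`) follow the detection list in this section:

```typescript
// Minimal shape of an entry returned by an OpenAI-compatible /v1/models API.
interface RemoteModel {
  id: string;
  context_length?: number;
  max_tokens?: number;
}

// Prefer context_length, then max_tokens, then a conservative fallback.
function contextWindowOf(m: RemoteModel, fallback: number = 8_192): number {
  return m.context_length ?? m.max_tokens ?? fallback;
}

// Fetch the model list and build an id -> context-window map.
async function fetchContextWindows(
  baseUrl: string,
  apiKey: string,
): Promise<Map<string, number>> {
  const res = await fetch(`${baseUrl}/v1/models`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  const { data } = (await res.json()) as { data: RemoteModel[] };
  return new Map(data.map((m) => [m.id, contextWindowOf(m)]));
}
```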
- `context_length` from the `/v1/models` API
- `max_tokens` from the `/v1/models` API
- `inputTokenLimit` from the Generative AI API
- `context_length` from aggregated models

Deployment Notes
Fixes: #1917
Type: Bug Fix, Enhancement
Priority: High
Breaking: No