fix: prevent LM Studio context overflow by limiting maxTokens to 20% of context window #7389
Conversation
- Fix context limit crash issue by properly calculating maxTokens
- Previously maxTokens was set equal to contextWindow, causing overflow
- Now maxTokens is calculated as 20% of contextWindow to leave room for input
- Add test coverage for the new calculation logic

Fixes #7388
Reviewing my own code because apparently I trust no one, not even myself.
```typescript
// Calculate maxTokens as 20% of context window to prevent context overflow
// This ensures there's always room for input tokens and prevents crashes
// when approaching the context limit
const maxOutputTokens = Math.ceil(contextLength * 0.2)
```
Could we consider making this ratio configurable? While 20% is a reasonable default that matches other providers, some users might want to adjust this based on their specific use cases. Perhaps a setting like lmstudio.maxOutputRatio with a default of 0.2?
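Hypothetically, that might look something like the following. The setting name and the 0.2 default come from the comment above; the `settings` lookup is only a stand-in for the extension's real configuration API:

```typescript
// Illustrative sketch only: `settings` stands in for the real config API.
const settings: Record<string, number | undefined> = {
	"lmstudio.maxOutputRatio": 0.2, // user-configurable, defaults to 0.2
}

function getMaxOutputTokens(contextLength: number): number {
	const ratio = settings["lmstudio.maxOutputRatio"] ?? 0.2
	return Math.ceil(contextLength * ratio)
}

console.log(getMaxOutputTokens(128000)) // 25600 with the default ratio
```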
```typescript
// Calculate maxTokens as 20% of context window to prevent context overflow
// This ensures there's always room for input tokens and prevents crashes
// when approaching the context limit
const maxOutputTokens = Math.ceil(contextLength * 0.2)
```
For models with very small context windows (e.g., < 1000 tokens), this 20% calculation might result in very limited output capacity. Should we consider adding a minimum threshold? Something like:
Suggested change:

```typescript
// Calculate maxTokens as 20% of context window to prevent context overflow
// This ensures there's always room for input tokens and prevents crashes
// when approaching the context limit
const calculatedMaxTokens = Math.ceil(contextLength * 0.2)
// Ensure a minimum of 200 tokens for very small context windows
const maxOutputTokens = Math.max(calculatedMaxTokens, Math.min(200, contextLength))
```
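To make the floor's effect concrete, here is a standalone check of what the suggested formula would yield at a few context sizes (not part of the PR itself):

```typescript
// Standalone check of the suggested minimum-threshold formula.
function suggestedMaxTokens(contextLength: number): number {
	const calculated = Math.ceil(contextLength * 0.2)
	return Math.max(calculated, Math.min(200, contextLength))
}

console.log(suggestedMaxTokens(512)) // 200 — the floor kicks in (20% alone would be 103)
console.log(suggestedMaxTokens(8192)) // 1639 — 20% already clears the floor
console.log(suggestedMaxTokens(150)) // 150 — the floor is capped at the whole window
```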
```typescript
{ contextLength: 8192, expectedMaxTokens: Math.ceil(8192 * 0.2) }, // 1639
{ contextLength: 128000, expectedMaxTokens: Math.ceil(128000 * 0.2) }, // 25600
{ contextLength: 200000, expectedMaxTokens: Math.ceil(200000 * 0.2) }, // 40000
]
```
Great test coverage! The test cases cover a good range of context sizes. Consider also adding a test case for very small context windows (e.g., 512 tokens) to ensure the calculation works correctly at the lower bounds.
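A sketch of such a case, assuming the current formula with no minimum floor (it would follow the same shape as the entries above):

```typescript
// Hypothetical lower-bound test case for very small context windows.
{ contextLength: 512, expectedMaxTokens: Math.ceil(512 * 0.2) }, // 103
```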
So when can we expect this to be merged in the next update?
Summary
This PR fixes the LM Studio crash issue when the context reaches 128K tokens despite having a 200K limit configured.
Problem
Users were experiencing crashes with LM Studio when the context reached 128K tokens, even though they had configured a 200K context limit. The error message was:
Root Cause
The issue was in the parseLMStudioModel function, where maxTokens (maximum output tokens) was being set equal to contextWindow. This meant the model was trying to reserve the entire context window for output, leaving no room for input tokens and causing an overflow.
Solution
Calculate maxTokens as 20% of the contextWindow instead of using the full context size.
Changes
Updated the parseLMStudioModel function to calculate maxTokens = Math.ceil(contextLength * 0.2); a sketch of the change is shown below.
Testing
Added test coverage for the new calculation logic across a range of context sizes.
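For illustration, a minimal sketch of the fixed logic (the function shape and field names here are assumptions for readability, not the actual source):

```typescript
// Hypothetical sketch of the fix; types and field names are illustrative.
interface ParsedModel {
	contextWindow: number
	maxTokens: number
}

function parseLMStudioModel(contextLength: number): ParsedModel {
	// Before: maxTokens = contextLength, reserving the whole window for
	// output and leaving no room for input tokens (the overflow crash).
	// After: reserve only 20% of the window for output.
	const maxOutputTokens = Math.ceil(contextLength * 0.2)
	return {
		contextWindow: contextLength,
		maxTokens: maxOutputTokens,
	}
}
```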
Fixes #7388
Important
Fixes LM Studio crash by setting maxTokens to 20% of contextWindow in parseLMStudioModel, ensuring space for input tokens.
- Set maxTokens to 20% of contextWindow in parseLMStudioModel.
- Added tests in lmstudio.test.ts for various contextLength values to verify the maxTokens calculation.
- Added comments in lmstudio.ts explaining the rationale for the maxTokens calculation.