feat: add configurable API request timeout for local providers #6531
Conversation
- Add new VSCode setting roo-cline.apiRequestTimeout (default: 600s, range: 0-3600s)
- Update LM Studio, Ollama, and OpenAI handlers to use the timeout setting
- Add comprehensive tests for timeout functionality
- Helps users with local providers that need more processing time

Fixes #6521
src/api/providers/ollama.ts (Outdated)
```ts
constructor(options: ApiHandlerOptions) {
	super()
	this.options = options
	const timeoutSeconds = vscode.workspace.getConfiguration("roo-cline").get<number>("apiRequestTimeout", 600)
```
Typo detected: the configuration namespace string "roo-cline" appears in line 27. If the intended configuration is for the CLI, it might be meant to be "roo-cli" instead.
```diff
- const timeoutSeconds = vscode.workspace.getConfiguration("roo-cline").get<number>("apiRequestTimeout", 600)
+ const timeoutSeconds = vscode.workspace.getConfiguration("roo-cli").get<number>("apiRequestTimeout", 600)
```
I reviewed my own code and found it surprisingly coherent. Must be a bug in my self-assessment module.
Overall, this is a well-implemented solution that properly addresses issue #6521. The code is clean, tests are comprehensive, and the feature will help users with local providers significantly.
Suggestions for improvement:
- Temperature inconsistency in Ollama provider - The default temperature differs between streaming and non-streaming modes when using R1 format models.
- Runtime validation - Consider adding safety checks for timeout values in the providers.
- Code duplication - The timeout configuration logic could be extracted to a shared utility.
- Documentation - The setting description could include more specific examples like "Set to 1800 for 30-minute timeouts".
These are all minor improvements - the core implementation is solid and ready to merge.
This is an essential feature for many local model users. Hope it works.
- Create centralized getApiRequestTimeout() utility function
- Add comprehensive validation for timeout values (non-negative, handle NaN/null/undefined)
- Add extensive test coverage for all scenarios in a new test file
- Remove code duplication across providers (lm-studio.ts, ollama.ts, openai.ts)
- Maintain backward compatibility with existing behavior
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
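For illustration, the centralized utility described in this commit could look roughly like the sketch below; the module location, the exact validation rules, and the choice to return milliseconds are assumptions rather than the PR's actual code:

```ts
import * as vscode from "vscode"

const DEFAULT_TIMEOUT_SECONDS = 600

// Reads the user-configured API request timeout (in seconds) and returns a
// value suitable for the OpenAI SDK's timeout option (in milliseconds),
// falling back to the default for missing, NaN, or negative values.
export function getApiRequestTimeout(): number {
	const raw = vscode.workspace
		.getConfiguration("roo-cline")
		.get<number>("apiRequestTimeout", DEFAULT_TIMEOUT_SECONDS)

	const seconds =
		typeof raw === "number" && Number.isFinite(raw) && raw >= 0 ? raw : DEFAULT_TIMEOUT_SECONDS

	return seconds * 1000
}
```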
daniel-lxs left a comment:
LGTM
So does this actually work? As I mentioned in #6570, there seems to be a problem with the underlying low-level fetch implementation (undici). I tried to see if I could make it work, but none of the approaches I tried, even modifying the global undici settings, did anything. It might be that this is somehow configurable via VS Code internals for the Extension Host, but I haven't been able to figure out how.
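For context, "modifying the global undici settings" presumably refers to something along these lines; this is a sketch of that approach rather than code from the PR, and per the report above it did not help:

```ts
import { Agent, setGlobalDispatcher } from "undici"

// Route requests through a dispatcher with the header and body timeouts
// disabled (a value of 0 turns both timeouts off).
setGlobalDispatcher(
	new Agent({
		headersTimeout: 0,
		bodyTimeout: 0,
	}),
)
```

One plausible reason this has no effect is that Node's built-in fetch ships with its own bundled copy of undici, so a dispatcher registered through the npm package may not be the one the Extension Host's fetch actually consults.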
Yeah, it still doesn't work for me 😢 My interim solution is a proxy server between llama.cpp and Roo Code that sends blank deltas to Roo every few seconds to keep the connection open. That means my local model can take forever to respond and Roo is happy with it. If anyone would find it useful (until there is a proper fix to the timeout issue), let me know and I'll send it over.
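The proxy itself isn't posted in the thread. Purely as an illustration of the idea, a keep-alive proxy in front of llama.cpp could look something like the sketch below; the ports, the paths, and the use of SSE comment lines instead of blank deltas are all assumptions, and it glosses over non-streaming endpoints and error handling:

```ts
import http from "node:http"

const UPSTREAM = { host: "127.0.0.1", port: 8080 } // assumed llama.cpp server
const PROXY_PORT = 8081 // point Roo Code's base URL at http://127.0.0.1:8081/v1

http
	.createServer((req, res) => {
		// Commit to a streaming response right away so keep-alive bytes can be
		// sent before the model produces its first token.
		res.writeHead(200, { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" })

		// SSE comment lines are ignored by clients but reset idle/body timeouts.
		const keepAlive = setInterval(() => res.write(": keep-alive\n\n"), 5000)

		// Forward the original request to llama.cpp and pipe the real stream
		// back once bytes start arriving.
		const proxyReq = http.request(
			{ ...UPSTREAM, path: req.url, method: req.method, headers: req.headers },
			(proxyRes) => {
				proxyRes.once("data", () => clearInterval(keepAlive))
				proxyRes.pipe(res)
			},
		)
		proxyReq.on("error", () => {
			clearInterval(keepAlive)
			res.end()
		})
		req.pipe(proxyReq)
	})
	.listen(PROXY_PORT)
```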
This works as expected. Now I can go grind some coffee beans while the local model natters away with Roo Code. Thank you!
@cybrah I'm running the model locally, and it's very slow, but it basically works for simple tasks. However, for complex tasks the request is shut down with a "terminated" message before the server returns any token at the start of the task, while the local server keeps running. Setting roo-cline.apiRequestTimeout doesn't work for me. My version is 3.25.20 (81cba18). Could you please provide the details of your proxy method? Thank you.
@daniel-lxs I'll add that my case was exactly the one mentioned by @julong111 here.
@pwilkin @daniel-lxs LM Studio debug log:
@julong111 Yeah, it's caused by undici's BodyTimeout, see #6570.
Maybe you're hitting the default 10-minute timeout. In my case I set "roo-cline.apiRequestTimeout": 1200 for a 20-minute timeout.
Summary
This PR implements a configurable timeout setting for API requests to help users with local providers like LM Studio and Ollama that may need more processing time for large models.
Problem
As reported in #6521, when using large models with local providers (LM Studio, Ollama, etc.) that need to split processing between GPU and CPU, Roo Code times out after 1-2 minutes before the provider can finish processing. This causes the provider to drop context and restart processing, making it difficult to use large local models effectively.
Solution
Added a new VSCode setting roo-cline.apiRequestTimeout that allows users to configure the timeout for all API providers.
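Concretely, the wiring in each handler amounts to reading the setting and passing it to the OpenAI-compatible client. A rough sketch, assuming the OpenAI SDK's timeout option (in milliseconds) and an Ollama-style local endpoint:

```ts
import * as vscode from "vscode"
import OpenAI from "openai"

// The setting is expressed in seconds; the OpenAI SDK expects milliseconds.
const timeoutSeconds = vscode.workspace
	.getConfiguration("roo-cline")
	.get<number>("apiRequestTimeout", 600)

const client = new OpenAI({
	baseURL: "http://localhost:11434/v1", // e.g. a local Ollama endpoint
	apiKey: "ollama", // local servers typically ignore the key
	timeout: timeoutSeconds * 1000,
})
```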
Changes

- Added VSCode setting in src/package.json: roo-cline.apiRequestTimeout with appropriate validation and description
- Updated provider handlers to read and use the timeout setting:
  - src/api/providers/lm-studio.ts: Added timeout to OpenAI client constructor
  - src/api/providers/ollama.ts: Added timeout to OpenAI client constructor
  - src/api/providers/openai.ts: Added timeout to all client constructors (OpenAI, AzureOpenAI, Azure AI Inference)
- Added comprehensive tests for timeout functionality:
  - src/api/providers/__tests__/lm-studio-timeout.spec.ts
  - src/api/providers/__tests__/ollama-timeout.spec.ts
  - src/api/providers/__tests__/openai-timeout.spec.ts
- Added localization in src/package.nls.json for the new setting description

Testing
Usage
Users can now configure the timeout in their VSCode settings:
{ "roo-cline.apiRequestTimeout": 1200 // 20 minutes for very large models }This is especially useful for:
Fixes #6521
Important
Adds configurable API request timeout setting for local providers, with updates to provider handlers, tests, and localization.
- roo-cline.apiRequestTimeout setting in src/package.json for configurable API request timeout.
- lm-studio.ts, ollama.ts, openai.ts: Updated to use the new timeout setting.
- Tests added: lm-studio-timeout.spec.ts, ollama-timeout.spec.ts, openai-timeout.spec.ts.
- Localization: package.nls.*.json files updated for the new setting description.