fix: match timeout behaviour with docs #7707
base: main
Conversation
Thank you for your contribution! I've reviewed the changes and found some issues that need attention before this can be merged.
Important: The test file … You may also want to add tests to verify that providers correctly transform 0 to 2147483647, for better test coverage of this new behavior.
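For illustration, here is a minimal sketch of what such a spec might look like, assuming a vitest-style suite (as the repo's `*.spec.ts` files suggest); `resolveTimeout` and `MAX_TIMEOUT_MS` are stand-ins for the provider-side transform, not actual exports of the codebase:

```typescript
import { describe, it, expect } from "vitest"

// Largest 32-bit signed integer; also the largest delay Node's timers accept.
const MAX_TIMEOUT_MS = 2147483647

// Stand-in for the transform this PR applies in the providers.
function resolveTimeout(configured: number): number {
  return configured > 0 ? configured : MAX_TIMEOUT_MS
}

describe("timeout transform", () => {
  it("maps 0 to 2147483647 so the timeout is effectively disabled", () => {
    expect(resolveTimeout(0)).toBe(2147483647)
  })

  it("treats negative values the same way", () => {
    expect(resolveTimeout(-1)).toBe(2147483647)
  })

  it("passes positive values through unchanged", () => {
    expect(resolveTimeout(600000)).toBe(600000)
  })
})
```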
On second thought, there's no reason not to do that; brb, will update it.
@hannesrudolph was this closed on purpose?
Yeah, it's stale. You said you were going to today and didn't, so I assumed you abandoned it.
Oh, sorry for the ambiguity / lack of clarification; per my last commit, this PR is actually ready to review / merge :)
Then reopen it?
It seems like something broke every test that imports the …
Force-pushed from 34480f2 to 683db38.
Force-pushed from 683db38 to 60eaf89.
I've fixed the tests. Two things:
Why is this still not merged?
Please merge this; this bug is not a minor inconvenience but a use-case blocker.
Exactly. This makes the entire software unusable for local models over a BUG. Nobody running SOTA models locally will be able to keep prompt processing under 5 minutes at max context.
The caching in LM Studio (llama.cpp) at least allows big prompts to eventually compute in under 5 minutes, but only after multiple retries, so I maxed out the allowed retries and the time between them. That is an ugly workaround, though, and not very reliable.
This is exactly what's happening for me as well. I'm using exl3, and eventually, after 5-10 retries, enough has been cached for the request to finish in under 5 minutes. But this is really stupid behaviour and increases the time wasted on every single request tenfold.
I found the fix. Switch to Kilo Code and let this thing rot. |
Kilo Code has the exact same issue, so what fix did you find? |
Isn't that issue closed there? That means it used to have the exact same issue, but not anymore. I can also confirm it: I enter 86400 in Kilo and it waits hours and hours; definitely something got fixed.
Despite the issue being closed, I just confirmed that it's not fixed. I set the API timeout to 86400 seconds in Kilo, generated a 44.6k-token conversation, then changed the mode (from Architect to Ask) to force the LLM (I'm running unsloth's deepseek-v3.1-terminus q2_k_xl) to reread the conversation in a single request, and it still failed at the 5-minute mark. The version is 4.119.2. It's a bit silly how widespread this bug is; n8n has it too. Anyway, Belerafon suggested a fix for Roo Code here: #7366 (comment), which I've tested. Hopefully the Roo Code devs properly patch this soon.
It definitely works for me, check this out:
Yes, this bug is so bothersome because it makes so much hardware useless, and I also happen to own some of that shit hardware. Imagine if this worked: every potato PC could be repurposed as a full-time agent working on stuff.

Related GitHub Issue
Closes: #7366
Roo Code Task Context (Optional)
Description
This PR fixes the timeout behaviour to match the docs: a timeout of 0 now actually disables the timeout by setting it to the maximum 32-bit signed integer value (2^31 - 1, i.e. 2147483647).
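For illustration only, a minimal sketch of that substitution, assuming the timeout reaches the provider in milliseconds; `effectiveTimeout` and the usage comment are illustrative, not the PR's actual code:

```typescript
// 2^31 - 1: the largest delay Node's setTimeout can handle without
// overflowing, so it effectively means "never time out".
const MAX_TIMEOUT_MS = 2_147_483_647

function effectiveTimeout(configuredMs: number): number {
  // The docs say 0 disables the timeout, so replace it (and any negative
  // value) with the 32-bit maximum instead of passing 0 to the client.
  return configuredMs > 0 ? configuredMs : MAX_TIMEOUT_MS
}

// Hypothetical usage where a provider builds its HTTP client:
// new OpenAI({ apiKey, baseURL, timeout: effectiveTimeout(getApiRequestTimeout()) })
```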
I wonder if I should change `getApiRequestTimeout` instead to return that value? I keep all of it in the providers right now. Tell me if that is preferable to the current implementation.
Test Procedure
Pre-Submission Checklist
Screenshots / Videos
Documentation Updates
Additional Notes
Get in Touch
@elianiva
Important
Fix timeout behavior to match the documentation by treating a timeout of 0 as the maximum 32-bit signed integer value (2147483647), applied across multiple API providers.
- `getApiRequestTimeout()` in `timeout-config.ts` now returns `2147483647` for a timeout of `0`, effectively disabling the timeout.
- Applied across `anthropic.ts`, `base-openai-compatible-provider.ts`, `bedrock.ts`, `cerebras.ts`, `huggingface.ts`, `lm-studio.ts`, `ollama.ts`, `openai-native.ts`, `openrouter.ts`, `qwen-code.ts`, `requesty.ts`, `router-provider.ts`, and `xai.ts`.
- `timeout-config.spec.ts` updated to cover the new behavior for zero and negative timeout values, ensuring they return the safe maximum value.
- `timeout-config.ts` tidied up for clarity and consistency.
This description was created by … for 83aebe4.