Increase nginx proxy_read_timeout from 300s to 3600s #1
Merged
Evrard-Nil merged 4 commits into main on Mar 6, 2026
Long-context inference requests (100K+ tokens) can take well over 5 minutes for prefill alone. With the 300s timeout, nginx closes the connection before vLLM finishes processing, causing "upstream timed out" errors and dropping in-flight SSE streams. Increase to 3600s across all CVM configs.
Reasoning content intermittently leaks into the content field in streaming mode when relying on auto-detection. Explicitly enable --enable-reasoning and --reasoning-parser openai_gptoss per vLLM recommendations. See: vllm-project/vllm#32125
Keep this PR focused on proxy_read_timeout only. Image update and reasoning parser will be in separate PRs.
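As a rough illustration of the change described above, the directive might sit in a proxy block like the one below. This is a minimal sketch: the listen port, upstream address, and location path are assumptions and are not taken from the CVM configs. Note that proxy_read_timeout limits the gap between two successive reads from the upstream rather than the total response time, which is why a long prefill that sends no bytes for more than 300s trips the old value.

```nginx
# Hypothetical server block; the listen port, upstream address, and
# location path are assumptions, not copied from the CVM configs.
server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:8000;   # assumed vLLM address
        proxy_http_version 1.1;
        proxy_buffering off;                # pass SSE chunks through as they arrive

        # Maximum gap between two successive reads from the upstream,
        # not a cap on total response time. Previously 300s.
        proxy_read_timeout 3600s;
    }
}
```

Configs that expose both an HTTP and an HTTPS server block would need the directive raised in each one.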
Summary
- Increase proxy_read_timeout from 300s to 3600s across all CVM nginx configs
- With the 300s timeout, long-context requests fail with upstream timed out (110: Operation timed out) before vLLM finishes processing

Affected configs
- small-models.yaml (1 occurrence)
- Qwen3.5-122B.yaml (1 occurrence)
- GLM-5.yaml (2 occurrences: HTTP + HTTPS)
- GLM-4.7.yaml (2 occurrences: HTTP + HTTPS)
- DeepSeek-V3.1.yaml (2 occurrences: HTTP + HTTPS)

Test plan