
Increase nginx proxy_read_timeout from 300s to 3600s#1

Merged
Evrard-Nil merged 4 commits into main from
fix/increase-proxy-read-timeout
Mar 6, 2026

Conversation

@Evrard-Nil
Contributor

Summary

  • Increase `proxy_read_timeout` from 300s to 3600s across all CVM nginx configs
  • Fixes timeout errors for long-context inference requests (100K+ tokens), where prefill alone can exceed 5 minutes
  • Root cause: nginx closes the SSE stream with `upstream timed out (110: Operation timed out)` before vLLM finishes processing

Affected configs

  • small-models.yaml (1 occurrence)
  • Qwen3.5-122B.yaml (1 occurrence)
  • GLM-5.yaml (2 occurrences — HTTP + HTTPS)
  • GLM-4.7.yaml (2 occurrences — HTTP + HTTPS)
  • DeepSeek-V3.1.yaml (2 occurrences — HTTP + HTTPS)

Test plan

  • Validated all YAML configs pass pre-commit hook
  • Redeploy nginx on small-models host and verify long-context requests complete
  • Run benchmark with 100K token prompts at high concurrency to confirm no timeouts
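The long-context verification step can be sketched as a streaming request that must stay open past the old 300s limit (hostname and payload file are placeholders):

```shell
# Placeholder host and payload; -N disables curl's output buffering so SSE
# chunks are printed as they arrive.
curl -N --max-time 3700 \
  -H "Content-Type: application/json" \
  -d @long_context_100k.json \
  https://small-models.example.com/v1/chat/completions
```

Before this change, such a request would die with `upstream timed out (110)` around the 300s mark; after it, the stream should run to completion.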

Long-context inference requests (100K+ tokens) can take well over 5 minutes
for prefill alone. With the 300s timeout, nginx closes the connection before
vLLM finishes processing, causing "upstream timed out" errors and dropping
in-flight SSE streams. Increase to 3600s across all CVM configs.

Reasoning content intermittently leaks into the content field in streaming
mode when relying on auto-detection. Explicitly enable `--enable-reasoning`
and `--reasoning-parser openai_gptoss` per vLLM recommendations.

See: vllm-project/vllm#32125
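For context, the reasoning-parser fix deferred to a follow-up PR would amount to adding the two flags above to the vLLM launch command. A sketch only; the model name is a placeholder and the exact invocation is not taken from this PR:

```shell
# Follow-up (separate PR): name the parser explicitly instead of relying on
# auto-detection, so reasoning tokens stop leaking into the content field.
vllm serve <model> \
  --enable-reasoning \
  --reasoning-parser openai_gptoss
```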

Keep this PR focused on `proxy_read_timeout` only; the image update and
reasoning-parser change will land in separate PRs.
Evrard-Nil merged commit 45c047b into main on Mar 6, 2026
2 checks passed