Bug Title
Codex disconnected from the vLLM server during a tool call and did not process the server response
Description
You can see the explain-code request being dropped even though we see the response from vLLM (10-03 18:57:08); open-responses-server does process the request at 2025-10-03 18:57:00,971. Not sure what happened.
Steps to Reproduce
Set up vLLM v0.10.2 (Windows/Linux) and run it with this command:
vllm serve openai/gpt-oss-20b --max-model-len 130k --served-model-name gpt-oss-20b
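Before starting the proxy, you can confirm vLLM itself is reachable. A minimal sketch in Python (assuming the default port 8000 and the standard OpenAI-compatible /v1/models route; requires the requests package):

import requests

# List the models served by vLLM directly on its default port (8000)
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])  # expect ["gpt-oss-20b"]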
Start the oct server (the default vLLM port is 8000, and oct's default external port is 8080):
oct start
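To check that oct is actually reaching vLLM, the same request can be sent through the proxy port. This is only a sketch and assumes oct passes the /v1/models route through to the backend:

import requests

# Same models request, but through the oct proxy on port 8080
resp = requests.get("http://localhost:8080/v1/models", timeout=10)
resp.raise_for_status()
# If the route is forwarded, this should match the vLLM output above
print([m["id"] for m in resp.json()["data"]])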
Set up Codex with this configuration:
[model_providers.vllm]
name = "VLLM"
base_url = "http://localhost:8080/v1"
[profiles.gpt-oss-20b-vllm]
model_provider = "vllm"
model = "gpt-oss-20b"
Start Codex with this command:
codex --profile gpt-oss-20b-vllm
Start it in any code directory and ask Codex to explain some code, for example:
explain COBSWebSerial.py
Sometimes it will process the request without dropping the tool call, but most of the time it will drop it. I am not sure how to debug this; please advise.
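One way to isolate whether the drop happens in Codex or in the proxy is to stream a similar request directly against the oct endpoint and watch whether the stream finishes. A minimal sketch in Python, assuming oct exposes an OpenAI-compatible /v1/responses route on port 8080 and the openai package is installed (the API key is a placeholder since vLLM does not check it; the read_file tool below is hypothetical, included only to coax the model into emitting a tool call):

from openai import OpenAI

# Talk to the oct proxy the same way Codex does, but print every stream event,
# so you can see whether the stream ends cleanly or is cut off mid tool call.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Hypothetical function tool, only there to provoke a tool call.
tools = [{
    "type": "function",
    "name": "read_file",
    "description": "Read a file from the workspace and return its contents.",
    "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]

stream = client.responses.create(
    model="gpt-oss-20b",
    input="Explain COBSWebSerial.py",
    tools=tools,
    stream=True,
)

# A healthy run should end with a "response.completed" event; if the connection
# is dropped during the tool call, this loop ends early or raises.
for event in stream:
    print(event.type)

If the event stream completes here but Codex still jumps to the next prompt, the problem is more likely on the Codex side of the connection.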
Expected Behavior
We expect Codex to wait for the response from the vLLM server, not just jump to the next prompt.
Actual Behavior
Codex did not wait for the tool call to finish; it just jumped to the next prompt input.