
Router fails for some prompts and works for others #683

Description

@shira-g

I configured SR with two models and ran a benchmark.
Some prompts (from MMLU-Pro) work fine: I send a request and get an answer. For other prompts, however, the request fails with the error "no healthy upstream", and the Envoy log shows a 503 response code.

models:

  1. vllm serve microsoft/Phi-3-mini-4k-instruct --port=8002 --host 127.0.0.1 --max-model-len 8192 --enforce-eager --served-model-name phi3
  2. vllm serve openai/gpt-oss-20b --port=8003 --host 0.0.0.0 --served-model-name gpt-oss --max-model-len 8192 --enforce-eager

Example of a failing request:
curl -X POST http://localhost:8801/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "MoM",
  "messages": [
    {"role": "user", "content": "who are you"}
  ]
}'

Example of a working request:
The same request with a question mark added to the prompt ("who are you?").
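One way to narrow this down (a sketch, not something from the issue itself; the ports and served model names are taken from the `vllm serve` commands above) is to replay the failing prompt against each vLLM backend directly, bypassing the router and Envoy. If both backends answer, the 503 comes from the routing layer; if one is unreachable, that upstream really is unhealthy:

```python
import json
import urllib.request
import urllib.error

# Backends as configured in the vllm serve commands above.
# Note phi3 is bound to 127.0.0.1, so it is only reachable from the
# host the router runs on; gpt-oss listens on 0.0.0.0.
BACKENDS = {
    "phi3": "http://127.0.0.1:8002",
    "gpt-oss": "http://127.0.0.1:8003",
}

def build_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same chat-completions request the router would forward."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def probe(prompt: str) -> None:
    """Send the prompt to every backend and report what each one returns."""
    for model, base_url in BACKENDS.items():
        req = build_request(base_url, model, prompt)
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                print(f"{model}: HTTP {resp.status}")
        except urllib.error.HTTPError as e:
            print(f"{model}: HTTP {e.code} ({e.reason})")
        except urllib.error.URLError as e:
            print(f"{model}: unreachable ({e.reason})")

if __name__ == "__main__":
    probe("who are you")  # the prompt that fails through the router
```

If both backends return 200 for "who are you", the failure is in the router's model selection or in Envoy's cluster health, so the next place to look would be the router log entry for that prompt.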

config.yaml
fail_envoy_log.txt
fail_router_log.txt

How can I debug this? I am attaching the logs above.

Thank you
