I configured SR with two models and ran a benchmark.
Some prompts (from MMLU-Pro) work fine: I can send a request and get an answer. For other prompts, however, the request fails with the error "no healthy upstream", and the envoy log shows response code 503.
models:
- vllm serve microsoft/Phi-3-mini-4k-instruct --port=8002 --host 127.0.0.1 --max-model-len 8192 --enforce-eager --served-model-name phi3
- vllm serve openai/gpt-oss-20b --port=8003 --host 0.0.0.0 --served-model-name gpt-oss --max-model-len 8192 --enforce-eager
Example of a failing request:
curl -X POST http://localhost:8801/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "MoM",
"messages": [
{"role": "user", "content": "who are you"}
]
}'
Example of a working request:
The same request succeeds if I add a question mark to the prompt above ("who are you?").
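To help isolate whether the 503 comes from envoy/the router or from one of the backends, one option is to replay the failing request directly against each vllm server, bypassing the router. This is only a sketch: the ports and served model names are taken from the serve commands above and may need adjusting for your setup.

```python
import json
import urllib.request

# Ports and model names assumed from the vllm serve commands above.
BACKENDS = {
    "phi3": "http://127.0.0.1:8002",
    "gpt-oss": "http://127.0.0.1:8003",
}

def build_payload(model: str, prompt: str) -> dict:
    """Build the same chat-completion body that the failing curl request sends."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def probe(base_url: str, model: str, prompt: str) -> int:
    """Send the request straight to one vllm backend, skipping envoy/the router.

    A 200 here while the router returns 503 points at the routing layer
    rather than at the model server itself.
    """
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.status

if __name__ == "__main__":
    for model, url in BACKENDS.items():
        try:
            print(model, probe(url, model, "who are you"))
        except Exception as exc:  # connection refused, timeout, HTTP error, ...
            print(model, "failed:", exc)
```

If both backends answer the problematic prompt directly, the next place to look is the router's model-selection/health-check logic rather than the vllm servers.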
config.yaml
fail_envoy_log.txt
fail_router_log.txt
How can I debug this? The config and logs are attached above.
Thank you