feat: Modern Dashboard MVP #388
Conversation
✅ Deploy Preview for vllm-semantic-router ready!
Focusing on the o11y stack for the Local and Docker Compose paths at the moment. This still needs more iterations, so I’ll keep it as a draft for now. Apologies for the complexity and the number of files affected — I’m working hard to refine it and make the solution as elegant as possible. I’d really appreciate any suggestions or feedback from the community!
Cool! I think maybe the first priority is to make installing and configuring vLLM-SR easy? And then observability, like embedding the Grafana dashboard and Jaeger tracing.
@Xunzhuo Thanks! Agreed — install and config are key. I’ve worked on observability for a while and made it easier with a quick MVP dashboard. Config in particular is complex and requires more careful iteration, but I’ll keep working on it.
Cool, I need to point out that the dashboard for vLLM-SR is not something like a Grafana dashboard; the goal is to build the admin console for managing vLLM-SR.
But no worries, keep moving forward, nice work!
Got it!
Take this as an example :) URL: https://www.demo.litellm.ai/ui
Inspiring! This will completely change the UX.
Yep, keep doing your magic 🪄
Can you share a screenshot?
@Xunzhuo, I have a little problem using openwebui-pipe in the Dashboard. I can’t find a + or import button right here.
The screenshot in the PR description is updated.
You need to install the Open WebUI pipelines service and add the pipeline address.
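For reference, a minimal sketch of that setup (the image tag, port, and default API key are taken from the Open WebUI Pipelines README at the time of writing; double-check them against the current docs):
docker run -d -p 9099:9099 \
  --add-host=host.docker.internal:host-gateway \
  -v pipelines:/app/pipelines \
  --name pipelines --restart always \
  ghcr.io/open-webui/pipelines:main
Then, in Open WebUI, go to Admin Panel > Settings > Connections and add an OpenAI-style connection pointing at http://localhost:9099 (default API key: 0p3n-w3bu!). The pipe should show up after that.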
Will work on it.
Can you check the logs of semantic-router?
(base) jared@Jared:~/vllm-project/semantic-router$ curl -v http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hi! What is 2 + 2?"}],
    "stream": false
  }'
* Uses proxy env variable no_proxy == '172.31.*,172.30.*,172.29.*,172.28.*,172.27.*,172.26.*,172.25.*,172.24.*,172.23.*,172.22.*,172.21.*,172.20.*,172.19.*,172.18.*,172.17.*,172.16.*,10.*,192.168.*,127.*,localhost,<local>'
* Trying 127.0.0.1:11434...
* Connected to localhost (127.0.0.1) port 11434 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:11434
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< date: Sat, 11 Oct 2025 08:46:49 GMT
< server: uvicorn
< content-length: 597
< content-type: application/json
<
* Connection #0 to host localhost left intact
{"id":"chatcmpl-f0db6df8b3de4bad92f80f38a81a4a9d","object":"chat.completion","created":1760172414,"model":"phi4","choices":[{"index":0,"message":{"role":"assistant","content":"2 + 2 equals 4.","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":17,"total_tokens":26,"completion_tokens":9,"prompt_tokens_details":null},"prompt_logprobs":null," (base) jared@Jared:~/vllm-project/semantic-router$ docker exec -it semantic-router sh -c 'curl -sS http://172.17.0.1:11434/health' docker exec -it semantic-router sh -c 'curl -sS http://172.17.0.1:11434/health'
curl: (7) Failed to connect to 172.17.0.1 port 11434: Connection refused (base) jared@Jared:~/vllm-project/semantic-router$ curl -v http://localhost:8801/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"auto","messages":[{"role":"user","content":"Hi! What is 2 + 2?"}],"stream":false}'
* Uses proxy env variable no_proxy == '172.31.*,172.30.*,172.29.*,172.28.*,172.27.*,172.26.*,172.25.*,172.24.*,172.23.*,172.22.*,172.21.*,172.20.*,172.19.*,172.18.*,172.17.*,172.16.*,10.*,192.168.*,127.*,localhost,<local>'
* Trying 127.0.0.1:8801...
* Connected to localhost (127.0.0.1) port 8801 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8801
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 91
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 503 Service Unavailable
< content-length: 167
< content-type: text/plain
< date: Sat, 11 Oct 2025 08:47:39 GMT
< server: envoy
<
* Connection #0 to host localhost left intact
upstream connect error or disconnect/reset before headers. reset reason: remote connection failure, transport failure reason: delayed con
(base) jared@Jared:~/vllm-project/semantic-router$ docker logs semantic-router 2>&1 | tail -30
{"level":"info","ts":"2025-10-11T08:40:03.065611652Z","caller":"observability/logging.go:140","msg":"Starting insecure LLM Router ExtProc server on port 50051..."}
{"level":"info","ts":"2025-10-11T08:40:03.065725968Z","caller":"observability/logging.go:140","msg":"Found global classification service on attempt 1/5"}
{"level":"info","ts":"2025-10-11T08:40:03.066670389Z","caller":"observability/logging.go:140","msg":"System prompt configuration endpoints disabled for security"}
{"level":"info","ts":"2025-10-11T08:40:03.066646981Z","caller":"observability/logging.go:136","msg":"config_watcher_error","stage":"create_watcher","error":"too many open files","event":"config_watcher_error"}
{"level":"info","ts":"2025-10-11T08:40:03.066725679Z","caller":"observability/logging.go:140","msg":"Classification API server listening on port 8080"}
{"level":"info","ts":"2025-10-11T08:40:24.532769936Z","caller":"observability/logging.go:140","msg":"Started processing a new request"}
{"level":"info","ts":"2025-10-11T08:40:24.536033144Z","caller":"observability/logging.go:140","msg":"Received request headers"}
{"level":"info","ts":"2025-10-11T08:40:24.537751317Z","caller":"observability/logging.go:140","msg":"Received request body {\n \"model\": \"auto\",\n \"messages\": [{\"role\": \"user\", \"content\": \"Hi! What is 2 + 2?\"}],\n \"stream\": false\n }"}
{"level":"info","ts":"2025-10-11T08:40:24.538240988Z","caller":"observability/logging.go:140","msg":"Original model: auto"}
{"level":"info","ts":"2025-10-11T08:40:24.83565221Z","caller":"observability/logging.go:140","msg":"Jailbreak classification result: {0 0.9999995}"}
{"level":"info","ts":"2025-10-11T08:40:24.835747565Z","caller":"observability/logging.go:140","msg":"BENIGN: 'benign' (confidence: 1.000, threshold: 0.700)"}
{"level":"info","ts":"2025-10-11T08:40:24.835761373Z","caller":"observability/logging.go:140","msg":"No jailbreak detected in request content"}
{"level":"info","ts":"2025-10-11T08:40:25.402714953Z","caller":"observability/logging.go:140","msg":"Using Auto Model Selection"}
{"level":"info","ts":"2025-10-11T08:40:25.715911914Z","caller":"observability/logging.go:140","msg":"Classification result: class=9, confidence=0.9579"}
{"level":"info","ts":"2025-10-11T08:40:25.715991793Z","caller":"observability/logging.go:140","msg":"Classified as category: math (mmlu=math)"}
{"level":"info","ts":"2025-10-11T08:40:25.716009374Z","caller":"observability/logging.go:140","msg":"Selected model phi4 for category math with score 0.6000"}
{"level":"info","ts":"2025-10-11T08:40:25.988612972Z","caller":"observability/logging.go:140","msg":"Classification result: class=9, confidence=0.9579"}
{"level":"info","ts":"2025-10-11T08:40:25.98871329Z","caller":"observability/logging.go:140","msg":"Classified as category: math (mmlu=math)"}
{"level":"info","ts":"2025-10-11T08:40:25.988725742Z","caller":"observability/logging.go:140","msg":"No PII policy found for model phi4, allowing request"}
{"level":"info","ts":"2025-10-11T08:40:25.988733235Z","caller":"observability/logging.go:140","msg":"Routing to model: phi4"}
{"level":"info","ts":"2025-10-11T08:40:26.234778194Z","caller":"observability/logging.go:140","msg":"Classification result: class=9, confidence=0.9579, entropy_available=true"}
{"level":"info","ts":"2025-10-11T08:40:26.234937043Z","caller":"observability/logging.go:140","msg":"Classified as category: math (mmlu=math), reasoning_decision: use=true, confidence=0.910, reason=very_low_uncertainty_trust_classification"}
{"level":"info","ts":"2025-10-11T08:40:26.234957925Z","caller":"observability/logging.go:140","msg":"Entropy-based reasoning decision: category='math', confidence=0.958, use_reasoning=true, reason=very_low_uncertainty_trust_classification, strategy=trust_top_category"}
{"level":"info","ts":"2025-10-11T08:40:26.234977547Z","caller":"observability/logging.go:140","msg":"Top predicted categories: [{math 0.9578654} {chemistry 0.011327983} {psychology 0.007727393}]"}
{"level":"info","ts":"2025-10-11T08:40:26.234989112Z","caller":"observability/logging.go:140","msg":"Entropy-based reasoning decision for this query: true on [phi4] model (confidence: 0.910, reason: very_low_uncertainty_trust_classification)"}
{"level":"info","ts":"2025-10-11T08:40:26.235015167Z","caller":"observability/logging.go:140","msg":"Selected endpoint address: 172.17.0.1:11434 for model: phi4"}
{"level":"info","ts":"2025-10-11T08:40:26.238834164Z","caller":"observability/logging.go:140","msg":"No reasoning support for model: phi4 (no reasoning family configured)"}
{"level":"info","ts":"2025-10-11T08:40:26.238939029Z","caller":"observability/logging.go:140","msg":"Use new model: phi4"}
{"level":"info","ts":"2025-10-11T08:40:26.239001262Z","caller":"observability/logging.go:136","msg":"routing_decision","reasoning_effort":"high","event":"routing_decision","request_id":"4bc3c450-065b-4cbe-bc21-9888ea6bb84f","selected_model":"phi4","category":"math","selected_endpoint":"172.17.0.1:11434","routing_latency_ms":1701,"reason_code":"auto_routing","original_model":"auto","reasoning_enabled":true}
{"level":"info","ts":"2025-10-11T08:40:26.24164372Z","caller":"observability/logging.go:140","msg":"Stream ended gracefully"} I can access the vLLM port directly, but I cannot do so through semantic-router in Docker Compose. |
Nope, it is probably a Docker network issue. As your debugging shows, the container cannot access 172.17.0.1:11434; make sure the Docker network is configured properly.
Try to make sure the container can access the external IP, e.g. by running curl manually from inside the container. Once that passes, Envoy should be able to reach it as well.
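A minimal sketch of that check (the container name and upstream address come from the logs above; the 0.0.0.0 bind and the extra_hosts snippet are assumptions meant to illustrate common fixes, not this repo’s actual compose file):
# 1. Verify the upstream is reachable from inside the semantic-router container
docker exec -it semantic-router sh -c 'curl -sS http://172.17.0.1:11434/health'
# 2. If the connection is refused, check on the host whether the server only listens on loopback
ss -ltnp | grep 11434    # 127.0.0.1:11434 means containers cannot reach it via 172.17.0.1
# 3. Either start the upstream bound to 0.0.0.0, or map the host gateway into the container:
#    services:
#      semantic-router:
#        extra_hosts:
#          - "host.docker.internal:host-gateway"
#    ...and point the model endpoint at host.docker.internal:11434 instead of 172.17.0.1.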
Configuration Viewer implemented. The MVP version based on Docker Compose is roughly finished. I’ll update the troubleshooting docs when I figure out the Docker network problem.
Wonderful!
Can we separate the configuration into different sub-pages?
Then you will get a lot of cool columns on the left side.
Does Observability here mean Monitoring?
Nope, it is one related configuration; Monitoring is the dashboard for Grafana and tracing (later).
In this PR the separated configuration is read-only; later we need to support editing it.
Got it now.
Let us 🚀🚀
Thanks! 😃
@JaredforReal do you have Slack?
Yes.
@JaredforReal this is really cool! Thanks for making this happen so quickly! As a follow-up, would you mind adding an auth factory to support additional auth methods?
@rootfs got it.
Sorry to be chatty :D It would be great to be able to customize the system prompt injection in the UI too.
(screenshot of the system prompt field in Open WebUI)
Yes @JaredforReal, the injection you saw in Open WebUI is a user-defined system prompt. What vLLM-SR offers is automatic injection based on the domain and intent, and it is configurable; that is what I mentioned above: all of the configuration should be editable in the console, not just viewable.
In the config:
# Categories with system prompts for different domains
categories:
  - name: math
    description: "Mathematical queries, calculations, and problem solving"
    system_prompt: "You are a mathematics expert. Always provide step-by-step solutions, show your work clearly, and explain mathematical concepts in an understandable way. When solving equations, break down each step and explain the reasoning behind it."
    model_scores:
      - model: openai/gpt-oss-20b
        score: 0.9
        use_reasoning: true
  - name: computer science
    description: "Programming, algorithms, software engineering, and technical topics"
    system_prompt: "You are a computer science expert with deep knowledge of algorithms, data structures, programming languages, and software engineering best practices. Provide clear, practical solutions with well-commented code examples when helpful. Always consider performance, readability, and maintainability."
    model_scores:
      - model: openai/gpt-oss-20b
        score: 0.8
        use_reasoning: true

I got what you said. It’s really a good idea, I will work on it! @Xunzhuo @rootfs
What type of PR is this?
feat: Modern Dashboard MVP
What this PR does / why we need it:
make docker-compose-up to start the full stack (semantic-router + envoy + grafana + prometheus + dashboard + openwebui).
Which issue(s) this PR fixes:
Fixes #325
Current Progress: Not functional yet.
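For reference, a minimal docker-compose sketch of the stack described above (service names, images, and port mappings are illustrative assumptions, not this PR’s actual compose file; 50051/8080 come from the router logs and 8801 from the Envoy curl test earlier in the thread):
services:
  semantic-router:
    image: ghcr.io/vllm-project/semantic-router:latest   # assumed image name
    ports: ["50051:50051", "8080:8080"]                   # ExtProc gRPC + Classification API
  envoy:
    image: envoyproxy/envoy:v1.31-latest
    ports: ["8801:8801"]                                  # OpenAI-compatible entrypoint
  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]
  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
  dashboard:
    build: ./dashboard                                    # assumed; built from this PR
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    ports: ["3001:8080"]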


