Commit d161686

Merge branch 'main' into fix/response-model-field-incorrect-clean
2 parents: d750fee + 058541e (commit d161686)

39 files changed, +5433 -289 lines changed

README.md

Lines changed: 21 additions & 8 deletions
@@ -8,7 +8,7 @@
 [![Crates.io](https://img.shields.io/crates/v/candle-semantic-router.svg)](https://crates.io/crates/candle-semantic-router)
 ![Test And Build](https://github.com/vllm-project/semantic-router/workflows/Test%20And%20Build/badge.svg)

-**📚 [Complete Documentation](https://vllm-semantic-router.com) | 🚀 [Quick Start](https://vllm-semantic-router.com/docs/installation) | 📣 [Blog](https://vllm-semantic-router.com/blog/) | 📖 [API Reference](https://vllm-semantic-router.com/docs/api/router/)**
+**📚 [Complete Documentation](https://vllm-semantic-router.com) | 🚀 [Quick Start](https://vllm-semantic-router.com/docs/installation) | 📣 [Blog](https://vllm-semantic-router.com/blog/) | 📖 [Publications](https://vllm-semantic-router.com/publications/)**

 ![code](./website/static/img/code.png)

@@ -64,20 +64,33 @@ Cache the semantic representation of the prompt so as to reduce the number of pr

 ### Distributed Tracing 🔍

-Comprehensive observability with OpenTelemetry distributed tracing provides fine-grained visibility into the request processing pipeline:
-
-- **Request Flow Tracing**: Track requests through classification, security checks, caching, and routing
-- **Performance Analysis**: Identify bottlenecks with detailed timing for each operation
-- **Security Monitoring**: Trace PII detection and jailbreak prevention operations
-- **Routing Decisions**: Understand why specific models were selected
-- **OpenTelemetry Standard**: Industry-standard tracing with support for Jaeger, Tempo, and other OTLP backends
+Comprehensive observability with OpenTelemetry distributed tracing provides fine-grained visibility into the request processing pipeline.

 ### Open WebUI Integration 💬

 To view the ***Chain-Of-Thought*** of the vLLM-SR's decision-making process, we have integrated with Open WebUI.

 ![code](./website/static/img/chat.png)

+## Quick Start 🚀
+
+Get up and running in seconds with our interactive setup script:
+
+```bash
+bash ./scripts/quickstart.sh
+```
+
+This command will:
+
+- 🔍 Check all prerequisites automatically
+- 📦 Install HuggingFace CLI if needed
+- 📥 Download all required AI models (~1.5GB)
+- 🐳 Start all Docker services
+- ⏳ Wait for services to become healthy
+- 🌐 Show you all the endpoints and next steps
+
+For detailed installation and configuration instructions, see the [Complete Documentation](https://vllm-semantic-router.com/docs/installation/).
+
 ## Documentation 📖

 For comprehensive documentation including detailed setup instructions, architecture guides, and API references, visit:

config/config-mcp-classifier-example.yaml

Lines changed: 28 additions & 2 deletions
@@ -45,7 +45,19 @@ classifier:
 #
 # How it works:
 # 1. Router connects to MCP server at startup
-# 2. Calls 'list_categories' tool: MCP returns {"categories": ["business", "law", ...]}
+# 2. Calls 'list_categories' tool and MCP returns:
+# {
+# "categories": ["math", "science", "technology", "history", "general"],
+# "category_system_prompts": {
+# "math": "You are a mathematics expert. When answering math questions...",
+# "science": "You are a science expert. When answering science questions...",
+# "technology": "You are a technology expert..."
+# },
+# "category_descriptions": {
+# "math": "Mathematical and computational queries",
+# "science": "Scientific concepts and queries"
+# }
+# }
 # 3. For each request, calls 'classify_text' tool which returns:
 # {
 # "class": 3,
@@ -55,14 +67,28 @@ classifier:
 # }
 # 4. Router uses the model and reasoning settings from MCP response
 #
+# PER-CATEGORY SYSTEM PROMPT INJECTION:
+# - The MCP server provides SEPARATE system prompts for EACH category
+# - Each category gets its own specialized instructions and context
+# - The router stores these prompts and injects the appropriate one per query
+# - Use classifier.GetCategorySystemPrompt(categoryName) to retrieve for a specific category
+# - Examples:
+# * Math category: "You are a mathematics expert. Show step-by-step solutions..."
+# * Science category: "You are a science expert. Provide evidence-based answers..."
+# * Technology category: "You are a tech expert. Include practical code examples..."
+# - This allows domain-specific expertise per category
+#
 # BENEFITS:
 # - MCP server makes intelligent routing decisions per query
 # - No hardcoded routing rules needed in config
 # - MCP can adapt routing based on query complexity, content, etc.
-# - Centralized routing logic in MCP server
+# - Centralized routing logic and per-category system prompts in MCP server
+# - Category descriptions available for logging and debugging
+# - Domain-specific LLM behavior for each category
 #
 # FALLBACK:
 # - If MCP doesn't return model/use_reasoning, uses default_model below
+# - If MCP doesn't return category_system_prompts, router can use default prompts
 # - Can also add category-specific overrides here if needed
 #
 categories: []
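
Note on the config comments above: the per-category prompt flow (parse the `list_categories` response, pick the prompt for the classified category, fall back to a default when none is supplied) can be sketched in Go. This is a minimal sketch under stated assumptions, not the router's actual implementation. The struct `listCategoriesResponse` and the helpers `parseListCategories` and `systemPromptFor` are hypothetical names introduced for illustration; only the JSON keys come from the comment, and the real `classifier.GetCategorySystemPrompt` may have a different signature and behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listCategoriesResponse mirrors the JSON shape shown in the config comment.
// The Go type and field names are hypothetical; only the JSON keys are from the source.
type listCategoriesResponse struct {
	Categories            []string          `json:"categories"`
	CategorySystemPrompts map[string]string `json:"category_system_prompts"`
	CategoryDescriptions  map[string]string `json:"category_descriptions"`
}

// parseListCategories decodes a raw 'list_categories' tool result.
func parseListCategories(raw []byte) (listCategoriesResponse, error) {
	var resp listCategoriesResponse
	err := json.Unmarshal(raw, &resp)
	return resp, err
}

// systemPromptFor returns the prompt the MCP server supplied for a category.
// It falls back to defaultPrompt when the category has no (or an empty) prompt,
// which also covers a response that omits category_system_prompts entirely.
func systemPromptFor(resp listCategoriesResponse, category, defaultPrompt string) string {
	if p, ok := resp.CategorySystemPrompts[category]; ok && p != "" {
		return p
	}
	return defaultPrompt
}

func main() {
	// Example payload in the shape documented by the config comment.
	raw := []byte(`{
		"categories": ["math", "science", "history"],
		"category_system_prompts": {
			"math": "You are a mathematics expert.",
			"science": "You are a science expert."
		},
		"category_descriptions": {
			"math": "Mathematical and computational queries"
		}
	}`)

	resp, err := parseListCategories(raw)
	if err != nil {
		panic(err)
	}

	const fallback = "You are a helpful assistant." // assumed default prompt
	fmt.Println(systemPromptFor(resp, "math", fallback))    // prints the math prompt
	fmt.Println(systemPromptFor(resp, "history", fallback)) // no prompt supplied, prints the fallback
}
```

Keeping the fallback in the lookup helper means a server that omits `category_system_prompts` needs no special case at the call site, which matches the FALLBACK note in the comments.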

dashboard/backend/main.go

Lines changed: 4 additions & 0 deletions
@@ -161,6 +161,10 @@ func newReverseProxy(targetBase, stripPrefix string, forwardAuth bool) (*httputi
 		r.URL.Path = p
 		r.Host = targetURL.Host

+		// Set Origin header to match the target URL for Grafana embedding
+		// This is required for Grafana to accept the iframe embedding
+		r.Header.Set("Origin", targetURL.Scheme+"://"+targetURL.Host)
+
 		// Optionally forward Authorization header
 		if !forwardAuth {
 			r.Header.Del("Authorization")
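
Note on the hunk above: the Origin rewrite can be reproduced in isolation with the standard library's `httputil.ReverseProxy`. The sketch below is a reduced stand-in, not the dashboard's `newReverseProxy`: `newOriginRewritingProxy` and `rewrittenOrigin` are hypothetical helpers, the host names are made up, and prefix stripping and the `forwardAuth` handling are left out. It only shows that a browser-supplied `Origin` is replaced by the target's scheme and host before the request goes upstream.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newOriginRewritingProxy builds a single-host reverse proxy whose Director
// overwrites Host and Origin so they match the upstream target.
// Hypothetical helper; the real function in the diff takes more parameters.
func newOriginRewritingProxy(targetBase string) (*httputil.ReverseProxy, error) {
	targetURL, err := url.Parse(targetBase)
	if err != nil {
		return nil, err
	}
	proxy := httputil.NewSingleHostReverseProxy(targetURL)
	defaultDirector := proxy.Director
	proxy.Director = func(r *http.Request) {
		defaultDirector(r) // rewrites scheme, host, and path for the target
		r.Host = targetURL.Host
		// Same header assignment as the added line in the diff.
		r.Header.Set("Origin", targetURL.Scheme+"://"+targetURL.Host)
	}
	return proxy, nil
}

// rewrittenOrigin runs the Director on a synthetic request and returns the
// Origin header the upstream would receive. No network traffic is involved.
func rewrittenOrigin(targetBase, browserOrigin string) (string, error) {
	proxy, err := newOriginRewritingProxy(targetBase)
	if err != nil {
		return "", err
	}
	req := httptest.NewRequest(http.MethodGet, "http://dashboard.local/embedded/grafana/", nil)
	req.Header.Set("Origin", browserOrigin)
	proxy.Director(req)
	return req.Header.Get("Origin"), nil
}

func main() {
	// Both URLs are illustrative placeholders.
	origin, err := rewrittenOrigin("http://grafana.internal:3000", "http://dashboard.local")
	if err != nil {
		panic(err)
	}
	fmt.Println(origin) // prints "http://grafana.internal:3000"
}
```

Overwriting `Origin` unconditionally hides the real browser origin from the upstream, so a change like this is probably best limited to proxies that front trusted internal services.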

dashboard/frontend/src/components/Layout.module.css

Lines changed: 170 additions & 12 deletions
@@ -6,58 +6,127 @@
 }

 .sidebar {
-  width: 240px;
+  width: 260px;
   background-color: var(--color-bg-secondary);
   border-right: 1px solid var(--color-border);
   display: flex;
   flex-direction: column;
-  padding: 1rem 0.75rem;
-  gap: 1rem;
+  padding: 1.25rem 0.75rem;
+  gap: 1.5rem;
+  transition: width var(--transition-fast);
 }

-.brand {
+.sidebarCollapsed {
+  width: 70px;
+}
+
+.brandContainer {
   display: flex;
   align-items: center;
   gap: 0.5rem;
-  padding: 0 0.5rem;
+  padding: 0 0.25rem;
+  flex-direction: column;
+}
+
+.sidebarCollapsed .brandContainer {
+  flex-direction: column;
+  gap: 0.75rem;
+}
+
+.collapseButton {
+  display: flex;
+  align-items: center;
+  justify-content: center;
+  width: 32px;
+  height: 32px;
+  padding: 6px;
+  background: transparent;
+  border: none;
+  border-radius: var(--radius-md);
+  cursor: pointer;
+  color: var(--color-text-secondary);
+  transition: all var(--transition-fast);
+  flex-shrink: 0;
+}
+
+.collapseButton:hover {
+  background-color: var(--color-bg-tertiary);
+  color: var(--color-text);
+}
+
+.collapseButton svg {
+  width: 20px;
+  height: 20px;
+  flex-shrink: 0;
+}
+
+.brand {
+  display: flex;
+  align-items: center;
+  gap: 0.625rem;
+  padding: 0.25rem 0.375rem;
   text-decoration: none;
   border-radius: var(--radius-md);
   transition: background-color var(--transition-fast);
   cursor: pointer;
+  flex: 1;
+  min-width: 0;
+  width: 100%;
 }

 .brand:hover {
   background-color: var(--color-bg-tertiary);
 }

+.sidebarCollapsed .brand {
+  justify-content: center;
+  padding: 0.5rem;
+}
+
 .logo {
   width: 28px;
   height: 28px;
   object-fit: contain;
+  flex-shrink: 0;
 }

 .brandText {
-  font-size: 0.95rem;
+  font-size: 0.9375rem;
   font-weight: 600;
   color: var(--color-text);
+  white-space: nowrap;
+  overflow: hidden;
+  text-overflow: ellipsis;
+  flex: 1;
+  min-width: 0;
 }

 .nav {
   display: flex;
   flex-direction: column;
-  gap: 0.25rem;
+  gap: 0.5rem;
+  flex: 1;
+  overflow-y: auto;
+  overflow-x: hidden;
 }

 .navLink {
   display: flex;
   align-items: center;
-  gap: 0.5rem;
-  padding: 0.5rem 0.625rem;
+  gap: 0.75rem;
+  padding: 0.75rem 0.875rem;
   border-radius: var(--radius-md);
   color: var(--color-text-secondary);
-  font-size: 0.9rem;
+  font-size: 0.9375rem;
   font-weight: 500;
   transition: all var(--transition-fast);
+  text-decoration: none;
+  border: none;
+  background: transparent;
+  cursor: pointer;
+  text-align: left;
+  width: 100%;
+  white-space: nowrap;
 }

 .navLink:hover {
@@ -76,14 +145,26 @@
 }

 .navIcon {
-  font-size: 1rem;
+  font-size: 1.125rem;
   line-height: 1;
-  width: 1.25rem;
+  width: 1.375rem;
   text-align: center;
+  flex-shrink: 0;
 }

 .navText {
   white-space: nowrap;
+  overflow: hidden;
+  text-overflow: ellipsis;
+}
+
+.sidebarCollapsed .navLink {
+  justify-content: center;
+  padding: 0.75rem 0.5rem;
+}
+
+.sidebarCollapsed .navIcon {
+  width: auto;
 }

 /* Sub Navigation (Configuration sections) - Match parent nav style */
@@ -188,4 +269,81 @@
   flex-direction: column;
   overflow: hidden;
   min-height: 0;
+}
+
+.header {
+  background-color: var(--color-bg-secondary);
+  border-bottom: 1px solid var(--color-border);
+  padding: 0 1.5rem;
+  height: 60px;
+  display: flex;
+  align-items: center;
+  flex-shrink: 0;
+}
+
+.headerContent {
+  display: flex;
+  align-items: center;
+  justify-content: space-between;
+  width: 100%;
+}
+
+.headerLeft {
+  flex: 1;
+}
+
+.headerRight {
+  display: flex;
+  align-items: center;
+  gap: 1.5rem;
+}
+
+.headerLink {
+  color: var(--color-text-secondary);
+  text-decoration: none;
+  font-size: 0.9375rem;
+  font-weight: 500;
+  transition: color var(--transition-fast);
+  white-space: nowrap;
+}
+
+.headerLink:hover {
+  color: var(--color-text);
+}
+
+.mainContent {
+  flex: 1;
+  overflow: auto;
+  min-height: 0;
+}
+
+/* Responsive adjustments */
+@media (max-width: 768px) {
+  .sidebar {
+    width: 70px;
+  }
+
+  .brandText {
+    display: none;
+  }
+
+  .navText {
+    display: none;
+  }
+
+  .collapseButton {
+    display: none;
+  }
+
+  .sidebarFooter a {
+    display: none;
+  }
+
+  .header {
+    padding: 0 1rem;
+  }
+
+  .headerRight {
+    gap: 1rem;
+  }
 }