@@ -35,11 +35,12 @@ python3 deploy/openshift/demo/demo-semantic-router.py

 **Features:**

-1. **Single Classification** - Tests random prompt from golden set
+1. **Single Classification** - Tests cache with same prompt (fast repeated runs)
 2. **All Classifications** - Tests all 10 golden prompts
-3. **PII Detection Test** - Tests personal information filtering
-4. **Jailbreak Detection Test** - Tests security filtering
-5. **Run All Tests** - Executes all tests sequentially
+3. **Reasoning Showcase** - Chain-of-Thought vs Standard routing
+4. **PII Detection Test** - Tests personal information filtering
+5. **Jailbreak Detection Test** - Tests security filtering
+6. **Run All Tests** - Executes all tests sequentially

 **Requirements:**

@@ -55,6 +56,55 @@ python3 deploy/openshift/demo/demo-semantic-router.py

 ---

+### 3. Distributed Tracing with Jaeger
+
+Visualize the complete request flow with distributed tracing:
+
+#### Deploy Jaeger
+
+```bash
+./deploy/openshift/demo/deploy-jaeger.sh
+```
+
+This deploys Jaeger all-in-one with:
+
+- 📊 **Jaeger UI** for visualizing traces
+- 🔗 **OTLP collector** (gRPC and HTTP)
+- 💾 **In-memory storage** (demo-friendly)
+
+#### Enable/Disable Tracing
+
+```bash
+# Enable tracing
+./deploy/openshift/demo/toggle-tracing.sh enable
+
+# Disable tracing
+./deploy/openshift/demo/toggle-tracing.sh disable
+
+# Check status
+./deploy/openshift/demo/toggle-tracing.sh status
+```
+
+#### What You'll See in Jaeger
+
+After enabling tracing and running some requests:
+
+1. Open Jaeger UI (URL shown by toggle-tracing.sh status)
+2. Select service: **vllm-semantic-router**
+3. Click **Find Traces**
+4. Click on a trace to see:
+   - 📥 Request ingress through Envoy
+   - 🔄 ExtProc classification pipeline
+   - 🛡️ Security checks (jailbreak, PII)
+   - 🎯 Category classification
+   - 🧭 Model routing decisions
+   - 💾 Cache hits/misses
+   - ⏱️ End-to-end latency breakdown
+
+**Tip:** Run some requests with `./deploy/openshift/demo/curl-examples.sh all` to generate multiple traces!
+
+---
+
 ## Demo Flow Suggestion

 ### Setup (Before Demo)
@@ -67,41 +117,57 @@ python3 deploy/openshift/demo/demo-semantic-router.py
 # (don't run yet)

 # Browser Tab 1: Open Grafana
-# http://grafana-vllm-semantic-router-system.apps.cluster-pbd96.pbd96.sandbox5333.opentlc.com
+# http://grafana-vllm-semantic-router-system.apps.cluster-xxx.opentlc.com

-# Browser Tab 2: Open OpenWebUI
-# http://openwebui-vllm-semantic-router-system.apps.cluster-pbd96.pbd96.sandbox5333.opentlc.com
+# Browser Tab 2: Open Jaeger (if tracing enabled)
+# http://jaeger-vllm-semantic-router-system.apps.cluster-xxx.opentlc.com
+
+# Browser Tab 3: Open Flow Visualization
+# http://flow-visualization-vllm-semantic-router-system.apps.cluster-xxx.opentlc.com
+
+# Browser Tab 4: Open OpenWebUI
+# http://openwebui-vllm-semantic-router-system.apps.cluster-xxx.opentlc.com
 ```

 ### During Demo

 1. **Show the system overview**
+   - Open Flow Visualization (Browser Tab 3)
+   - Click "Start Animation" to show request flow
    - Explain semantic routing concept
-   - Show the architecture diagram

 2. **Run interactive demo** (Terminal 2)

    ```bash
    python3 deploy/openshift/demo/demo-semantic-router.py
    ```

-   - Choose option 2 (All Classifications)
+   - Choose option 3 (Reasoning Showcase) to demonstrate CoT
+   - Then option 2 (All Classifications)

 3. **Point to live logs** (Terminal 1)
    - Show real-time classification
    - Highlight security checks (jailbreak: BENIGN)
    - Show routing decisions (Model-A vs Model-B)
-   - Point out cache hits
+   - Point out cache hits and reasoning mode activation

 4. **Switch to Grafana** (Browser Tab 1)
    - Show request metrics appearing
    - Show classification category distribution
    - Show model usage breakdown

-5. **Show OpenWebUI integration** (Browser Tab 2)
+5. **Show Jaeger traces** (Browser Tab 2) - *Optional but impressive!*
+   - Select service: vllm-semantic-router
+   - Click "Find Traces"
+   - Click on a trace to show:
+     - Full request flow timeline
+     - Security checks, classification, routing
+     - Latency breakdown per step
+
+6. **Show OpenWebUI integration** (Browser Tab 4)
    - Type one of the golden prompts
    - Watch it appear in logs (Terminal 1)
-   - Show the same routing happening
+   - Check the trace in Jaeger (Browser Tab 2)

 ---

@@ -135,7 +201,16 @@ python3 deploy/openshift/demo/demo-semantic-router.py

 - Real-time logs with structured JSON
 - Grafana metrics and dashboards
-- Request tracing and debugging
+- **Distributed tracing** with Jaeger (OpenTelemetry)
+- End-to-end request flow visualization
+- Per-span latency breakdown
+
+### Reasoning Capabilities
+
+- **Chain-of-Thought (CoT)** for complex problems
+- Enabled for math, chemistry, physics
+- Standard routing for factual queries
+- Automatic reasoning mode detection

 ---

@@ -231,6 +306,8 @@ Restarts semantic-router deployment to clear in-memory cache (~30 seconds).
 - `cache-management.sh` - Cache management helper
 - `flow-visualization.html` - **Interactive flow visualization** (open in browser)
 - `deploy-flow-viz.sh` - Deploy flow visualization to OpenShift
+- `deploy-jaeger.sh` - Deploy Jaeger distributed tracing
+- `toggle-tracing.sh` - Enable/disable tracing in semantic-router
 - `CATEGORY-MODEL-MAPPING.md` - Category to model routing reference
 - `demo-classification-results.json` - Test results (auto-generated)

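Note on the setup block in this diff: the four browser-tab URLs now use a `cluster-xxx` placeholder, so presenters have to substitute their own cluster domain by hand. Below is a minimal sketch of how those URLs could be generated, assuming the routes follow the `<route>-<namespace>.<apps domain>` pattern visible in the README lines above. The names `demo_urls`, `NAMESPACE`, and `TABS` are hypothetical and are not part of the repository.

```python
# Hypothetical helper, not part of the repository: builds the demo tab URLs
# from the apps domain of the target cluster.

# Namespace and route names are copied from the URLs in the README diff.
NAMESPACE = "vllm-semantic-router-system"
TABS = ["grafana", "jaeger", "flow-visualization", "openwebui"]  # Browser Tabs 1-4


def demo_urls(apps_domain: str) -> dict[str, str]:
    """Return {route_name: url} for the four demo browser tabs.

    Assumes each route host is "<route>-<namespace>.<apps_domain>",
    which is the pattern shown in the README, not a verified cluster setting.
    """
    domain = apps_domain.strip().lstrip(".")
    if not domain:
        raise ValueError("apps_domain must not be empty")
    return {name: f"http://{name}-{NAMESPACE}.{domain}" for name in TABS}


if __name__ == "__main__":
    # "apps.cluster-xxx.opentlc.com" is the placeholder used in the README.
    for tab, (name, url) in enumerate(demo_urls("apps.cluster-xxx.opentlc.com").items(), 1):
        print(f"Browser Tab {tab}: {name:20s} {url}")
```

The tab order matches the setup block (Grafana, Jaeger, Flow Visualization, OpenWebUI). The Jaeger URL reported by `toggle-tracing.sh status` remains the authoritative one if it differs from the pattern assumed here.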