@@ -13,6 +13,7 @@ Shows real-time classification, routing, and security decisions:
1313```
1414
1515** What it shows:**
16+
1617- 📨 ** Incoming requests** with user prompts
1718- 🛡️ ** Security checks** (jailbreak detection)
1819- 🔍 ** Classification** (category detection with confidence)
@@ -33,17 +34,20 @@ python3 deploy/openshift/demo/demo-semantic-router.py
3334```
3435
3536** Features:**
37+
36381 . ** Single Classification** - Tests random prompt from golden set
37392 . ** All Classifications** - Tests all 10 golden prompts
38403 . ** PII Detection Test** - Tests personal information filtering
39414 . ** Jailbreak Detection Test** - Tests security filtering
40425 . ** Run All Tests** - Executes all tests sequentially
4143
4244** Requirements:**
45+
4346- ✅ Must be logged into OpenShift (` oc login ` )
4447- URLs are discovered automatically from routes
4548
4649** What it does:**
50+
4751- Goes through Envoy (same path as OpenWebUI)
4852- Shows routing decisions and response previews
4953- ** Appears in Grafana dashboard!**
@@ -76,9 +80,11 @@ python3 deploy/openshift/demo/demo-semantic-router.py
7680 - Show the architecture diagram
7781
78822 . ** Run interactive demo** (Terminal 2)
83+
7984 ``` bash
8085 python3 deploy/openshift/demo/demo-semantic-router.py
8186 ```
87+
8288 Choose option 2 (All Classifications)
8389
84903 . ** Point to live logs** (Terminal 1)
@@ -102,26 +108,31 @@ python3 deploy/openshift/demo/demo-semantic-router.py
102108## Key Talking Points
103109
104110### Classification Accuracy
111+
105112- ** 10 golden prompts** with 100% accuracy
106113- Categories: Chemistry, History, Psychology, Health, Math
107114- Shows consistent classification behavior
108115
109116### Security Features
117+
110118- ** Jailbreak detection** on every request
111119- Shows "BENIGN" for safe requests
112120- Confidence scores displayed
113121
114122### Smart Routing
123+
115124- Automatic model selection based on content
116125- Load balancing across Model-A and Model-B
117126- Routing decisions visible in logs
118127
119128### Performance
129+
120130- ** Semantic caching** reduces latency
121131- Cache hits shown in logs with similarity scores
122132- Sub-second response times
123133
124134### Observability
135+
125136- Real-time logs with structured JSON
126137- Grafana metrics and dashboards
127138- Request tracing and debugging
@@ -131,6 +142,7 @@ python3 deploy/openshift/demo/demo-semantic-router.py
131142## Troubleshooting
132143
133144### Log viewer shows no output
145+
134146``` bash
135147# Check if semantic-router pod is running
136148oc get pods -n vllm-semantic-router-system | grep semantic-router
@@ -140,6 +152,7 @@ oc logs -n vllm-semantic-router-system deployment/semantic-router --tail=20
140152```
141153
142154### Classification test fails
155+
143156``` bash
144157# Verify Envoy route is accessible
145158curl http://envoy-http-vllm-semantic-router-system.apps.cluster-pbd96.pbd96.sandbox5333.opentlc.com/v1/models
@@ -149,6 +162,7 @@ oc get pods -n vllm-semantic-router-system
149162```
150163
151164### Grafana doesn't show metrics
165+
152166- Wait 15-30 seconds for metrics to appear
153167- Refresh the dashboard
154168- Check the time range (last 5 minutes)
@@ -158,13 +172,15 @@ oc get pods -n vllm-semantic-router-system
158172## Cache Management
159173
160174### Check Cache Status
175+
161176``` bash
162177./deploy/openshift/demo/cache-management.sh status
163178```
164179
165180Shows recent cache activity and cached queries.
166181
167182### Clear Cache (for demo)
183+
168184``` bash
169185./deploy/openshift/demo/cache-management.sh clear
170186```
@@ -176,22 +192,27 @@ Restarts semantic-router deployment to clear in-memory cache (~30 seconds).
176192** Workflow to show caching in action:**
177193
1781941 . Clear the cache:
195+
179196 ``` bash
180197 ./deploy/openshift/demo/cache-management.sh clear
181198 ```
182199
1832002 . Run classification test (first time - no cache):
201+
184202 ``` bash
185203 python3 deploy/openshift/demo/demo-semantic-router.py
186204 ```
205+
187206 Choose option 2 (All Classifications)
188207 - Processing time: ~ 3-4 seconds per query
189208 - Logs show queries going to model
190209
1912103 . Run classification test again (second time - with cache):
211+
192212 ``` bash
193213 python3 deploy/openshift/demo/demo-semantic-router.py
194214 ```
215+
195216 Choose option 2 (All Classifications) again
196217 - Processing time: ~ 400ms per query (10x faster!)
197218 - Logs show "💾 CACHE HIT" for all queries
0 commit comments