Commit af87950

fix: apply markdown linting fixes to demo documentation
Signed-off-by: Yossi Ovadia <[email protected]>
1 parent b830969 commit af87950

2 files changed: +22 -3 lines changed

deploy/openshift/demo/CATEGORY-MODEL-MAPPING.md

Lines changed: 1 addition & 3 deletions
@@ -51,7 +51,6 @@ These prompts have **100% classification accuracy** and route as follows:
 | **Health** | "How to maintain a healthy lifestyle?" | Model-B | ~0.221 |
 | **Health** | "What is a balanced diet?" | Model-B | ~0.268 |
 
-
 ---
 
 ## Reasoning Mode (Chain-of-Thought)
@@ -70,7 +69,6 @@ Categories with **reasoning enabled** use extended thinking for complex problems
 - **Fallback Category:** "other" (score: 0.7)
 - **Unmatched queries** route to Model-A with the "other" category system prompt
 
-
 ### Key Parameters:
 
 - **name:** Category identifier
@@ -81,7 +79,6 @@ Categories with **reasoning enabled** use extended thinking for complex problems
 
 ---
 
-
 ## Confidence Scores Explained
 
 **Why are confidence scores low (0.2-0.4)?**
@@ -92,6 +89,7 @@ Categories with **reasoning enabled** use extended thinking for complex problems
 4. **Highest score wins** - 0.326 for "math" means it beat all other 13 categories
 
 **What's important:**
+
 - ✅ Classification is **consistent** across multiple runs
 - ✅ Same prompt → same category every time
 - ✅ Confidence is **relative** to other categories, not absolute certainty
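
Reviewer note: the point that confidence is relative rather than absolute can be illustrated numerically. The sketch below assumes the classifier normalizes its 14 category scores with a softmax, which the diffed document does not confirm. The logit values and the `softmax` helper are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for 14 categories: one clear winner and 13 runners-up.
logits = [2.0] + [0.0] * 13
probs = softmax(logits)

# The winner is the category with the highest probability.
winner = max(range(len(probs)), key=probs.__getitem__)
print(f"winner index: {winner}, confidence: {probs[winner]:.3f}")
print(f"runner-up confidence: {probs[1]:.3f}")
```

Under these assumed numbers the winning category is several times more likely than any single alternative, yet its score stays well below 0.5 because the remaining probability mass is spread over 13 other categories.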

deploy/openshift/demo/DEMO-README.md

Lines changed: 21 additions & 0 deletions
@@ -13,6 +13,7 @@ Shows real-time classification, routing, and security decisions:
 ```
 
 **What it shows:**
+
 - 📨 **Incoming requests** with user prompts
 - 🛡️ **Security checks** (jailbreak detection)
 - 🔍 **Classification** (category detection with confidence)
@@ -33,17 +34,20 @@ python3 deploy/openshift/demo/demo-semantic-router.py
 ```
 
 **Features:**
+
 1. **Single Classification** - Tests random prompt from golden set
 2. **All Classifications** - Tests all 10 golden prompts
 3. **PII Detection Test** - Tests personal information filtering
 4. **Jailbreak Detection Test** - Tests security filtering
 5. **Run All Tests** - Executes all tests sequentially
 
 **Requirements:**
+
 - ✅ Must be logged into OpenShift (`oc login`)
 - URLs are discovered automatically from routes
 
 **What it does:**
+
 - Goes through Envoy (same path as OpenWebUI)
 - Shows routing decisions and response previews
 - **Appears in Grafana dashboard!**
@@ -76,9 +80,11 @@ python3 deploy/openshift/demo/demo-semantic-router.py
 - Show the architecture diagram
 
 2. **Run interactive demo** (Terminal 2)
+
 ```bash
 python3 deploy/openshift/demo/demo-semantic-router.py
 ```
+
 Choose option 2 (All Classifications)
 
 3. **Point to live logs** (Terminal 1)
@@ -102,26 +108,31 @@ python3 deploy/openshift/demo/demo-semantic-router.py
 ## Key Talking Points
 
 ### Classification Accuracy
+
 - **10 golden prompts** with 100% accuracy
 - Categories: Chemistry, History, Psychology, Health, Math
 - Shows consistent classification behavior
 
 ### Security Features
+
 - **Jailbreak detection** on every request
 - Shows "BENIGN" for safe requests
 - Confidence scores displayed
 
 ### Smart Routing
+
 - Automatic model selection based on content
 - Load balancing across Model-A and Model-B
 - Routing decisions visible in logs
 
 ### Performance
+
 - **Semantic caching** reduces latency
 - Cache hits shown in logs with similarity scores
 - Sub-second response times
 
 ### Observability
+
 - Real-time logs with structured JSON
 - Grafana metrics and dashboards
 - Request tracing and debugging
@@ -131,6 +142,7 @@ python3 deploy/openshift/demo/demo-semantic-router.py
 ## Troubleshooting
 
 ### Log viewer shows no output
+
 ```bash
 # Check if semantic-router pod is running
 oc get pods -n vllm-semantic-router-system | grep semantic-router
@@ -140,6 +152,7 @@ oc logs -n vllm-semantic-router-system deployment/semantic-router --tail=20
 ```
 
 ### Classification test fails
+
 ```bash
 # Verify Envoy route is accessible
 curl http://envoy-http-vllm-semantic-router-system.apps.cluster-pbd96.pbd96.sandbox5333.opentlc.com/v1/models
@@ -149,6 +162,7 @@ oc get pods -n vllm-semantic-router-system
 ```
 
 ### Grafana doesn't show metrics
+
 - Wait 15-30 seconds for metrics to appear
 - Refresh the dashboard
 - Check the time range (last 5 minutes)
@@ -158,13 +172,15 @@ oc get pods -n vllm-semantic-router-system
 ## Cache Management
 
 ### Check Cache Status
+
 ```bash
 ./deploy/openshift/demo/cache-management.sh status
 ```
 
 Shows recent cache activity and cached queries.
 
 ### Clear Cache (for demo)
+
 ```bash
 ./deploy/openshift/demo/cache-management.sh clear
 ```
@@ -176,22 +192,27 @@ Restarts semantic-router deployment to clear in-memory cache (~30 seconds).
 **Workflow to show caching in action:**
 
 1. Clear the cache:
+
 ```bash
 ./deploy/openshift/demo/cache-management.sh clear
 ```
 
 2. Run classification test (first time - no cache):
+
 ```bash
 python3 deploy/openshift/demo/demo-semantic-router.py
 ```
+
 Choose option 2 (All Classifications)
 - Processing time: ~3-4 seconds per query
 - Logs show queries going to model
 
 3. Run classification test again (second time - with cache):
+
 ```bash
 python3 deploy/openshift/demo/demo-semantic-router.py
 ```
+
 Choose option 2 (All Classifications) again
 - Processing time: ~400ms per query (10x faster!)
 - Logs show "💾 CACHE HIT" for all queries
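
Reviewer note: the caching workflow relies on a second, similar prompt being answered from the cache instead of the model. The sketch below illustrates the general idea of a semantic cache using cosine similarity over embeddings. It is not the semantic-router implementation: the `SemanticCache` class, the 0.8 threshold, and the toy three-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by prompt embeddings."""

    def __init__(self, threshold=0.8):  # assumed threshold, not from the source
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def lookup(self, embedding):
        """Return (response, score) on a hit, or (None, best score) on a miss."""
        best_response, best_score = None, 0.0
        for cached_embedding, response in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_response, best_score = response, score
        if best_score >= self.threshold:
            return best_response, best_score
        return None, best_score

    def store(self, embedding, response):
        self.entries.append((embedding, response))


cache = SemanticCache()
first = [1.0, 0.0, 0.0]       # toy embedding of the first prompt
paraphrase = [0.9, 0.1, 0.0]  # toy embedding of a near-identical prompt
unrelated = [0.0, 0.0, 1.0]   # toy embedding of a different topic

miss, _ = cache.lookup(first)             # empty cache: would go to the model
cache.store(first, "answer from model")
hit, score = cache.lookup(paraphrase)     # similar enough: served from cache
other, _ = cache.lookup(unrelated)        # dissimilar: would go to the model
```

In this sketch, clearing the cache corresponds to starting with an empty `entries` list, which is why the first run after a clear always reaches the model.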

0 commit comments
