Commit 5d0e1bd

update
Signed-off-by: bitliu <[email protected]>
1 parent abc287c commit 5d0e1bd

9 files changed, +7 -145 lines changed

_posts/2025-11-18-signal-decision.md

Lines changed: 7 additions & 145 deletions
@@ -78,22 +78,7 @@ The Signal-Decision Architecture fundamentally reimagines semantic routing by se
 
 ### Architecture Overview
 
-```mermaid
-graph TB
-A[User Request] --> B[Multi-Signal Extraction]
-B --> C[Keyword Signals]
-B --> D[Embedding Signals]
-B --> E[Domain Signals]
-C --> F[Decision Engine]
-D --> F
-E --> F
-F --> G{Decision Matched?}
-G -->|Yes| H[Plugin Chain]
-G -->|No| I[Default Model]
-H --> J[Model Selection]
-I --> K[Response]
-J --> K
-```
+![](/assets/figures/semantic-router/signal-1.png)
 
 The new architecture introduces three key innovations:
 
@@ -103,65 +88,15 @@ The new architecture introduces three key innovations:
 
 ### Complete Request Flow
 
-```mermaid
-sequenceDiagram
-participant User
-participant Gateway as AI Gateway
-participant Extractor as Signal Extractor
-participant Engine as Decision Engine
-participant Plugins as Plugin Chain
-participant Model as Selected Model
-
-User->>Gateway: Send Query
-Gateway->>Extractor: Extract Signals
-
-par Parallel Signal Extraction
-Extractor->>Extractor: Keyword Matching
-Extractor->>Extractor: Embedding Similarity
-Extractor->>Extractor: Domain Classification
-end
-
-Extractor->>Engine: Signals: {keyword, embedding, domain}
-Engine->>Engine: Evaluate All Decisions
-Engine->>Engine: Select by Priority
-Engine->>Plugins: Execute Plugin Chain
-
-Plugins->>Plugins: 1. Jailbreak Detection
-Plugins->>Plugins: 2. PII Protection
-Plugins->>Plugins: 3. Semantic Cache Check
-
-alt Cache Hit
-Plugins->>User: Return Cached Response
-else Cache Miss
-Plugins->>Plugins: 4. System Prompt Injection
-Plugins->>Plugins: 5. Header Mutation
-Plugins->>Model: Route Request
-Model->>Plugins: Model Response
-Plugins->>Plugins: Cache Response
-Plugins->>User: Return Response
-end
-```
+![](/assets/figures/semantic-router/signal-2.png)
 
 ## Core Concepts
 
 ### Signals: Multi-Dimensional Prompt Analysis
 
 Instead of relying solely on domain classification, the Signal-Decision Architecture extracts three complementary types of signals from each user query. Each signal type leverages different AI/ML techniques and serves distinct purposes in the routing decision process.
 
-```mermaid
-graph TB
-A[User Query] --> B[Signal Extraction Layer]
-B --> C[Keyword Extractor<br/>Regex Matching]
-B --> D[Embedding Model<br/>Sentence Transformers]
-B --> E[Domain Classifier<br/>MMLU + LoRA]
-C --> F[Keyword Signals<br/>urgency, security, etc.]
-D --> G[Embedding Signals<br/>intent similarity scores]
-E --> H[Domain Signals<br/>computer_science, etc.]
-F --> I[Decision Engine]
-G --> I
-H --> I
-I --> J[Selected Decision]
-```
+![](/assets/figures/semantic-router/signal-3.png)
 
 #### Keyword Signals: Interpretable Pattern Matching
 
@@ -222,22 +157,7 @@ Domain signals use MMLU-trained classification models to identify the academic o
 
 This enables organizations to extend domain classification to their specific verticals while maintaining the base model's general knowledge.
 
-```mermaid
-graph TB
-A[User Query:<br/>'Review this medical imaging protocol'] --> B[Domain Classifier]
-B --> C{Base MMLU Model}
-C --> D[Detect: Healthcare Domain]
-D --> E{Load LoRA Adapter}
-E --> F[medical_imaging LoRA]
-E --> G[clinical_trials LoRA]
-E --> H[pharmaceutical_research LoRA]
-F --> I[Fine-grained Classification:<br/>medical_imaging]
-I --> J[Route to Specialized Model:<br/>medical-imaging-expert]
-
-style F fill:#e1f5ff
-style I fill:#c3e6cb
-style J fill:#d4edda
-```
+![](/assets/figures/semantic-router/signal-4.png)
 
 **Use Cases**:
 
@@ -297,34 +217,7 @@ Each decision consists of:
 
 #### Decision Evaluation Flow
 
-```mermaid
-graph TB
-A[Extracted Signals] --> B{Evaluate All Decisions}
-B --> C[Decision 1: Priority 100<br/>Rule: urgency AND security AND cs]
-B --> D[Decision 2: Priority 80<br/>Rule: code-review AND cs]
-B --> E[Decision 3: Priority 60<br/>Rule: architecture-design OR cs]
-C --> F{Match?}
-D --> G{Match?}
-E --> H{Match?}
-F -->|Yes| I[Matched: Priority 100]
-F -->|No| J[Not Matched]
-G -->|Yes| K[Matched: Priority 80]
-G -->|No| L[Not Matched]
-H -->|Yes| M[Matched: Priority 60]
-H -->|No| N[Not Matched]
-I --> O{Multiple Matches?}
-K --> O
-M --> O
-J --> P{Any Match?}
-L --> P
-N --> P
-O -->|Yes| Q[Select Highest Priority]
-O -->|No| R[Use Single Match]
-P -->|No| S[Fallback to Default Model]
-Q --> T[Execute Plugin Chain]
-R --> T
-S --> U[Route to Default Model]
-```
+![](/assets/figures/semantic-router/signal-5.png)
 
 When multiple decisions match, the system selects the one with the highest priority. If no decisions match, the system falls back to the default model.
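
Reviewer note: the selection rule stated in the context line above (highest priority wins, default model when nothing matches) could be sketched roughly as follows. This is only an illustration, not the project's implementation. The `Decision` class, `select_model`, and the model names are hypothetical; the three rules and priorities are taken from the Mermaid diagram removed in this hunk. Signals are simplified to a set of names that fired, and the tie-breaking behavior (first decision listed wins) is an assumption, since the post does not specify it.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Decision:
    """Hypothetical decision record: a rule over fired signals plus a priority."""
    name: str
    priority: int
    rule: Callable[[Set[str]], bool]
    model: str  # hypothetical target model name


def select_model(signals: Set[str], decisions: List[Decision], default_model: str) -> str:
    """Evaluate every decision, then pick the highest-priority match."""
    matched = [d for d in decisions if d.rule(signals)]
    if not matched:
        # No decision matched: fall back to the default model.
        return default_model
    # max() returns the first maximal element, so ties go to the decision
    # listed first (an assumption; the post does not define tie-breaking).
    return max(matched, key=lambda d: d.priority).model


# Rules and priorities mirror the removed diagram; model names are made up.
DECISIONS = [
    Decision("decision-1", 100, lambda s: {"urgency", "security", "cs"} <= s, "model-a"),
    Decision("decision-2", 80, lambda s: {"code-review", "cs"} <= s, "model-b"),
    Decision("decision-3", 60, lambda s: "architecture-design" in s or "cs" in s, "model-c"),
]

if __name__ == "__main__":
    # Decisions 2 and 3 both match here; decision 2 has the higher priority.
    print(select_model({"code-review", "cs"}, DECISIONS, "default-model"))
```

Evaluating all decisions before choosing, rather than stopping at the first match, keeps the outcome independent of the order in which decisions are declared, except for ties.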
@@ -344,20 +237,7 @@ Plugins execute in the configured order, with each plugin able to modify the req
 
 #### Plugin Chain Execution Flow
 
-```mermaid
-graph LR
-A[Incoming Request] --> B[Plugin 1: Jailbreak Detection]
-B -->|Pass| C[Plugin 2: PII Protection]
-B -->|Block| Z[Return Error]
-C -->|Redact PII| D[Plugin 3: Semantic Cache]
-D -->|Cache Hit| Y[Return Cached Response]
-D -->|Cache Miss| E[Plugin 4: System Prompt Injection]
-E -->|Add Prompt| F[Plugin 5: Header Mutation]
-F -->|Add Headers| G[Route to Model]
-G --> H[Model Response]
-H --> I[Cache Response]
-I --> J[Return to User]
-```
+![](/assets/figures/semantic-router/signal-6.png)
 
 ## Scaling from 14 to Unlimited
 
@@ -636,25 +516,7 @@ This configuration demonstrates:
 
 ### Dynamic Configuration Flow
 
-```mermaid
-sequenceDiagram
-participant User
-participant K8s
-participant Controller
-participant Router
-participant Model
-
-User->>K8s: Apply IntelligentRoute CRD
-K8s->>Controller: Watch CRD changes
-Controller->>Controller: Convert CRD to internal config
-Controller->>Router: Update routing rules
-Router->>Router: Reload configuration
-Note over Router: No downtime
-User->>Router: Send request
-Router->>Router: Evaluate with new rules
-Router->>Model: Route to selected model
-Model->>User: Return response
-```
+![](/assets/figures/semantic-router/signal-7.png)
 
 The Kubernetes-native design enables:
 
