Commit 1405550

update
Signed-off-by: bitliu <[email protected]>
1 parent 5ee0143 commit 1405550

5 files changed: +149 -3 lines changed

deploy/kubernetes/aibrix/aigw-resources/gwapi-resources.yaml

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ spec:
 authority: semantic-router.vllm-semantic-router-system:50051
 clusterName: semantic-router
 timeout: 60s
-message_timeout: 10s
+message_timeout: 60s
 processing_mode:
   request_body_mode: BUFFERED
   request_header_mode: SEND
@@ -33,7 +33,7 @@ spec:
 op: add
 path: ''
 value:
-  connect_timeout: 10s
+  connect_timeout: 60s
   http2_protocol_options: {}
   lb_policy: ROUND_ROBIN
   load_assignment:

docs/default-model-fallback.md

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
# Default Model Fallback Implementation

## Problem

When classification failed to identify a category (for example, it returned an empty category name), the router did not select a model and left `selectedModel` empty. An empty `selectedModel` then failed the auto-routing condition in `handleModelRouting`, which required both a non-empty category and a non-empty model.

### Example Scenario
```
User Request: "Hello" (generic greeting)
Classification Result: category="" (no specific category matched)
Selected Model: "" (empty - no model selected!)
```

## Solution

The router now falls back to the configured `default_model` when no category is classified.

### Changes Made

#### Modified `performClassificationAndModelSelection` in `req_filter_classification.go`

**Before:**
```go
// Select best model for this category
if categoryName != "" {
	selectedModel = r.Classifier.SelectBestModelForCategory(categoryName)
	logging.Infof("Selected model for category %s: %s", categoryName, selectedModel)
} else {
	// Empty else block - no model selected!
}

return categoryName, classificationConfidence, reasoningDecision, selectedModel
```

**After:**
```go
// Select best model for this category
if categoryName != "" {
	selectedModel = r.Classifier.SelectBestModelForCategory(categoryName)
	logging.Infof("Selected model for category %s: %s", categoryName, selectedModel)
} else {
	// No category found, use default model
	selectedModel = r.Config.DefaultModel
	logging.Infof("No category classified, using default model: %s", selectedModel)
}

return categoryName, classificationConfidence, reasoningDecision, selectedModel
```

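The fallback can be exercised in isolation with a minimal sketch. The `Config` struct, the `selectModel` function, and the `bestForCategory` map below are hypothetical stand-ins for the router's real types and for `SelectBestModelForCategory`. The nil check on the config is an extra guard that the committed code does not have.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the router configuration;
// only the field used by the fallback is modelled here.
type Config struct {
	DefaultModel string
}

// selectModel mirrors the fallback logic. bestForCategory is a hypothetical
// lookup table standing in for Classifier.SelectBestModelForCategory.
func selectModel(cfg *Config, bestForCategory map[string]string, categoryName string) string {
	if categoryName != "" {
		return bestForCategory[categoryName]
	}
	// No category: fall back to the default model, guarding against a nil config.
	if cfg == nil {
		return ""
	}
	return cfg.DefaultModel
}

func main() {
	cfg := &Config{DefaultModel: "base-model"}
	best := map[string]string{"coding": "coding-expert"}

	fmt.Println(selectModel(cfg, best, "coding")) // category matched
	fmt.Println(selectModel(cfg, best, ""))       // no category, default model used
}
```

If `default_model` is left empty, the sketch still returns an empty string, so the fallback only helps when a default is configured.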
## Behavior

### Before Fix
| Scenario | Category | Selected Model | Issue |
|----------|----------|----------------|-------|
| Specific query ("Write Python code") | `coding` | `coding-expert-model` | ✅ Works |
| Generic query ("Hello") | `""` (empty) | `""` (empty) | **No model selected** |
| Classification error | `""` (empty) | `""` (empty) | **No model selected** |

### After Fix
| Scenario | Category | Selected Model | Status |
|----------|----------|----------------|--------|
| Specific query ("Write Python code") | `coding` | `coding-expert-model` | ✅ Works |
| Generic query ("Hello") | `""` (empty) | `default-model` | **Fallback to default** |
| Classification error | `""` (empty) | `default-model` | **Fallback to default** |

## Configuration

The default model is configured in the router configuration:

```yaml
# config.yaml
default_model: "base-model"  # This model will be used when no category is classified

categories:
  - name: coding
    model_scores:
      - model: base-model
        lora_name: coding-expert
        score: 0.9

  - name: math
    model_scores:
      - model: base-model
        lora_name: math-expert
        score: 0.9
```

## Impact

- **Robustness**: A model is selected even when classification returns no category, provided `default_model` is set
- **User Experience**: Generic queries are routed to the default model instead of failing
- **Backward Compatibility**: Existing configurations continue to work
- **Logging**: A log message indicates when the default model is used

## Related Components

This change works in conjunction with:
1. **PII Policy Checker**: Default model's PII policy will be checked
2. **Model Selection**: Default model must be configured in `model_config`
3. **LoRA Fallback**: If default model uses LoRA, it will inherit base model's PII policy (see `pii-lora-fallback-implementation.md`)

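The requirement that the default model be configured in `model_config` could be checked at startup. The following is only a sketch: `RouterConfig`, its fields, and `validateDefaultModel` are hypothetical names, not the router's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// RouterConfig is a hypothetical, trimmed-down view of the router configuration.
type RouterConfig struct {
	DefaultModel string
	ModelConfig  map[string]struct{} // keys are the model names under model_config
}

var errNoDefault = errors.New("default_model is empty")

// validateDefaultModel reports an error when the fallback target is missing
// or is not declared under model_config.
func validateDefaultModel(cfg RouterConfig) error {
	if cfg.DefaultModel == "" {
		return errNoDefault
	}
	if _, ok := cfg.ModelConfig[cfg.DefaultModel]; !ok {
		return fmt.Errorf("default_model %q is not listed in model_config", cfg.DefaultModel)
	}
	return nil
}

func main() {
	cfg := RouterConfig{
		DefaultModel: "base-model",
		ModelConfig:  map[string]struct{}{"base-model": {}},
	}
	if err := validateDefaultModel(cfg); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("default model is configured")
}
```

Failing fast on a missing default would surface the misconfiguration at load time rather than on the first uncategorized request.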
## Example Logs

### When category is classified:
```
{"level":"info","msg":"Classification Result: category=coding, confidence=0.950, reasoning=true"}
{"level":"info","msg":"Selected model for category coding: coding-expert"}
```

### When no category is classified (new behavior):
```
{"level":"info","msg":"Classification Result: category=, confidence=0.000, reasoning=false"}
{"level":"info","msg":"No category classified, using default model: base-model"}
```

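When checking logs programmatically, the fallback message can be picked out of the JSON log stream. The sketch below assumes one JSON object per line with a `msg` field, as in the samples above. `fallbackModels` is a hypothetical helper, not part of the router.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// fallbackPrefix is the message prefix logged by the fallback branch.
const fallbackPrefix = "No category classified, using default model: "

// fallbackModels scans JSON log lines and returns the default model named
// in each fallback message, in order of appearance.
func fallbackModels(logs string) []string {
	var models []string
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		var entry struct {
			Msg string `json:"msg"`
		}
		if err := json.Unmarshal(sc.Bytes(), &entry); err != nil {
			continue // skip lines that are not valid JSON
		}
		if strings.HasPrefix(entry.Msg, fallbackPrefix) {
			models = append(models, strings.TrimPrefix(entry.Msg, fallbackPrefix))
		}
	}
	return models
}

func main() {
	logs := `{"level":"info","msg":"Classification Result: category=, confidence=0.000, reasoning=false"}
{"level":"info","msg":"No category classified, using default model: base-model"}`
	fmt.Println(fallbackModels(logs))
}
```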
## Files Modified

1. `src/semantic-router/pkg/extproc/req_filter_classification.go` - Added default model fallback logic

## Testing

To test this behavior:

1. Send a generic query that doesn't match any category:
   ```bash
   curl -X POST http://localhost:8080/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
       "model": "MoM",
       "messages": [{"role": "user", "content": "Hello"}]
     }'
   ```

2. Check logs for "No category classified, using default model" message

3. Verify the request is processed with the default model

src/semantic-router/pkg/extproc/processor_req_body.go

Lines changed: 4 additions & 1 deletion
@@ -116,7 +116,7 @@ func (r *OpenAIRouter) handleModelRouting(openAIRequest *openai.ChatCompletionNe
 
 	isAutoModel := r.Config != nil && r.Config.IsAutoModelName(originalModel)
 
-	if isAutoModel && categoryName != "" && selectedModel != "" {
+	if isAutoModel && selectedModel != "" {
 		return r.handleAutoModelRouting(openAIRequest, originalModel, categoryName, reasoningDecision, selectedModel, ctx, response)
 	} else if !isAutoModel {
 		return r.handleSpecifiedModelRouting(openAIRequest, originalModel, ctx)
@@ -253,6 +253,9 @@ func (r *OpenAIRouter) modifyRequestBodyForAutoRouting(openAIRequest *openai.Cha
 		return nil, status.Errorf(codes.Internal, "error serializing modified request: %v", err)
 	}
 
+	if categoryName == "" {
+		return modifiedBody, nil
+	}
 	// Set reasoning mode
 	modifiedBody, err = r.setReasoningModeToRequestBody(modifiedBody, useReasoning, categoryName)
 	if err != nil {

src/semantic-router/pkg/extproc/req_filter_classification.go

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@ func (r *OpenAIRouter) performClassificationAndModelSelection(originalModel stri
 	if categoryName != "" {
 		selectedModel = r.Classifier.SelectBestModelForCategory(categoryName)
 		logging.Infof("Selected model for category %s: %s", categoryName, selectedModel)
+	} else {
+		// No category found, use default model
+		selectedModel = r.Config.DefaultModel
+		logging.Infof("No category classified, using default model: %s", selectedModel)
 	}
 
 	return categoryName, classificationConfidence, reasoningDecision, selectedModel

src/semantic-router/pkg/extproc/req_filter_pii.go

Lines changed: 2 additions & 0 deletions
@@ -40,6 +40,8 @@ func (r *OpenAIRouter) isPIIDetectionEnabled(categoryName string) bool {
 	piiThreshold := float32(0.0)
 	if categoryName != "" && r.Config != nil {
 		piiThreshold = r.Config.GetPIIThresholdForCategory(categoryName)
+	} else {
+		piiThreshold = r.Config.PIIModel.Threshold
 	}
 
 	if piiThreshold == 0.0 {
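One point worth checking in this hunk: the new `else` branch is also reached when `r.Config` is nil, since the `if` condition tests `r.Config != nil`. If `Config` is a pointer, as that check suggests, `r.Config.PIIModel.Threshold` would then dereference nil. A minimal sketch of a nil-safe variant follows. `RouterConfig`, `PIIModelConfig`, and the `CategoryThresholds` map are hypothetical stand-ins for the real config types and for `GetPIIThresholdForCategory`.

```go
package main

import "fmt"

// PIIModelConfig and RouterConfig are hypothetical, trimmed-down config types.
type PIIModelConfig struct {
	Threshold float32
}

type RouterConfig struct {
	PIIModel           PIIModelConfig
	CategoryThresholds map[string]float32 // stand-in for GetPIIThresholdForCategory
}

// piiThreshold returns the category threshold when a category is known, the
// global PII threshold when it is not, and 0 when no config is loaded.
func piiThreshold(cfg *RouterConfig, categoryName string) float32 {
	if cfg == nil {
		return 0
	}
	if categoryName != "" {
		return cfg.CategoryThresholds[categoryName]
	}
	return cfg.PIIModel.Threshold
}

func main() {
	cfg := &RouterConfig{
		PIIModel:           PIIModelConfig{Threshold: 0.5},
		CategoryThresholds: map[string]float32{"coding": 0.75},
	}
	fmt.Println(piiThreshold(cfg, "coding")) // category threshold
	fmt.Println(piiThreshold(cfg, ""))       // global threshold
	fmt.Println(piiThreshold(nil, ""))       // no config loaded
}
```

Checking for a nil config first keeps both lookups on a single guarded path.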
