Description
Is your feature request related to a problem? Please describe.
The current semantic routing implementation only supports category-based model selection using a fine-tuned ModernBERT classifier. This approach has critical limitations for keyword-based routing:
- No Deterministic Routing: Cannot route queries based on specific keywords or technology terms (e.g., "Kubernetes", "SQL", "CVE-")
- Slow for Simple Cases: Must run ML inference even for queries that could be routed deterministically by keyword matching
- No Domain-Specific Rules: Cannot define custom routing rules based on business logic or domain knowledge
- Limited Control: Cannot prioritize certain models based on keyword presence
- No Boolean Logic: Cannot combine keywords with AND/OR operators for complex matching
Real-World Scenario:
A user wants to route all Kubernetes-related queries to specialized infrastructure models without running expensive ML inference:
- Query: "How to secure a Kubernetes cluster with RBAC?"
- Expected: Match keywords ["kubernetes", "k8s", "RBAC"] → Route to [k8s-expert, devops-model]
- Current: Must run ModernBERT classifier → Classify as "computer science" → Route to general models
This results in:
- Unnecessary latency (~20-30ms for classification)
- Less precise routing (category is too broad)
- Cannot leverage domain knowledge (e.g., "CVE-" always goes to security models)
Describe the solution you'd like
Implement a Keyword-Based Routing system that allows users to define deterministic routing rules based on keyword matching. The solution should enable:
Core Concept: Define keyword rules in configuration that match query text and return candidate models:
keyword_routing:
  enabled: true
  rules:
    - name: "kubernetes-infrastructure"
      keywords:
        operator: "OR"
        case_sensitive: false
        terms: ["kubernetes", "k8s", "kubectl"]
      candidate_models:
        - "k8s-expert"
        - "devops-model"
Key Features:
- Fast Deterministic Matching: Keyword matching is linear in the number of rules and terms (~1-2ms), versus a fixed-cost ML inference pass (~20-30ms)
- Boolean Logic: Support AND/OR operators for complex keyword combinations
- Case Sensitivity Control: Match case-sensitive terms (e.g., "CVE-2024-1234") or case-insensitive (e.g., "kubernetes")
- Priority-Based Rules: Execute rules in priority order for overlapping keywords
- Multiple Rule Matching: Support matching multiple rules and combining their candidate models
Architecture Overview
flowchart TD
A[User Query] --> B[KeywordMatcher]
B --> C{Parse Rules}
C --> D[Rule 1: Check Keywords]
C --> E[Rule 2: Check Keywords]
C --> F[Rule N: Check Keywords]
D --> G{Match?}
E --> H{Match?}
F --> I{Match?}
G -->|Yes| J[Candidate Models A]
H -->|Yes| K[Candidate Models B]
I -->|Yes| L[Candidate Models N]
G -->|No| M[Skip]
H -->|No| M
I -->|No| M
J --> N[Combine All Candidates]
K --> N
L --> N
N --> O[Deduplicate Models]
O --> P[Return Candidate Models]
style A fill:#e1f5ff
style P fill:#c8e6c9
style B fill:#fff9c4
Configuration Schema
keyword_routing:
  enabled: true  # Enable/disable keyword routing
  rules:
    # Rule 1: Kubernetes Infrastructure
    - name: "kubernetes-infrastructure"
      description: "Route Kubernetes-related queries to infrastructure models"
      keywords:
        operator: "OR"  # Match if ANY keyword is present
        case_sensitive: false
        terms:
          - "kubernetes"
          - "k8s"
          - "kubectl"
          - "helm"
          - "pod"
          - "deployment"
          - "service"
          - "ingress"
      candidate_models:
        - "k8s-expert"
        - "devops-model"
        - "cloud-native-model"
      priority: 100  # Higher priority rules are checked first
    # Rule 2: Database Operations
    - name: "database-operations"
      description: "Route database queries to database specialists"
      keywords:
        operator: "AND"  # Match if ALL keywords are present
        case_sensitive: false
        terms:
          - "database"
          - "query"
      candidate_models:
        - "database-expert"
        - "sql-specialist"
      priority: 90
    # Rule 3: Security Critical (Case-Sensitive)
    - name: "security-critical"
      description: "Route security-related queries with CVE IDs"
      keywords:
        operator: "OR"
        case_sensitive: true  # Case-sensitive for CVE IDs
        terms:
          - "CVE-"
          - "vulnerability"
          - "exploit"
      candidate_models:
        - "security-hardened-model"
        - "compliance-model"
      priority: 95
    # Rule 4: Python Programming
    - name: "python-programming"
      keywords:
        operator: "OR"
        case_sensitive: false
        terms:
          - "python"
          - "pip"
          - "django"
          - "flask"
          - "pandas"
      candidate_models:
        - "python-expert"
        - "general-coding-model"
      priority: 80
Implementation Components
1. KeywordMatcher (pkg/utils/keyword/matcher.go)
package keyword

import "strings"

// KeywordMatcher evaluates queries against a prioritized list of keyword rules.
type KeywordMatcher struct {
	rules []KeywordRule
}

// KeywordRule maps a set of keywords to the models that should handle matching queries.
type KeywordRule struct {
	Name            string
	Description     string
	Keywords        KeywordSet
	CandidateModels []string
	Priority        int
}

// KeywordSet describes the terms of a rule and how they are combined.
type KeywordSet struct {
	Operator      string // "AND" | "OR"
	CaseSensitive bool
	Terms         []string
}

// KeywordMatchResult records which rule matched and which terms triggered it.
type KeywordMatchResult struct {
	RuleName        string
	MatchedKeywords []string
	CandidateModels []string
	Priority        int
}

// NewKeywordMatcher creates a new keyword matcher from config.
// KeywordRoutingConfig is the config type shown in component 2.
// sortRulesByPriority is not defined in this proposal; it is expected to
// return the rules ordered by descending Priority.
func NewKeywordMatcher(config KeywordRoutingConfig) *KeywordMatcher {
	rules := sortRulesByPriority(config.Rules)
	return &KeywordMatcher{rules: rules}
}

// MatchQuery matches the query against all rules and returns matched results
// in rule (priority) order.
func (m *KeywordMatcher) MatchQuery(query string) []KeywordMatchResult {
	results := []KeywordMatchResult{}
	for _, rule := range m.rules {
		if matched, keywords := m.matchRule(query, rule); matched {
			results = append(results, KeywordMatchResult{
				RuleName:        rule.Name,
				MatchedKeywords: keywords,
				CandidateModels: rule.CandidateModels,
				Priority:        rule.Priority,
			})
		}
	}
	return results
}

// matchRule checks if a query matches a specific rule.
// Matching is by substring, so "pod" also matches "podcast".
func (m *KeywordMatcher) matchRule(query string, rule KeywordRule) (bool, []string) {
	queryText := query
	if !rule.Keywords.CaseSensitive {
		queryText = strings.ToLower(query)
	}
	matchedKeywords := []string{}
	for _, term := range rule.Keywords.Terms {
		keyword := term
		if !rule.Keywords.CaseSensitive {
			keyword = strings.ToLower(term)
		}
		if strings.Contains(queryText, keyword) {
			matchedKeywords = append(matchedKeywords, term)
		}
	}
	// Apply operator logic
	if rule.Keywords.Operator == "AND" {
		// All keywords must match; a rule with no terms never matches.
		allMatched := len(rule.Keywords.Terms) > 0 &&
			len(matchedKeywords) == len(rule.Keywords.Terms)
		return allMatched, matchedKeywords
	}
	// At least one keyword must match (OR)
	return len(matchedKeywords) > 0, matchedKeywords
}

// GetCandidateModels returns deduplicated candidate models from all matched rules.
// First-seen order is preserved so the result is deterministic and models from
// higher-priority rules come first (iterating a Go map would randomize the order).
func (m *KeywordMatcher) GetCandidateModels(results []KeywordMatchResult) []string {
	seen := make(map[string]bool)
	models := []string{}
	for _, result := range results {
		for _, model := range result.CandidateModels {
			if !seen[model] {
				seen[model] = true
				models = append(models, model)
			}
		}
	}
	return models
}
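The constructor above relies on `sortRulesByPriority`, which this proposal does not spell out. A minimal sketch, assuming a stable descending sort so that rules with equal priority keep their configuration order (the trimmed `KeywordRule` below exists only to keep the example self-contained):

```go
package main

import (
	"fmt"
	"sort"
)

// KeywordRule is trimmed to the fields this sketch needs.
type KeywordRule struct {
	Name     string
	Priority int
}

// sortRulesByPriority returns a copy of rules ordered by descending Priority.
// The sort is stable, so rules with equal priority keep their config order.
// The input slice is left untouched.
func sortRulesByPriority(rules []KeywordRule) []KeywordRule {
	sorted := make([]KeywordRule, len(rules))
	copy(sorted, rules)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})
	return sorted
}

func main() {
	rules := []KeywordRule{
		{Name: "kubernetes-infrastructure", Priority: 100},
		{Name: "database-operations", Priority: 90},
		{Name: "security-critical", Priority: 95},
		{Name: "python-programming", Priority: 80},
	}
	for _, r := range sortRulesByPriority(rules) {
		fmt.Println(r.Priority, r.Name)
	}
}
```

With the four example rules, `security-critical` (95) would be evaluated before `database-operations` (90) even though it appears later in the YAML.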
2. Configuration Extension (pkg/config/config.go)
type RouterConfig struct {
	// ... existing fields ...
	KeywordRouting KeywordRoutingConfig `yaml:"keyword_routing"`
}

type KeywordRoutingConfig struct {
	Enabled bool          `yaml:"enabled"`
	Rules   []KeywordRule `yaml:"rules"`
}

type KeywordRule struct {
	Name            string     `yaml:"name"`
	Description     string     `yaml:"description"`
	Keywords        KeywordSet `yaml:"keywords"`
	CandidateModels []string   `yaml:"candidate_models"`
	Priority        int        `yaml:"priority"`
}

type KeywordSet struct {
	Operator      string   `yaml:"operator"` // "AND" | "OR"
	CaseSensitive bool     `yaml:"case_sensitive"`
	Terms         []string `yaml:"terms"`
}
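Because `matchRule` treats any operator other than "AND" as OR, a typo in the YAML would silently change a rule's behavior. It may be worth validating rules at load time. The sketch below is one possible shape; `validateRule` is a hypothetical helper that is not part of this proposal, and the structs are repeated without YAML tags only to keep the example self-contained:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type KeywordSet struct {
	Operator      string
	CaseSensitive bool
	Terms         []string
}

type KeywordRule struct {
	Name            string
	Keywords        KeywordSet
	CandidateModels []string
	Priority        int
}

// validateRule (hypothetical) rejects rules that would behave surprisingly
// at match time.
func validateRule(rule KeywordRule) error {
	if strings.TrimSpace(rule.Name) == "" {
		return errors.New("keyword rule has an empty name")
	}
	switch rule.Keywords.Operator {
	case "AND", "OR":
		// supported operators
	default:
		return fmt.Errorf("rule %q: operator must be AND or OR, got %q",
			rule.Name, rule.Keywords.Operator)
	}
	if len(rule.Keywords.Terms) == 0 {
		return fmt.Errorf("rule %q: at least one term is required", rule.Name)
	}
	for _, term := range rule.Keywords.Terms {
		// strings.Contains(s, "") is always true, so an empty term
		// would match every query.
		if term == "" {
			return fmt.Errorf("rule %q: empty term would match every query", rule.Name)
		}
	}
	if len(rule.CandidateModels) == 0 {
		return fmt.Errorf("rule %q: at least one candidate model is required", rule.Name)
	}
	return nil
}

func main() {
	rule := KeywordRule{
		Name:            "typo-example",
		Keywords:        KeywordSet{Operator: "XOR", Terms: []string{"k8s"}},
		CandidateModels: []string{"k8s-expert"},
	}
	fmt.Println(validateRule(rule))
}
```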
3. Integration with Router (pkg/extproc/request_handler.go)
func (r *OpenAIRouter) handleModelRouting(...) (*ext_proc.ProcessingResponse, error) {
	// ... existing code ...
	if originalModel == "auto" {
		var selectedModel string
		// Use keyword routing if enabled
		if r.Config.KeywordRouting.Enabled {
			matchResults := r.KeywordMatcher.MatchQuery(userContent)
			if len(matchResults) > 0 {
				// Log matched rules
				for _, result := range matchResults {
					observability.Infof("Keyword match: rule=%s, keywords=%v, models=%v",
						result.RuleName, result.MatchedKeywords, result.CandidateModels)
				}
				// Get candidate models
				candidates := r.KeywordMatcher.GetCandidateModels(matchResults)
				// Select best model from candidates using category scores
				selectedModel = r.Classifier.SelectBestModelFromList(userContent, candidates)
				// Record metrics
				metrics.RecordKeywordRouting(matchResults, selectedModel)
			} else {
				// Fallback to category-only routing
				selectedModel = r.classifyAndSelectBestModel(userContent)
			}
		} else {
			// Existing category-only routing
			selectedModel = r.classifyAndSelectBestModel(userContent)
		}
		matchedModel = selectedModel
	}
	// ... rest of the code ...
}
Additional context
Benefits:
- Performance: Keyword matching on its own is ~10-30x faster than ML inference (1-2ms vs 20-30ms)
- Deterministic: Predictable routing based on explicit rules
- Domain Knowledge: Leverage business logic and domain expertise
- Flexible: Support complex boolean logic (AND/OR) and case sensitivity
- Backward Compatible: Existing category-only routing continues to work
- Easy to Configure: No ML training required, just YAML configuration
Use Cases:
- Technology-Specific Routing: Route Kubernetes queries to k8s experts, SQL queries to database specialists
- Security-Critical Queries: Route CVE IDs and security terms to hardened models
- Compliance Requirements: Route compliance-related keywords to specialized compliance models
- Performance Optimization: Pre-filter queries by keywords to reduce ML inference overhead
- Multi-Tenant Routing: Route queries based on tenant-specific keywords
Performance Characteristics:
Operation | Latency | Scalability |
---|---|---|
Keyword Matching | ~1-2ms | O(n × m) where n=rules, m=keywords |
Category Classification | ~20-30ms | O(1) model inference |
Combined (Keyword + Category) | ~21-32ms | Keyword filters, then classify |
Routing Flow Diagram:
sequenceDiagram
participant User
participant Router
participant KM as KeywordMatcher
participant Classifier
User->>Router: Query: "How to secure K8s?"
Router->>KM: MatchQuery(query)
KM->>KM: Check Rule 1: kubernetes-infrastructure
KM->>KM: Match "k8s" → ✓
KM->>KM: Check Rule 2: database-operations
KM->>KM: No match → ✗
KM->>KM: Check Rule 3: security-critical
KM->>KM: No match ("secure" is not a configured term) → ✗
KM-->>Router: Results: [Rule 1]
Router->>Router: Combine candidates: [k8s-expert, devops-model, cloud-native-model]
Router->>Classifier: SelectBestModelFromList(query, candidates)
Classifier->>Classifier: Run category classification
Classifier->>Classifier: Category: "computer science"
Classifier->>Classifier: Check scores for candidates
Classifier-->>Router: Best model: k8s-expert (score: 0.95)
Router-->>User: Route to: k8s-expert
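The "Check scores for candidates" step is not specified further in this proposal. One way it could work, sketched under the assumption that the classifier exposes per-model scores for the predicted category: pick the highest-scoring model among the keyword candidates only. `selectBestFromCandidates` is a hypothetical stand-in for `SelectBestModelFromList`, and all scores other than the 0.95 from the diagram are made up for illustration.

```go
package main

import "fmt"

// selectBestFromCandidates (hypothetical) returns the candidate with the
// highest score for the classified category. Candidates without a score are
// skipped; ties keep the earlier candidate. The bool is false when no
// candidate has a score, so the caller can fall back to category-only routing.
func selectBestFromCandidates(candidates []string, categoryScores map[string]float64) (string, bool) {
	best, bestScore, found := "", 0.0, false
	for _, model := range candidates {
		score, ok := categoryScores[model]
		if !ok {
			continue
		}
		if !found || score > bestScore {
			best, bestScore, found = model, score, true
		}
	}
	return best, found
}

func main() {
	// Illustrative scores for the "computer science" category.
	scores := map[string]float64{
		"k8s-expert":    0.95,
		"devops-model":  0.80,
		"general-model": 0.99, // not a keyword candidate, so it is ignored
	}
	candidates := []string{"k8s-expert", "devops-model", "cloud-native-model"}
	model, ok := selectBestFromCandidates(candidates, scores)
	fmt.Println(model, ok) // prints: k8s-expert true
}
```

Restricting the search to the keyword candidates is what lets a keyword rule override a higher-scoring general model.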
Testing Requirements:
- Unit Tests:
  - AND operator with all keywords present
  - AND operator with missing keywords
  - OR operator with at least one keyword
  - OR operator with no keywords
  - Case-sensitive matching
  - Case-insensitive matching
  - Priority-based rule ordering
  - Multiple rule matching
  - Deduplication of candidate models
- Integration Tests:
  - Keyword routing with category fallback
  - Keyword routing with model selection
  - Disabled keyword routing (fallback to category-only)
- E2E Tests:
  - Real queries with expected routing decisions
  - Performance benchmarks
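The operator and case-sensitivity cases in the unit test list lend themselves to a table-driven layout. The sketch below is self-contained rather than wired to the real `KeywordMatcher`: `matchTerms` is a hypothetical condensed version of the matching logic, and the rule that an AND rule with no terms never matches is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// matchTerms (hypothetical) applies substring matching with AND/OR semantics.
func matchTerms(query string, terms []string, operator string, caseSensitive bool) bool {
	if !caseSensitive {
		query = strings.ToLower(query)
	}
	matched := 0
	for _, term := range terms {
		if !caseSensitive {
			term = strings.ToLower(term)
		}
		if strings.Contains(query, term) {
			matched++
		}
	}
	if operator == "AND" {
		// Assumption: an AND rule with no terms never matches.
		return len(terms) > 0 && matched == len(terms)
	}
	return matched > 0
}

type matchCase struct {
	name          string
	query         string
	terms         []string
	operator      string
	caseSensitive bool
	want          bool
}

var matchCases = []matchCase{
	{"AND, all present", "optimize this database query", []string{"database", "query"}, "AND", false, true},
	{"AND, one missing", "restart the database", []string{"database", "query"}, "AND", false, false},
	{"OR, one present", "kubectl get pods", []string{"kubernetes", "k8s", "kubectl"}, "OR", false, true},
	{"OR, none present", "bake a cake", []string{"kubernetes", "k8s", "kubectl"}, "OR", false, false},
	{"case-sensitive hit", "Is CVE-2024-1234 patched?", []string{"CVE-"}, "OR", true, true},
	{"case-sensitive miss", "is cve-2024-1234 patched?", []string{"CVE-"}, "OR", true, false},
	{"case-insensitive hit", "Deploy to KUBERNETES", []string{"kubernetes"}, "OR", false, true},
}

// failedCases returns the names of cases whose result differs from want.
func failedCases() []string {
	failed := []string{}
	for _, c := range matchCases {
		if got := matchTerms(c.query, c.terms, c.operator, c.caseSensitive); got != c.want {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", failedCases())
}
```

In the real test file the same table would more naturally drive `t.Run` subtests against `matchRule`.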
Implementation Phases:
Phase 1: Core Implementation (1-2 weeks)
- Implement KeywordMatcher with AND/OR logic
- Add configuration structures
- Add unit tests for keyword matching
Phase 2: Integration (1 week)
- Integrate with OpenAIRouter
- Add fallback logic to category-only routing
- Add integration tests
Phase 3: Observability (1 week)
- Add metrics for keyword matching
- Add detailed logging
- Add performance benchmarks
Related Files:
- src/semantic-router/pkg/config/config.go: Configuration structures
- src/semantic-router/pkg/extproc/request_handler.go: Request routing logic
- src/semantic-router/pkg/utils/keyword/matcher.go: Keyword matcher (NEW)
- config/config.yaml: Configuration example
Future Enhancements:
- Support regex patterns for advanced matching
- Support negative keywords (e.g., "NOT contains X")
- Support keyword weighting/scoring
- Support phrase matching (multi-word keywords)
- Support stemming and lemmatization for better matching
- Add keyword synonym support
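As an illustration of the regex enhancement: the proposed matcher uses `strings.Contains`, so a short term such as "pod" also matches "podcast". A possible sketch of whole-word matching with the standard `regexp` package follows. `compileWholeWord` is a hypothetical helper, and it assumes the term starts and ends with a word character (a term like "CVE-" would need different boundary handling).

```go
package main

import (
	"fmt"
	"regexp"
)

// compileWholeWord (hypothetical) builds a pattern that matches term only
// when it is delimited by word boundaries. QuoteMeta escapes any regex
// metacharacters in the term so it is matched literally.
func compileWholeWord(term string, caseSensitive bool) (*regexp.Regexp, error) {
	pattern := `\b` + regexp.QuoteMeta(term) + `\b`
	if !caseSensitive {
		pattern = `(?i)` + pattern
	}
	return regexp.Compile(pattern)
}

func main() {
	re, err := compileWholeWord("pod", false)
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("Why is my Pod crashing?")) // true
	fmt.Println(re.MatchString("recommend a podcast"))     // false
}
```

Patterns would be compiled once when the rules are loaded, not per query, to keep matching latency low.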
Observability & Metrics:
# New metrics for keyword routing
keyword_routing_matches_total{rule_name, matched} # Counter of rule matches
keyword_routing_duration_seconds{rule_name} # Histogram of matching latency
keyword_routing_candidates_count{rule_name} # Histogram of candidate counts
keyword_routing_fallback_total # Counter of fallbacks to category-only
Example Logging:
[INFO] Keyword routing enabled with 4 rules
[INFO] Query: "How to secure a Kubernetes cluster with RBAC?"
[INFO] Keyword match: rule=kubernetes-infrastructure, keywords=[kubernetes], priority=100, models=[k8s-expert, devops-model, cloud-native-model]
[INFO] Combined candidates: [k8s-expert, devops-model, cloud-native-model]
[INFO] Category classification: computer science (confidence: 0.85)
[INFO] Best model from candidates: k8s-expert (score: 0.95)
[INFO] Final selection: k8s-expert
[INFO] Keyword routing latency: 1.2ms