Support Keyword-Based Custom Category Routing for Dynamic Model Selection  #313

@Xunzhuo

Is your feature request related to a problem? Please describe.

The current semantic routing implementation only supports category-based model selection using a fine-tuned ModernBERT classifier. This approach has critical limitations for keyword-based routing:

  1. No Deterministic Routing: Cannot route queries based on specific keywords or technology terms (e.g., "Kubernetes", "SQL", "CVE-")
  2. Slow for Simple Cases: Must run ML inference even for queries that could be routed deterministically by keyword matching
  3. No Domain-Specific Rules: Cannot define custom routing rules based on business logic or domain knowledge
  4. Limited Control: Cannot prioritize certain models based on keyword presence
  5. No Boolean Logic: Cannot combine keywords with AND/OR operators for complex matching

Real-World Scenario:
A user wants to route all Kubernetes-related queries to specialized infrastructure models without running expensive ML inference:

  • Query: "How to secure a Kubernetes cluster with RBAC?"
  • Expected: Match keywords ["kubernetes", "k8s", "RBAC"] → Route to [k8s-expert, devops-model]
  • Current: Must run ModernBERT classifier → Classify as "computer science" → Route to general models

This results in:

  • Unnecessary latency (~20-30ms for classification)
  • Less precise routing (category is too broad)
  • Cannot leverage domain knowledge (e.g., "CVE-" always goes to security models)

Describe the solution you'd like

Implement a keyword-based routing system that lets users define deterministic routing rules based on keyword matching.

Core Concept: Define keyword rules in configuration that match query text and return candidate models:

keyword_routing:
  enabled: true
  rules:
    - name: "kubernetes-infrastructure"
      keywords:
        operator: "OR"
        case_sensitive: false
        terms: ["kubernetes", "k8s", "kubectl"]
      candidate_models:
        - "k8s-expert"
        - "devops-model"

Key Features:

  1. Fast Deterministic Matching: Keyword matching cost grows linearly with the number of configured terms (~1-2ms), versus a full ML inference pass per query (~20-30ms)
  2. Boolean Logic: Support AND/OR operators for complex keyword combinations
  3. Case Sensitivity Control: Match case-sensitive terms (e.g., "CVE-2024-1234") or case-insensitive (e.g., "kubernetes")
  4. Priority-Based Rules: Execute rules in priority order for overlapping keywords
  5. Multiple Rule Matching: Support matching multiple rules and combining their candidate models

Architecture Overview

flowchart TD
    A[User Query] --> B[KeywordMatcher]
    B --> C{Parse Rules}
    C --> D[Rule 1: Check Keywords]
    C --> E[Rule 2: Check Keywords]
    C --> F[Rule N: Check Keywords]
    
    D --> G{Match?}
    E --> H{Match?}
    F --> I{Match?}
    
    G -->|Yes| J[Candidate Models A]
    H -->|Yes| K[Candidate Models B]
    I -->|Yes| L[Candidate Models N]
    
    G -->|No| M[Skip]
    H -->|No| M
    I -->|No| M
    
    J --> N[Combine All Candidates]
    K --> N
    L --> N
    
    N --> O[Deduplicate Models]
    O --> P[Return Candidate Models]
    
    style A fill:#e1f5ff
    style P fill:#c8e6c9
    style B fill:#fff9c4

Configuration Schema

keyword_routing:
  enabled: true  # Enable/disable keyword routing
  
  rules:
    # Rule 1: Kubernetes Infrastructure
    - name: "kubernetes-infrastructure"
      description: "Route Kubernetes-related queries to infrastructure models"
      keywords:
        operator: "OR"  # Match if ANY keyword is present
        case_sensitive: false
        terms:
          - "kubernetes"
          - "k8s"
          - "kubectl"
          - "helm"
          - "pod"
          - "deployment"
          - "service"
          - "ingress"
      candidate_models:
        - "k8s-expert"
        - "devops-model"
        - "cloud-native-model"
      priority: 100  # Higher priority rules are checked first
    
    # Rule 2: Database Operations
    - name: "database-operations"
      description: "Route database queries to database specialists"
      keywords:
        operator: "AND"  # Match if ALL keywords are present
        case_sensitive: false
        terms:
          - "database"
          - "query"
      candidate_models:
        - "database-expert"
        - "sql-specialist"
      priority: 90
    
    # Rule 3: Security Critical (Case-Sensitive)
    - name: "security-critical"
      description: "Route security-related queries with CVE IDs"
      keywords:
        operator: "OR"
        case_sensitive: true  # Case-sensitive for CVE IDs; applies to every term in this rule, so "Vulnerability" would not match
        terms:
          - "CVE-"
          - "vulnerability"
          - "exploit"
      candidate_models:
        - "security-hardened-model"
        - "compliance-model"
      priority: 95
    
    # Rule 4: Python Programming
    - name: "python-programming"
      keywords:
        operator: "OR"
        case_sensitive: false
        terms:
          - "python"
          - "pip"
          - "django"
          - "flask"
          - "pandas"
      candidate_models:
        - "python-expert"
        - "general-coding-model"
      priority: 80

Implementation Components

1. KeywordMatcher (pkg/utils/keyword/matcher.go)

package keyword

import "strings"

type KeywordMatcher struct {
    rules []KeywordRule
}

type KeywordRule struct {
    Name            string
    Description     string
    Keywords        KeywordSet
    CandidateModels []string
    Priority        int
}

type KeywordSet struct {
    Operator      string   // "AND" | "OR"
    CaseSensitive bool
    Terms         []string
}

type KeywordMatchResult struct {
    RuleName        string
    MatchedKeywords []string
    CandidateModels []string
    Priority        int
}

// NewKeywordMatcher creates a new keyword matcher from config
func NewKeywordMatcher(config KeywordRoutingConfig) *KeywordMatcher {
    // Sort rules by priority (descending)
    rules := sortRulesByPriority(config.Rules)
    return &KeywordMatcher{rules: rules}
}

// MatchQuery matches the query against all rules and returns matched results
func (m *KeywordMatcher) MatchQuery(query string) []KeywordMatchResult {
    results := []KeywordMatchResult{}
    
    for _, rule := range m.rules {
        if matched, keywords := m.matchRule(query, rule); matched {
            results = append(results, KeywordMatchResult{
                RuleName:        rule.Name,
                MatchedKeywords: keywords,
                CandidateModels: rule.CandidateModels,
                Priority:        rule.Priority,
            })
        }
    }
    
    return results
}

// matchRule checks if a query matches a specific rule
func (m *KeywordMatcher) matchRule(query string, rule KeywordRule) (bool, []string) {
    queryText := query
    if !rule.Keywords.CaseSensitive {
        queryText = strings.ToLower(query)
    }
    
    matchedKeywords := []string{}
    
    for _, term := range rule.Keywords.Terms {
        keyword := term
        if !rule.Keywords.CaseSensitive {
            keyword = strings.ToLower(term)
        }
        
        if strings.Contains(queryText, keyword) {
            matchedKeywords = append(matchedKeywords, term)
        }
    }
    
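    // Apply operator logic
    if rule.Keywords.Operator == "AND" {
        // All keywords must match. An empty term list never matches;
        // without this guard 0 == 0 would make the rule match every query.
        allMatched := len(matchedKeywords) == len(rule.Keywords.Terms)
        return len(rule.Keywords.Terms) > 0 && allMatched, matchedKeywords
    }
    // At least one keyword must match (OR)
    return len(matchedKeywords) > 0, matchedKeywords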
}

// GetCandidateModels returns deduplicated candidate models from all matched rules.
// Models are returned in first-seen order, so the order of results is preserved;
// ranging over a map here would return them in a random order on every call.
func (m *KeywordMatcher) GetCandidateModels(results []KeywordMatchResult) []string {
    seen := make(map[string]bool)
    models := []string{}
    
    for _, result := range results {
        for _, model := range result.CandidateModels {
            if !seen[model] {
                seen[model] = true
                models = append(models, model)
            }
        }
    }
    
    return models
}
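
NewKeywordMatcher calls sortRulesByPriority, which the listing does not define. A minimal sketch of what it could look like, assuming ties should keep their configured order (the struct is trimmed to the fields the sort needs, and the tie-breaking choice is an assumption, not something the proposal specifies):

```go
package main

import (
	"fmt"
	"sort"
)

// KeywordRule is trimmed to the fields the sort needs.
type KeywordRule struct {
	Name     string
	Priority int
}

// sortRulesByPriority returns a copy of rules ordered by descending priority.
// sort.SliceStable keeps rules with equal priority in their configured order,
// and copying first leaves the caller's slice untouched.
func sortRulesByPriority(rules []KeywordRule) []KeywordRule {
	sorted := make([]KeywordRule, len(rules))
	copy(sorted, rules)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})
	return sorted
}

func main() {
	rules := []KeywordRule{
		{Name: "python-programming", Priority: 80},
		{Name: "kubernetes-infrastructure", Priority: 100},
		{Name: "security-critical", Priority: 95},
	}
	for _, r := range sortRulesByPriority(rules) {
		fmt.Println(r.Priority, r.Name)
	}
}
```

A stable sort matters here because GetCandidateModels and the logs follow rule order, so two rules with the same priority should not swap places between restarts.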

2. Configuration Extension (pkg/config/config.go)

type RouterConfig struct {
    // ... existing fields ...
    KeywordRouting KeywordRoutingConfig `yaml:"keyword_routing"`
}

type KeywordRoutingConfig struct {
    Enabled bool          `yaml:"enabled"`
    Rules   []KeywordRule `yaml:"rules"`
}

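// NOTE: KeywordRule and KeywordSet are also declared in pkg/utils/keyword (component 1).
// Only one package should own these types, with the other importing them.
type KeywordRule struct {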
    Name            string       `yaml:"name"`
    Description     string       `yaml:"description"`
    Keywords        KeywordSet   `yaml:"keywords"`
    CandidateModels []string     `yaml:"candidate_models"`
    Priority        int          `yaml:"priority"`
}

type KeywordSet struct {
    Operator      string   `yaml:"operator"`       // "AND" | "OR"
    CaseSensitive bool     `yaml:"case_sensitive"`
    Terms         []string `yaml:"terms"`
}
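
Because routing becomes configuration-driven, it may help to reject malformed rules at load time. The following is a sketch only: validateRule is a hypothetical helper that is not part of the proposal, and the specific checks are assumptions about what a valid rule should look like.

```go
package main

import (
	"errors"
	"fmt"
)

type KeywordSet struct {
	Operator      string // "AND" | "OR"
	CaseSensitive bool
	Terms         []string
}

type KeywordRule struct {
	Name            string
	Keywords        KeywordSet
	CandidateModels []string
	Priority        int
}

// validateRule (hypothetical) reports the first problem found in a rule.
func validateRule(rule KeywordRule) error {
	if rule.Name == "" {
		return errors.New("rule name must not be empty")
	}
	if op := rule.Keywords.Operator; op != "AND" && op != "OR" {
		return fmt.Errorf("rule %q: operator must be AND or OR, got %q", rule.Name, op)
	}
	if len(rule.Keywords.Terms) == 0 {
		return fmt.Errorf("rule %q: at least one term is required", rule.Name)
	}
	for _, term := range rule.Keywords.Terms {
		// strings.Contains(s, "") is always true, so an empty term
		// would match every query.
		if term == "" {
			return fmt.Errorf("rule %q: terms must not be empty strings", rule.Name)
		}
	}
	if len(rule.CandidateModels) == 0 {
		return fmt.Errorf("rule %q: at least one candidate model is required", rule.Name)
	}
	return nil
}

func main() {
	bad := KeywordRule{
		Name:            "database-operations",
		Keywords:        KeywordSet{Operator: "XOR", Terms: []string{"database"}},
		CandidateModels: []string{"database-expert"},
	}
	fmt.Println(validateRule(bad))
}
```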

3. Integration with Router (pkg/extproc/request_handler.go)

func (r *OpenAIRouter) handleModelRouting(...) (*ext_proc.ProcessingResponse, error) {
    // ... existing code ...
    
    if originalModel == "auto" {
        var selectedModel string
        
        // Use keyword routing if enabled
        if r.Config.KeywordRouting.Enabled {
            matchResults := r.KeywordMatcher.MatchQuery(userContent)
            
            if len(matchResults) > 0 {
                // Log matched rules
                for _, result := range matchResults {
                    observability.Infof("Keyword match: rule=%s, keywords=%v, priority=%d, models=%v",
                        result.RuleName, result.MatchedKeywords, result.Priority, result.CandidateModels)
                }
                
                // Get candidate models
                candidates := r.KeywordMatcher.GetCandidateModels(matchResults)
                
                // Select best model from candidates using category scores
                selectedModel = r.Classifier.SelectBestModelFromList(userContent, candidates)
                
                // Record metrics
                metrics.RecordKeywordRouting(matchResults, selectedModel)
            } else {
                // Fallback to category-only routing
                selectedModel = r.classifyAndSelectBestModel(userContent)
            }
        } else {
            // Existing category-only routing
            selectedModel = r.classifyAndSelectBestModel(userContent)
        }
        
        matchedModel = selectedModel
    }
    
    // ... rest of the code ...
}
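
The integration relies on SelectBestModelFromList, which is only named in the proposal. One way the selection step could work, assuming the classifier exposes a per-model score for the detected category, is sketched below. The function name selectBestFromList, the score map, and every score except the 0.95 used elsewhere in this issue are illustrative assumptions.

```go
package main

import "fmt"

// selectBestFromList (hypothetical) returns the candidate with the highest
// score. Candidates missing from scores count as 0, ties keep the earlier
// candidate, and an empty candidate list yields "".
func selectBestFromList(candidates []string, scores map[string]float64) string {
	best := ""
	bestScore := 0.0
	for i, model := range candidates {
		score := scores[model] // zero value when the model has no score
		if i == 0 || score > bestScore {
			best, bestScore = model, score
		}
	}
	return best
}

func main() {
	// Illustrative scores for the "computer science" category.
	scores := map[string]float64{
		"k8s-expert":   0.95,
		"devops-model": 0.80,
	}
	candidates := []string{"devops-model", "k8s-expert", "security-hardened-model"}
	fmt.Println(selectBestFromList(candidates, scores))
}
```

Keeping the earlier candidate on ties means the priority order of the matched rules acts as the tie-breaker, provided the candidate list preserves that order.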

Additional context

Benefits:

  1. Performance: Keyword matching is roughly 10-30x faster than ML inference (an estimated 1-2ms vs 20-30ms)
  2. Deterministic: Predictable routing based on explicit rules
  3. Domain Knowledge: Leverage business logic and domain expertise
  4. Flexible: Support complex boolean logic (AND/OR) and case sensitivity
  5. Backward Compatible: Existing category-only routing continues to work
  6. Easy to Configure: No ML training required, just YAML configuration

Use Cases:

  1. Technology-Specific Routing: Route Kubernetes queries to k8s experts, SQL queries to database specialists
  2. Security-Critical Queries: Route CVE IDs and security terms to hardened models
  3. Compliance Requirements: Route compliance-related keywords to specialized compliance models
  4. Performance Optimization: Pre-filter queries by keywords to reduce ML inference overhead
  5. Multi-Tenant Routing: Route queries based on tenant-specific keywords

Performance Characteristics:

| Operation | Latency | Scalability |
|---|---|---|
| Keyword Matching | ~1-2ms | O(n × m), where n = rules and m = keywords per rule |
| Category Classification | ~20-30ms | One model inference per query |
| Combined (Keyword + Category) | ~21-32ms | Keyword filters, then classify |

Routing Flow Diagram:

sequenceDiagram
    participant User
    participant Router
    participant KM as KeywordMatcher
    participant Classifier

    User->>Router: Query: "How to secure K8s?"
    Router->>KM: MatchQuery(query)

    KM->>KM: Check Rule 1: kubernetes-infrastructure
    KM->>KM: Match "k8s" → ✓
    KM->>KM: Check Rule 2: database-operations
    KM->>KM: No match → ✗
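    KM->>KM: Check Rule 3: security-critical
    KM->>KM: Match "secure" → ✓
    Note over KM: Assumes "secure" has been added to the security-critical terms

    KM-->>Router: Results: [Rule 1, Rule 3]
    Router->>Router: Combine candidates: [k8s-expert, devops-model, cloud-native-model, security-hardened-model, compliance-model]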

    Router->>Classifier: SelectBestModelFromList(query, candidates)
    Classifier->>Classifier: Run category classification
    Classifier->>Classifier: Category: "computer science"
    Classifier->>Classifier: Check scores for candidates
    Classifier-->>Router: Best model: k8s-expert (score: 0.95)

    Router-->>User: Route to: k8s-expert

Testing Requirements:

  1. Unit Tests:

    • AND operator with all keywords present
    • AND operator with missing keywords
    • OR operator with at least one keyword
    • OR operator with no keywords
    • Case-sensitive matching
    • Case-insensitive matching
    • Priority-based rule ordering
    • Multiple rule matching
    • Deduplication of candidate models
  2. Integration Tests:

    • Keyword routing with category fallback
    • Keyword routing with model selection
    • Disabled keyword routing (fallback to category-only)
  3. E2E Tests:

    • Real queries with expected routing decisions
    • Performance benchmarks

Implementation Phases:

Phase 1: Core Implementation (1-2 weeks)

  • Implement KeywordMatcher with AND/OR logic
  • Add configuration structures
  • Add unit tests for keyword matching

Phase 2: Integration (1 week)

  • Integrate with OpenAIRouter
  • Add fallback logic to category-only routing
  • Add integration tests

Phase 3: Observability (1 week)

  • Add metrics for keyword matching
  • Add detailed logging
  • Add performance benchmarks

Related Files:

  • src/semantic-router/pkg/config/config.go - Configuration structures
  • src/semantic-router/pkg/extproc/request_handler.go - Request routing logic
  • src/semantic-router/pkg/utils/keyword/matcher.go - Keyword matcher (NEW)
  • config/config.yaml - Configuration example

Future Enhancements:

  • Support regex patterns for advanced matching
  • Support negative keywords (e.g., "NOT contains X")
  • Support keyword weighting/scoring
  • Support phrase matching (multi-word keywords)
  • Support stemming and lemmatization for better matching
  • Add keyword synonym support
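
One reason regex support may be worth prioritizing: with plain substring matching, short terms such as "pod" or "pip" also match "podcast" or "pipeline". A sketch of whole-word matching with Go's regexp package follows. compileTerm is a hypothetical helper, and the rule of adding a boundary only where the term starts or ends with a word character is an assumption chosen so that prefix terms like "CVE-" keep working.

```go
package main

import (
	"fmt"
	"regexp"
)

// isWordByte reports whether b is an ASCII letter, digit, or underscore,
// which is the character class that \b uses in Go regular expressions.
func isWordByte(b byte) bool {
	return b == '_' ||
		(b >= '0' && b <= '9') ||
		(b >= 'a' && b <= 'z') ||
		(b >= 'A' && b <= 'Z')
}

// compileTerm (hypothetical) builds a pattern that matches term as a whole
// word. A boundary is added only on sides where the term has a word character.
func compileTerm(term string, caseSensitive bool) (*regexp.Regexp, error) {
	if term == "" {
		return nil, fmt.Errorf("term must not be empty")
	}
	pattern := regexp.QuoteMeta(term) // treat the term as literal text
	if isWordByte(term[0]) {
		pattern = `\b` + pattern
	}
	if isWordByte(term[len(term)-1]) {
		pattern += `\b`
	}
	if !caseSensitive {
		pattern = `(?i)` + pattern
	}
	return regexp.Compile(pattern)
}

func main() {
	pod, err := compileTerm("pod", false)
	if err != nil {
		panic(err)
	}
	for _, q := range []string{"restart the pod", "my favorite podcast"} {
		fmt.Printf("%q -> %v\n", q, pod.MatchString(q))
	}
}
```

The trade-off is that whole-word matching no longer catches plurals such as "pods", so it would need to be paired with explicit plural terms or the stemming item above.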

Observability & Metrics:

# New metrics for keyword routing
keyword_routing_matches_total{rule_name, matched}  # Counter of rule matches
keyword_routing_duration_seconds{rule_name}        # Histogram of matching latency
keyword_routing_candidates_count{rule_name}        # Histogram of candidate counts
keyword_routing_fallback_total                     # Counter of fallbacks to category-only

Example Logging (illustrative; the security-critical match assumes "secure" has been added to that rule's terms):

[INFO] Keyword routing enabled with 4 rules
[INFO] Query: "How to secure a Kubernetes cluster with RBAC?"
[INFO] Keyword match: rule=kubernetes-infrastructure, keywords=[kubernetes], priority=100, models=[k8s-expert, devops-model, cloud-native-model]
[INFO] Keyword match: rule=security-critical, keywords=[secure], priority=95, models=[security-hardened-model, compliance-model]
[INFO] Combined candidates: [k8s-expert, devops-model, cloud-native-model, security-hardened-model, compliance-model]
[INFO] Category classification: computer science (confidence: 0.85)
[INFO] Best model from candidates: k8s-expert (score: 0.95)
[INFO] Final selection: k8s-expert
[INFO] Keyword routing latency: 1.2ms
