Overview
This issue tracks the implementation of missing E2E test cases to achieve comprehensive coverage of semantic-router's core functionality. The current E2E test suite covers basic functionality, PII detection, jailbreak detection, domain classification, and semantic cache, but lacks coverage for several critical routing strategies and filters.
Scope
1️⃣ Routing Strategies (High Priority ⭐⭐⭐⭐⭐)
1.1 Keyword Routing
File: src/semantic-router/pkg/classification/keyword_classifier.go
Test Coverage Needed:
- OR operator: any keyword matches
- AND operator: all keywords must match
- NOR operator: no keywords match
- Case-sensitive vs case-insensitive matching
- Regex pattern matching
- Word boundary detection
- Priority over embedding and intent-based routing
Example Test Data:
{
  "test_cases": [
    {
      "description": "OR operator - urgent request",
      "query": "I need urgent help with my account",
      "expected_category": "urgent_request",
      "expected_confidence": 1.0,
      "matched_keywords": ["urgent"]
    },
    {
      "description": "AND operator - sensitive data",
      "query": "My SSN and credit card were stolen",
      "expected_category": "sensitive_data",
      "expected_confidence": 1.0,
      "matched_keywords": ["SSN", "credit card"]
    }
  ]
}
Reference Config: `config/intelligent-routing/in-tree/keyword.yaml`
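The OR/AND/NOR semantics listed above can be pinned down with a small reference oracle that a test compares the router's answer against. This is a minimal sketch, not the logic in `keyword_classifier.go`: the `KeywordRule` type and its `Matches` method are hypothetical names, and it uses plain substring matching (regex patterns and word boundaries are left out).

```go
package main

import "strings"

// KeywordRule is a hypothetical rule shape used only for this sketch.
type KeywordRule struct {
	Category      string
	Operator      string // "OR", "AND", or "NOR"
	Keywords      []string
	CaseSensitive bool
}

// Matches reports whether query satisfies the rule and which keywords were found.
// Matching is by substring; regex and word-boundary handling are not modeled.
func (r KeywordRule) Matches(query string) (bool, []string) {
	text := query
	if !r.CaseSensitive {
		text = strings.ToLower(text)
	}
	var found []string
	for _, kw := range r.Keywords {
		needle := kw
		if !r.CaseSensitive {
			needle = strings.ToLower(kw)
		}
		if strings.Contains(text, needle) {
			found = append(found, kw)
		}
	}
	switch r.Operator {
	case "OR":
		return len(found) > 0, found
	case "AND":
		return len(r.Keywords) > 0 && len(found) == len(r.Keywords), found
	case "NOR":
		return len(found) == 0, found
	default:
		// Unknown operators never match.
		return false, nil
	}
}
```

With the example data above, an OR rule over `urgent` matches the first query and an AND rule over `SSN` and `credit card` matches the second.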
1.2 Embedding Routing
File: src/semantic-router/pkg/classification/embedding_classifier.go
Test Coverage Needed:
- Semantic similarity matching with embeddings
- Mean vs Max aggregation methods
- Similarity threshold validation
- Embedding model selection (auto, qwen3, gemma, bert)
- Matryoshka dimensions (768, 512, 256, 128)
- Quality vs Latency priority in auto mode
Example Test Data:
{
  "test_cases": [
    {
      "description": "Mean aggregation - technical query",
      "query": "How to implement async/await in Python?",
      "keywords": ["programming", "coding", "software"],
      "aggregation_method": "mean",
      "threshold": 0.7,
      "expected_category": "technical",
      "expected_similarity": 0.85
    },
    {
      "description": "Max aggregation - high similarity",
      "query": "What is machine learning?",
      "keywords": ["AI", "ML", "neural networks"],
      "aggregation_method": "max",
      "threshold": 0.8,
      "expected_category": "ai"
    }
  ]
}
Reference Config: `config/intelligent-routing/in-tree/embedding.yaml`
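To make the mean vs max distinction concrete, the sketch below scores a query embedding against each keyword embedding with cosine similarity and then aggregates. It assumes embeddings are already available as `[]float64`; `AggregateSimilarity` is a hypothetical helper and may not match how `embedding_classifier.go` is structured.

```go
package main

import (
	"errors"
	"math"
)

// cosine returns the cosine similarity of two equal-length, non-zero vectors.
func cosine(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("vectors must be non-empty and of equal length")
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0, errors.New("zero vector has no direction")
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb)), nil
}

// AggregateSimilarity combines per-keyword similarities using "mean" or "max".
func AggregateSimilarity(query []float64, keywords [][]float64, method string) (float64, error) {
	if len(keywords) == 0 {
		return 0, errors.New("no keyword embeddings")
	}
	sum := 0.0
	best := math.Inf(-1)
	for _, kw := range keywords {
		s, err := cosine(query, kw)
		if err != nil {
			return 0, err
		}
		sum += s
		best = math.Max(best, s)
	}
	switch method {
	case "mean":
		return sum / float64(len(keywords)), nil
	case "max":
		return best, nil
	default:
		return 0, errors.New("unknown aggregation method: " + method)
	}
}
```

A test can then compare the aggregated score against the configured threshold: max aggregation passes if any single keyword is close to the query, while mean aggregation requires the keyword set as a whole to be close.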
1.3 MCP Routing (Model Context Protocol)
File: src/semantic-router/pkg/classification/mcp_classifier.go
Test Coverage Needed:
- MCP Stdio transport (process communication)
- MCP HTTP transport (API calls)
- Custom classification logic via external MCP servers
- Model and reasoning decision from MCP response
- Fallback to in-tree classifier on MCP failure
- Probability distribution with `with_probabilities` parameter
Example Test Data:
{
  "test_cases": [
    {
      "description": "MCP stdio - regex classifier",
      "mcp_server": "server_keyword.py",
      "transport": "stdio",
      "query": "urgent: fix production bug",
      "expected_category": "urgent",
      "expected_model": "gpt-oss",
      "expected_use_reasoning": true
    },
    {
      "description": "MCP HTTP - embedding classifier",
      "mcp_server": "http://localhost:8080",
      "transport": "http",
      "query": "Explain quantum computing",
      "expected_category": "science"
    }
  ]
}
Reference Servers: `examples/mcp-classifier-server/`
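The fallback item in the list above (use the in-tree classifier when the MCP server errors or times out) can be exercised without a real MCP server. The sketch below stands in for both classifiers with plain functions; `ClassifyWithFallback` and `ErrMCPUnavailable` are hypothetical names, and the real router's fallback path may be wired differently.

```go
package main

import (
	"errors"
	"time"
)

// ErrMCPUnavailable is a hypothetical sentinel error for simulating a failing server.
var ErrMCPUnavailable = errors.New("mcp server unavailable")

// ClassifyWithFallback calls primary and returns its category if it answers
// without error before the timeout. Otherwise it returns fallback's category.
// The boolean reports whether the fallback was used.
func ClassifyWithFallback(
	query string,
	timeout time.Duration,
	primary func(string) (string, error),
	fallback func(string) string,
) (string, bool) {
	type result struct {
		category string
		err      error
	}
	// Buffered so a late primary call can finish without blocking forever.
	done := make(chan result, 1)
	go func() {
		c, err := primary(query)
		done <- result{c, err}
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case r := <-done:
		if r.err == nil {
			return r.category, false
		}
		return fallback(query), true
	case <-timer.C:
		return fallback(query), true
	}
}
```

An E2E test would replace the stub functions with a real MCP client pointed at a stopped or slow server and assert that the response still carries a category from the in-tree classifier.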
1.4 Hybrid Routing
Test Coverage Needed:
- Priority order: Keyword → Embedding → Intent-based → MCP
- Fallback chain when high-priority methods fail
- Combined strategy with multiple routing methods enabled
- Confidence fusion from multiple classifiers
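The priority order and fallback chain above amount to "first classifier that matches wins". A minimal sketch of that behavior follows, assuming each classifier can be reduced to a function that either returns a category or declines; `NamedClassifier` and `RouteByPriority` are hypothetical names, and confidence fusion is not modeled.

```go
package main

// NamedClassifier pairs a routing method name with a classification function.
// Classify returns ok=false when the method has no match for the query.
type NamedClassifier struct {
	Method   string
	Classify func(query string) (category string, ok bool)
}

// RouteByPriority walks the chain in order and returns the first match,
// along with the name of the method that produced it.
func RouteByPriority(query string, chain []NamedClassifier) (category, method string, ok bool) {
	for _, c := range chain {
		if cat, matched := c.Classify(query); matched {
			return cat, c.Method, true
		}
	}
	return "", "", false
}
```

Ordering the chain as keyword, embedding, intent-based, MCP reproduces the priority list; a test asserts on both the category and the reported method, as in the `expected_method` field of the example data.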
Example Test Data:
{
  "test_cases": [
    {
      "description": "Keyword takes priority over embedding",
      "query": "urgent: machine learning question",
      "keyword_match": "urgent_request",
      "embedding_match": "ai",
      "expected_category": "urgent_request",
      "expected_method": "keyword"
    },
    {
      "description": "Fallback to embedding when keyword fails",
      "query": "What is deep learning?",
      "keyword_match": null,
      "embedding_match": "ai",
      "expected_category": "ai",
      "expected_method": "embedding"
    }
  ]
}
1.5 Entropy-Based Routing
File: src/semantic-router/pkg/utils/entropy/entropy.go
Test Coverage Needed:
- Shannon entropy and normalized entropy calculation
- Uncertainty levels: very_high, high, medium, low, very_low
- Reasoning decision based on entropy
- Weighted decision for high uncertainty (top-2 categories)
- Confidence adjustment based on uncertainty
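For reference, Shannon entropy is H(p) = -Σ pᵢ log₂ pᵢ, and normalized entropy divides H by log₂(n) so the result lies in [0, 1]. The sketch below computes both and maps the normalized value to the five uncertainty levels. The cut-offs (0.8, 0.6, 0.4, 0.2) are illustrative assumptions; check `entropy.go` for the values the router actually uses.

```go
package main

import "math"

// ShannonEntropy returns the entropy of probs in bits.
// Zero probabilities contribute nothing to the sum.
func ShannonEntropy(probs []float64) float64 {
	h := 0.0
	for _, p := range probs {
		if p > 0 {
			h -= p * math.Log2(p)
		}
	}
	return h
}

// NormalizedEntropy scales entropy by its maximum, log2(n), giving a value in [0, 1].
func NormalizedEntropy(probs []float64) float64 {
	if len(probs) < 2 {
		return 0
	}
	return ShannonEntropy(probs) / math.Log2(float64(len(probs)))
}

// UncertaintyLevel maps normalized entropy to a label.
// The thresholds here are assumptions made for this sketch.
func UncertaintyLevel(normalized float64) string {
	switch {
	case normalized >= 0.8:
		return "very_high"
	case normalized >= 0.6:
		return "high"
	case normalized >= 0.4:
		return "medium"
	case normalized >= 0.2:
		return "low"
	default:
		return "very_low"
	}
}
```

A uniform distribution over four categories has normalized entropy 1, which lands in `very_high`; a distribution dominated by one category at 0.95 has normalized entropy of roughly 0.18, which lands in `very_low` under these assumed cut-offs.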
Example Test Data:
{
  "test_cases": [
    {
      "description": "Very high entropy - enable reasoning",
      "probabilities": [0.25, 0.25, 0.25, 0.25],
      "expected_uncertainty": "very_high",
      "expected_use_reasoning": true,
      "expected_confidence": 0.3
    },
    {
      "description": "Very low entropy - trust classification",
      "probabilities": [0.95, 0.02, 0.02, 0.01],
      "expected_uncertainty": "very_low",
      "expected_use_reasoning": false,
      "expected_confidence": 0.90
    }
  ]
}
2️⃣ Filter Tests (High Priority ⭐⭐⭐⭐)
2.1 ReasoningControl Filter
File: src/semantic-router/pkg/extproc/req_filter_reason.go
Test Coverage Needed:
- Enable/disable reasoning with `enableReasoning`
- Reasoning effort levels: low, medium, high
- Reasoning families: gpt-oss, deepseek, qwen3, claude
- `chat_template_kwargs` for different model families
- `reasoning_effort` parameter (OpenAI-style)
- `maxReasoningSteps` limit
Example Config:
filters:
  - type: ReasoningControl
    enabled: true
    config:
      reasonFamily: "gpt-oss"
      enableReasoning: true
      reasoningEffort: "high"
      maxReasoningSteps: 15
2.2 ToolSelection Filter
Test Coverage Needed:
- Top-K tool selection
- Similarity threshold filtering
- Tools database loading from `toolsDBPath`
- Fallback strategy with `fallbackToEmpty`
- Category/tag-based tool filtering
Example Config:
filters:
  - type: ToolSelection
    enabled: true
    config:
      toolsDBPath: "tools.json"
      topK: 3
      similarityThreshold: 0.7
      fallbackToEmpty: false
Reference: `examples/semanticroute/tool-selection-example.yaml`
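The interaction between `topK`, `similarityThreshold`, and `fallbackToEmpty` is easy to get wrong in assertions, so a reference implementation helps. The sketch below assumes similarity scores are already computed and that, when no tool clears the threshold, `fallbackToEmpty: true` yields an empty list while `false` leaves the candidate list unchanged. That fallback reading is an assumption and should be checked against the filter's actual behavior; `ScoredTool` and `SelectTools` are hypothetical names.

```go
package main

import "sort"

// ScoredTool is a tool name with its precomputed similarity to the query.
type ScoredTool struct {
	Name       string
	Similarity float64
}

// SelectTools keeps tools at or above threshold, orders them by descending
// similarity, and caps the result at topK (topK <= 0 means no cap).
func SelectTools(candidates []ScoredTool, topK int, threshold float64, fallbackToEmpty bool) []ScoredTool {
	kept := make([]ScoredTool, 0, len(candidates))
	for _, t := range candidates {
		if t.Similarity >= threshold {
			kept = append(kept, t)
		}
	}
	if len(kept) == 0 {
		if fallbackToEmpty {
			return []ScoredTool{}
		}
		// Assumed behavior: pass the original candidates through untouched.
		return append([]ScoredTool(nil), candidates...)
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].Similarity > kept[j].Similarity
	})
	if topK > 0 && len(kept) > topK {
		kept = kept[:topK]
	}
	return kept
}
```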
2.3 Filter Chain Combination
Test Coverage Needed:
- Multiple filter execution order
- Filter short-circuit (e.g., PIIDetection blocks subsequent filters)
- Filter independence (configs don't interfere)
- Performance impact of multiple filters
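Execution order and short-circuiting can be asserted by recording which filters actually ran. The following sketch models a filter as a function that either blocks or passes a prompt; `Filter` and `RunChain` are hypothetical names for illustration, not the extproc filter interface.

```go
package main

// Filter is a named check that reports whether it blocks the prompt.
type Filter struct {
	Name  string
	Apply func(prompt string) (blocked bool)
}

// RunChain runs filters in order and stops at the first one that blocks.
// It returns the names of the filters that ran and the name of the blocker
// (empty when nothing blocked).
func RunChain(prompt string, chain []Filter) (executed []string, blockedBy string) {
	for _, f := range chain {
		executed = append(executed, f.Name)
		if f.Apply(prompt) {
			return executed, f.Name
		}
	}
	return executed, ""
}
```

A short-circuit test then checks that `executed` ends at the blocking filter, and an ordering test checks that `executed` matches the configured order when nothing blocks.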
Example Chain:
filters:
- type: PIIDetection
- type: PromptGuard
- type: SemanticCache
- type: ReasoningControl
- type: ToolSelection
3️⃣ Cache Tests (Medium Priority ⭐⭐⭐)
3.1 Different Cache Backends
File: src/semantic-router/pkg/cache/
Test Coverage Needed:
- InMemory cache performance
- Milvus cache with vector database
- Hybrid cache (HNSW + Milvus)
- TTL expiration mechanism
- Eviction strategy when `maxEntries` reached
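TTL expiry and eviction at `maxEntries` are easiest to test with an injectable clock, so the test never sleeps. The toy cache below illustrates the two behaviors to assert on. It is not the router's cache: it does exact-key lookup rather than semantic matching, evicts oldest-first (an assumption), and is not safe for concurrent use. `TTLCache` and `FakeClock` are hypothetical names.

```go
package main

import "time"

type cacheEntry struct {
	value    string
	storedAt time.Time
}

// TTLCache is a toy cache with a time-to-live and a size cap.
type TTLCache struct {
	ttl        time.Duration
	maxEntries int
	now        func() time.Time
	order      []string // keys in insertion order, oldest first
	items      map[string]cacheEntry
}

func NewTTLCache(ttl time.Duration, maxEntries int, now func() time.Time) *TTLCache {
	return &TTLCache{ttl: ttl, maxEntries: maxEntries, now: now, items: make(map[string]cacheEntry)}
}

// remove deletes key from both the map and the insertion-order list.
func (c *TTLCache) remove(key string) {
	delete(c.items, key)
	for i, k := range c.order {
		if k == key {
			c.order = append(c.order[:i], c.order[i+1:]...)
			return
		}
	}
}

// Put stores value under key, evicting the oldest entry when the cache is full.
func (c *TTLCache) Put(key, value string) {
	if _, exists := c.items[key]; !exists {
		if c.maxEntries > 0 && len(c.items) >= c.maxEntries {
			c.remove(c.order[0])
		}
		c.order = append(c.order, key)
	}
	c.items[key] = cacheEntry{value: value, storedAt: c.now()}
}

// Get returns the value for key unless it is missing or older than the TTL.
func (c *TTLCache) Get(key string) (string, bool) {
	e, ok := c.items[key]
	if !ok {
		return "", false
	}
	if c.now().Sub(e.storedAt) > c.ttl {
		c.remove(key)
		return "", false
	}
	return e.value, true
}

// FakeClock is a manually advanced clock for deterministic tests.
type FakeClock struct{ t time.Time }

func (f *FakeClock) Now() time.Time          { return f.t }
func (f *FakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }
```

Passing `clock.Now` to `NewTTLCache` and calling `clock.Advance` past the TTL lets a test verify expiry instantly.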
3.2 Different Embedding Models for Cache
Test Coverage Needed:
- BERT (fast, 384-dim)
- Qwen3 (high quality, 1024-dim, 32K context)
- Gemma (balanced, 768-dim, 8K context)
- Matryoshka dimensions impact on cache hit rate
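Matryoshka-style embeddings are designed so that a prefix of the vector is itself a usable embedding. When comparing cache hit rates across dimensions, a test can derive the smaller vectors from the full one by truncating and re-normalizing, as sketched below. `TruncateEmbedding` is a hypothetical helper; whether the router truncates client-side or asks the model for a target dimension is not stated in this issue.

```go
package main

import (
	"errors"
	"math"
)

// TruncateEmbedding keeps the first dim components of v and rescales the
// result to unit length so cosine and dot-product scores stay comparable.
func TruncateEmbedding(v []float64, dim int) ([]float64, error) {
	if dim <= 0 || dim > len(v) {
		return nil, errors.New("dim must be between 1 and len(v)")
	}
	out := make([]float64, dim)
	copy(out, v[:dim])

	var sumSquares float64
	for _, x := range out {
		sumSquares += x * x
	}
	norm := math.Sqrt(sumSquares)
	if norm == 0 {
		// An all-zero prefix cannot be normalized; return it unchanged.
		return out, nil
	}
	for i := range out {
		out[i] /= norm
	}
	return out, nil
}
```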
4️⃣ Performance & Concurrency Tests (Medium Priority ⭐⭐⭐)
4.1 Concurrent Requests
Test Coverage Needed:
- 100 concurrent classification requests
- Thread safety of classifiers
- Resource contention (cache, model loading)
- QPS (queries per second) benchmarking
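A concurrency test needs to fan out requests, collect every result without data races, and time the batch. The sketch below does that with `sync.WaitGroup`, with each goroutine writing only to its own slice index. `RunConcurrent` and `QPS` are hypothetical helpers, and `classify` stands in for whatever call the E2E framework uses to reach the router.

```go
package main

import (
	"sync"
	"time"
)

// RunConcurrent calls classify from n goroutines at once. It returns the
// per-request results, the number of failed requests, and the wall-clock time.
func RunConcurrent(n int, classify func(i int) (string, error)) (results []string, failures int, elapsed time.Duration) {
	results = make([]string, n)
	errs := make([]error, n)

	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine owns index i, so no lock is needed here.
			results[i], errs[i] = classify(i)
		}(i)
	}
	wg.Wait()
	elapsed = time.Since(start)

	for _, err := range errs {
		if err != nil {
			failures++
		}
	}
	return results, failures, elapsed
}

// QPS converts a request count and elapsed time into queries per second.
func QPS(n int, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	return float64(n) / elapsed.Seconds()
}
```

Running the test under `go test -race` is what surfaces thread-safety problems in the classifiers themselves; this helper only guarantees that the harness is race-free.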
4.2 Long Text Handling
Test Coverage Needed:
- 32K context with Qwen3
- 8K context with Gemma
- Token limit handling and truncation
5️⃣ Edge Cases & Error Handling (Low Priority ⭐⭐)
5.1 Configuration Errors
Test Coverage Needed:
- Invalid category mapping
- Invalid threshold values
- Missing `defaultModel`
- Invalid filter configurations
5.2 Network Errors
Test Coverage Needed:
- Model service unavailable
- MCP server timeout
- Milvus connection failure
- Network timeout handling
Implementation Structure
Suggested file organization:
e2e/testcases/
├── keyword_routing.go
├── embedding_routing.go
├── mcp_routing.go
├── hybrid_routing.go
├── entropy_routing.go
├── reasoning_control.go
├── tool_selection.go
├── filter_chain.go
├── cache_backends.go
├── concurrent_requests.go
└── testdata/
    ├── keyword_routing_cases.json
    ├── embedding_routing_cases.json
    ├── mcp_routing_cases.json
    ├── hybrid_routing_cases.json
    ├── entropy_routing_cases.json
    ├── reasoning_control_cases.json
    ├── tool_selection_cases.json
    └── ...
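Keeping cases in `testdata/*.json` means each test file needs a small loader. The sketch below decodes the keyword-routing example format shown earlier in this issue and rejects unknown fields, so a typo in a JSON key fails loudly instead of silently producing an empty value. `KeywordCase` and `ParseKeywordCases` are hypothetical names; the existing E2E framework may already provide its own fixture helpers.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// KeywordCase mirrors one entry of the keyword routing example data.
type KeywordCase struct {
	Description        string   `json:"description"`
	Query              string   `json:"query"`
	ExpectedCategory   string   `json:"expected_category"`
	ExpectedConfidence float64  `json:"expected_confidence"`
	MatchedKeywords    []string `json:"matched_keywords"`
}

// keywordCaseFile is the top-level JSON object wrapping the case list.
type keywordCaseFile struct {
	TestCases []KeywordCase `json:"test_cases"`
}

// ParseKeywordCases decodes raw JSON into test cases, failing on unknown
// fields and on files that contain no cases.
func ParseKeywordCases(data []byte) ([]KeywordCase, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()

	var file keywordCaseFile
	if err := dec.Decode(&file); err != nil {
		return nil, fmt.Errorf("decode keyword cases: %w", err)
	}
	if len(file.TestCases) == 0 {
		return nil, fmt.Errorf("no test cases found")
	}
	return file.TestCases, nil
}
```

In a test, the bytes would typically come from `os.ReadFile` on the matching file under `testdata/`, followed by a loop that runs each case as a subtest.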
Acceptance Criteria
- All test cases pass consistently
- Test data is stored in JSON files for maintainability
- Tests follow existing E2E framework patterns
- Documentation is updated with new test coverage
- CI/CD pipeline includes new tests
References
- Existing E2E tests: `e2e/testcases/`
- Keyword routing docs: `website/docs/tutorials/intelligent-route/keyword-routing.md`
- MCP classification docs: `website/docs/tutorials/mcp-classification/overview.md`
- Example configs: `examples/semanticroute/`
- In-tree configs: `config/intelligent-routing/in-tree/`
Contributing
This is a great opportunity for new contributors! Each test case can be implemented independently. Feel free to:
- Pick any test case from the list above
- Comment on this issue to claim it
- Submit a PR with your implementation
For questions or guidance, please comment on this issue or join our community discussions.