Commit 54b0ca1
feat(router): add LLM routing with cost optimization and pretrained configs
Extend SemanticRouter to support LLM model selection by adding optional
model, confidence, cost optimization, and multi-match capabilities to
the existing routing infrastructure.
When a Route includes a `model` field, the router returns the LiteLLM-
compatible model identifier alongside the match, with a confidence score
derived from vector distance. Cost-optimized routing biases toward
cheaper models when semantic distances are close, using a configurable
cost_weight penalty.
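
The cost-biased selection described above can be illustrated with a minimal sketch. It is not taken from the commit: the `Candidate` type, the linear distance-to-confidence mapping, the assumed distance range of 0 to 2, and the additive `distance + cost_weight * cost` score are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical stand-in for a matched route (not the library's type)."""
    name: str
    model: str
    distance: float  # vector distance; lower means semantically closer
    cost: float      # assumed normalized relative cost in [0, 1]


def confidence_from_distance(distance: float, max_distance: float = 2.0) -> float:
    """Map a distance to a confidence in [0, 1].

    Assumes a linear mapping over [0, max_distance]; the real router may
    derive confidence differently.
    """
    clipped = min(max(distance, 0.0), max_distance)
    return 1.0 - clipped / max_distance


def pick_route(
    candidates: List[Candidate],
    cost_optimization: bool = False,
    cost_weight: float = 0.1,
) -> Optional[Candidate]:
    """Return the candidate with the lowest (optionally cost-penalized) score."""
    if not candidates:
        return None

    def score(c: Candidate) -> float:
        # The penalty only changes the winner when distances are close,
        # because it is bounded by cost_weight.
        penalty = cost_weight * c.cost if cost_optimization else 0.0
        return c.distance + penalty

    return min(candidates, key=score)
```

With two candidates at distances 0.30 (expensive) and 0.32 (cheap), the plain ranking picks the expensive one, while enabling the penalty with `cost_weight=0.1` flips the choice to the cheaper model. A candidate that is much closer still wins regardless of cost.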
Key additions to SemanticRouter:
- Route.model (optional) for LiteLLM model identifiers
- RouteMatch.confidence, .alternatives, .metadata fields
- RoutingConfig.cost_optimization and .cost_weight settings
- RoutingConfig.default_route for fallback when no match found
- from_pretrained() to load routers with pre-computed embeddings
- export_with_embeddings() to serialize routers with vectors
- AsyncSemanticRouter with full async parity
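
The idea behind the export and pretrained-loading pair in the list above is that vectors are computed once, stored next to their reference texts, and reused at load time. The sketch below shows that round trip in general terms only. The function names, the JSON layout, and `toy_embed` are hypothetical and do not reflect the actual `export_with_embeddings()` or `from_pretrained()` signatures.

```python
import json
from typing import Callable, Dict, List


def export_with_vectors(
    routes: Dict[str, List[str]],
    embed: Callable[[str], List[float]],
) -> str:
    """Serialize each route's reference texts together with their vectors."""
    payload = {
        name: [{"text": text, "vector": embed(text)} for text in references]
        for name, references in routes.items()
    }
    return json.dumps(payload)


def load_with_vectors(serialized: str) -> Dict[str, List[dict]]:
    """Load routes and stored vectors; no embedding model is called here."""
    return json.loads(serialized)


def toy_embed(text: str) -> List[float]:
    """Placeholder embedding (length and space count), for illustration only."""
    return [float(len(text)), float(text.count(" "))]
```

Because loading never calls the embedding function, a router built this way can start without the embedding model being available, which is presumably the motivation for shipping pre-computed vectors.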
A built-in "default" pretrained config ships with 3 tiers (simple,
standard, expert) mapped to GPT-4.1 Nano, Claude Sonnet 4.5, and
Claude Opus 4.5, using pre-computed sentence-transformers embeddings.
Backward compatibility:
- LLMRouter/AsyncLLMRouter provided as deprecated wrappers
- ModelTier subclass enforces required model field
- Legacy field names (tiers/default_tier) mapped bidirectionally
- Existing SemanticRouter usage is fully unaffected
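
A common way to provide this kind of compatibility layer is a thin subclass that emits a `DeprecationWarning` and translates legacy field names in both directions. The sketch below is illustrative only: `Router` and `LegacyRouter` are hypothetical stand-ins, and the current-side name `routes` is an assumption (the commit names only `tiers`, `default_tier`, and `default_route`). The flat keyword layout is also a simplification.

```python
import warnings
from typing import Any, Dict

# Assumed mapping; "routes" is a guess at the current name for "tiers".
LEGACY_TO_CURRENT = {"tiers": "routes", "default_tier": "default_route"}
CURRENT_TO_LEGACY = {new: old for old, new in LEGACY_TO_CURRENT.items()}


def rename_keys(config: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of config with keys renamed per mapping; others unchanged."""
    return {mapping.get(key, key): value for key, value in config.items()}


class Router:
    """Hypothetical current class that stores its configuration as a dict."""

    def __init__(self, **config: Any) -> None:
        self.config = config


class LegacyRouter(Router):
    """Hypothetical deprecated wrapper that accepts the old field names."""

    def __init__(self, **config: Any) -> None:
        warnings.warn(
            "LegacyRouter is deprecated; use Router instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        super().__init__(**rename_keys(config, LEGACY_TO_CURRENT))

    def to_legacy_dict(self) -> Dict[str, Any]:
        """Serialize using the legacy field names (the reverse mapping)."""
        return rename_keys(self.config, CURRENT_TO_LEGACY)
```

Keeping the translation in one pair of dictionaries means reads and writes cannot drift apart, and code that uses `Router` directly never touches the legacy path.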
Includes integration tests, unit tests for schema validation,
a user guide notebook, and a pretrained config generation script.
1 parent 5601ef0 commit 54b0ca1
File tree
14 files changed (+44549, -48 lines)
- docs/user_guide
- redisvl
- extensions/router
- pretrained
- query
- utils
- schemas
- scripts
- tests
- integration
- unit