Commit 92385f0

project: clean-up and improve docs
Signed-off-by: bitliu <[email protected]>
1 parent 9cd4514 commit 92385f0

3 files changed: +57 -113 lines changed

CONTRIBUTING.md

Lines changed: 25 additions & 16 deletions
@@ -103,6 +103,24 @@ make build
 make test-jailbreak-classifier
 ```
 
+### Manual Testing
+
+Test different routing scenarios:
+
+```bash
+# Test model auto-selection
+make test-prompt
+
+# Test PII detection
+make test-pii
+
+# Test prompt guard (jailbreak detection)
+make test-prompt-guard
+
+# Test tools auto-selection
+make test-tools
+```
+
 ### End-to-End Tests
 
 Ensure both Envoy and the router are running, then:
@@ -121,23 +139,14 @@ python e2e-tests/run_all_tests.py --pattern "0*-*.py"
 python e2e-tests/run_all_tests.py --check-only
 ```
 
-### Manual Testing
+The test suite includes:
 
-Test different routing scenarios:
-
-```bash
-# Test model auto-selection
-make test-prompt
-
-# Test PII detection
-make test-pii
-
-# Test prompt guard (jailbreak detection)
-make test-prompt-guard
-
-# Test tools auto-selection
-make test-tools
-```
++ Basic client request tests
++ Envoy ExtProc interaction tests
++ Router classification tests
++ Semantic cache tests
++ Category-specific tests
++ Metrics validation tests
 
 ## Development Workflow
 
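
Reviewer note on the `--pattern "0*-*.py"` option that appears in this hunk's header: the diff does not show how `run_all_tests.py` interprets the pattern. The sketch below assumes plain shell-style glob matching on file names, which may not match the real script. `select_tests` and every file name except `00-client-request-test.py` are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatchcase


def select_tests(filenames, pattern):
    """Return the names that match a shell-style glob, in sorted order.

    Hypothetical helper: assumes the runner filters by file name only.
    """
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


# Hypothetical listing; only 00-client-request-test.py is named in the commit.
candidates = [
    "00-client-request-test.py",
    "01-hypothetical-extproc-test.py",
    "10-hypothetical-metrics-test.py",
    "run_all_tests.py",
]

selected = select_tests(candidates, "0*-*.py")
print(selected)
```

Under that assumption, the pattern keeps only files whose names start with `0` and contain a hyphen before the `.py` suffix, so the runner itself and any test numbered 10 or higher are skipped.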

OWNER

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-# Root directory owners
+# Root directory Owners
 @rootfs
 @Xunzhuo

README.md

Lines changed: 31 additions & 96 deletions
@@ -15,18 +15,46 @@
 
 ## Overview
 
+```mermaid
+graph TB
+Client[Client Request] --> Router[vLLM Semantic Router]
+
+subgraph "Intent Understanding"
+direction LR
+PII[PII Detector]
+Jailbreak[Jailbreak Guard]
+Category[Category Classifier]
+Cache[Semantic Cache]
+end
+
+Router --> PII
+Router --> Jailbreak
+Router --> Category
+Router --> Cache
+
+PII --> Decision{Security Check}
+Jailbreak --> Decision
+Decision -->|Block| Block[Block Request]
+Decision -->|Pass| Category
+Category --> Models[Route to Specialized Model]
+Cache -->|Hit| FastResponse[Return Cached Response]
+
+Models --> Math[Math Model]
+Models --> Creative[Creative Model]
+Models --> Code[Code Model]
+Models --> General[General Model]
+```
+
 ### Auto-Selection of Models
 
-An **Mixture-of-Models** (MoM) router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on **Semantic Understanding** of the request's intent.
+An **Mixture-of-Models** (MoM) router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on **Semantic Understanding** of the request's intent (Complexity, Task, Tools).
 
 This is achieved using BERT classification. Conceptually similar to Mixture-of-Experts (MoE) which lives *within* a model, this system selects the best *entire model* for the nature of the task.
 
 As such, the overall inference accuracy is improved by using a pool of models that are better suited for different types of tasks:
 
 ![Model Accuracy](./docs/category_accuracies.png)
 
-The detailed design doc can be found [here](https://docs.google.com/document/d/1BwwRxdf74GuCdG1veSApzMRMJhXeUWcw0wH9YRAmgGw/edit?usp=sharing).
-
 The screenshot below shows the LLM Router dashboard in Grafana.
 
 ![LLM Router Dashboard](./docs/grafana_screenshot.png)
@@ -61,96 +89,3 @@ The documentation includes:
 - **[System Architecture](https://llm-semantic-router.readthedocs.io/en/latest/architecture/system-architecture/)** - Technical deep dive
 - **[Model Training](https://llm-semantic-router.readthedocs.io/en/latest/training/training-overview/)** - How classification models work
 - **[API Reference](https://llm-semantic-router.readthedocs.io/en/latest/api/router/)** - Complete API documentation
-
-## Quick Usage
-
-### Prerequisites
-
-- Rust
-- Envoy
-- Huggingface CLI
-
-### Run the Envoy Proxy
-
-This listens for incoming requests and uses the ExtProc filter.
-```bash
-make run-envoy
-```
-
-### Download the models
-
-```bash
-make download-models
-```
-
-### Run the Semantic Router (Go Implementation)
-
-This builds the Rust binding and the Go router, then starts the ExtProc gRPC server that Envoy communicates with.
-```bash
-make run-router
-```
-
-Once both Envoy and the router are running, you can test the routing logic using predefined prompts:
-
-```bash
-# Test the tools auto-selection
-make test-tools
-
-# Test the auto-selection of model
-make test-prompt
-
-# Test the prompt guard
-make test-prompt-guard
-
-# Test the PII detection
-make test-pii
-```
-
-This will send curl requests simulating different types of user prompts (Math, Creative Writing, General) to the Envoy endpoint (`http://localhost:8801`). The router should direct these to the appropriate backend model configured in `config/config.yaml`.
-
-## Testing
-
-A comprehensive test suite is available to validate the functionality of the Semantic Router. The tests follow the data flow through the system, from client request to routing decision.
-
-### Prerequisites
-
-Install test dependencies:
-```bash
-pip install -r tests/requirements.txt
-```
-
-### Running Tests
-
-Make sure both the Envoy proxy and Router are running:
-```bash
-make run-envoy # In one terminal
-make run-router # In another terminal
-```
-### Running e2e Tests
-Run all tests in sequence:
-```bash
-python e2e-tests/run_all_tests.py
-```
-
-Run a specific test:
-```bash
-python e2e-tests/00-client-request-test.py
-```
-
-Run only tests matching a pattern:
-```bash
-python e2e-tests/run_all_tests.py --pattern "0*-*.py"
-```
-
-Check if services are running without running tests:
-```bash
-python e2e-tests/run_all_tests.py --check-only
-```
-
-The test suite includes:
-- Basic client request tests
-- Envoy ExtProc interaction tests
-- Router classification tests
-- Semantic cache tests
-- Category-specific tests
-- Metrics validation tests