Skip to content

Commit e819766

Browse files
yossiovadiaclaude
andcommitted
Add revived e2e tests and updates
- Add new e2e test files: jailbreak, pii-policy, tools, model-selection, metrics, error-handling tests - Update existing e2e tests: client-request, envoy-extproc, router-classification, cache tests - Add CLAUDE.md with project documentation and instructions 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
1 parent 14cb752 commit e819766

11 files changed

+3159
-15
lines changed

CLAUDE.md

Lines changed: 191 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,191 @@
1+
# CLAUDE.md
2+
3+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4+
5+
## Project Overview
6+
7+
vLLM Semantic Router is a intelligent routing system that uses BERT-based semantic classification to select the optimal model for LLM requests. The system consists of a Rust library for ML inference and a Go service implementing the Envoy ExtProc interface for request routing.
8+
9+
## Build Commands
10+
11+
### Essential Build Commands
12+
```bash
13+
# Build everything (Rust library + Go router)
14+
make build
15+
16+
# Build only Rust library (candle-binding)
17+
make rust
18+
19+
# Build only Go router
20+
make build-router
21+
22+
# Download required models from Hugging Face
23+
make download-models
24+
25+
# Clean all build artifacts
26+
make clean
27+
```
28+
29+
### Running the System
30+
```bash
31+
# Run the semantic router (requires models downloaded)
32+
make run-router
33+
34+
# Run Envoy proxy (separate terminal)
35+
make run-envoy
36+
37+
# Use custom config file
38+
CONFIG_FILE=custom.yaml make run-router
39+
```
40+
41+
## Testing Commands
42+
43+
### Core Testing
44+
```bash
45+
# Run all tests (includes go vet, go mod tidy checks, and unit tests)
46+
make test
47+
48+
# Test individual components
49+
make test-binding # Test Rust bindings
50+
make test-semantic-router # Test Go router
51+
make test-category-classifier # Test category classification
52+
make test-pii-classifier # Test PII detection
53+
make test-jailbreak-classifier # Test jailbreak detection
54+
```
55+
56+
### Manual Testing (requires services running)
57+
```bash
58+
# Test different routing scenarios
59+
make test-auto-prompt-reasoning # Test reasoning mode
60+
make test-auto-prompt-no-reasoning # Test normal mode
61+
make test-pii # Test PII detection
62+
make test-prompt-guard # Test jailbreak detection
63+
make test-tools # Test tool auto-selection
64+
```
65+
66+
### Milvus Cache Testing
67+
```bash
68+
# Start Milvus container
69+
make start-milvus
70+
71+
# Test with Milvus backend
72+
make test-milvus-cache
73+
make test-semantic-router-milvus
74+
75+
# Stop Milvus when done
76+
make stop-milvus
77+
```
78+
79+
### End-to-End Testing
80+
```bash
81+
# Start services first
82+
make run-envoy &
83+
make run-router &
84+
85+
# Run comprehensive e2e tests
86+
python e2e-tests/run_all_tests.py
87+
88+
# Run specific tests
89+
python e2e-tests/00-client-request-test.py
90+
```
91+
92+
## Code Quality
93+
94+
### Pre-commit Hooks
95+
```bash
96+
# Install pre-commit hooks (mandatory for contributions)
97+
pip install pre-commit
98+
pre-commit install
99+
100+
# Run all pre-commit checks
101+
pre-commit run --all-files
102+
```
103+
104+
### Go Module Management
105+
```bash
106+
# Keep Go modules tidy (checked by CI)
107+
cd candle-binding && go mod tidy
108+
cd src/semantic-router && go mod tidy
109+
```
110+
111+
## Architecture
112+
113+
### High-Level Components
114+
- **Candle Binding**: Rust library using the [candle](https://github.com/huggingface/candle) ML framework for BERT-based classification
115+
- **Semantic Router**: Go service implementing Envoy ExtProc interface for intelligent request routing
116+
- **Configuration**: YAML-based configuration for models, endpoints, and routing rules
117+
118+
### Core Classification Models
119+
- **Category Classifier**: Routes requests to appropriate models based on content domain (math, science, law, etc.)
120+
- **PII Classifier**: Detects and blocks personally identifiable information
121+
- **Jailbreak Classifier**: Identifies and blocks prompt injection attempts
122+
123+
### Semantic Caching
124+
- **Memory Backend**: Fast in-memory cache for development
125+
- **Milvus Backend**: Scalable vector database for production deployments
126+
127+
### Directory Structure
128+
```
129+
├── candle-binding/ # Rust ML library with BERT classification
130+
├── src/semantic-router/ # Go router service (Envoy ExtProc)
131+
├── src/training/ # Model training and fine-tuning scripts
132+
├── config/ # Configuration files (config.yaml, etc.)
133+
├── e2e-tests/ # End-to-end test suite
134+
├── models/ # Downloaded classification models
135+
└── website/ # Documentation website
136+
```
137+
138+
### Key Configuration Files
139+
- `config/config.yaml`: Main configuration for models, endpoints, and routing rules
140+
- `config/tools_db.json`: Tool selection database
141+
- `config/cache/milvus.yaml`: Milvus vector database configuration
142+
143+
## Development Environment Setup
144+
145+
### Prerequisites
146+
- Rust (latest stable)
147+
- Go 1.24.1+
148+
- Hugging Face CLI (`pip install huggingface_hub`)
149+
- Make
150+
- Python 3.8+ (for training and e2e tests)
151+
152+
### Initial Setup
153+
```bash
154+
# Clone and download models
155+
git clone https://github.com/vllm-project/semantic-router.git
156+
cd semantic-router
157+
make download-models
158+
159+
# Install Python dependencies (optional)
160+
pip install -r requirements.txt
161+
pip install -r e2e-tests/requirements.txt
162+
```
163+
164+
## Documentation
165+
166+
### Documentation Development
167+
```bash
168+
# Start documentation dev server
169+
make docs-dev
170+
171+
# Build documentation for production
172+
make docs-build
173+
174+
# Lint documentation
175+
make docs-lint
176+
```
177+
178+
## Environment Variables
179+
180+
- `LD_LIBRARY_PATH`: Must include `${PWD}/candle-binding/target/release` for Rust library loading
181+
- `CONFIG_FILE`: Path to configuration file (default: `config/config.yaml`)
182+
- `CONTAINER_RUNTIME`: Container runtime for Milvus (`docker` or `podman`)
183+
- `VLLM_ENDPOINT`: vLLM endpoint URL for testing
184+
- `SKIP_MILVUS_TESTS`: Skip Milvus-dependent tests (default: `true`)
185+
186+
## Important Notes
187+
188+
- Always run `make download-models` before first build
189+
- The system requires both Envoy and the router to be running for end-to-end functionality
190+
- Use `make test` before submitting changes to ensure all quality checks pass
191+
- For production deployments, consider using Milvus backend for semantic caching

e2e-tests/00-client-request-test.py

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -14,14 +14,13 @@
1414

1515
import requests
1616

17-
# Add parent directory to path to allow importing common test utilities
18-
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
19-
from tests.test_base import SemanticRouterTestBase
17+
# Import test base from same directory
18+
from test_base import SemanticRouterTestBase
2019

2120
# Constants
2221
ENVOY_URL = "http://localhost:8801"
2322
OPENAI_ENDPOINT = "/v1/chat/completions"
24-
DEFAULT_MODEL = "qwen2.5:32b" # Changed to match other tests
23+
DEFAULT_MODEL = "gemma3:27b" # Use configured model
2524
MAX_RETRIES = 3
2625
RETRY_DELAY = 2
2726

e2e-tests/01-envoy-extproc-test.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,18 +10,18 @@
1010
import json
1111
import os
1212
import sys
13+
import unittest
1314
import uuid
1415

1516
import requests
1617

17-
# Add parent directory to path to allow importing common test utilities
18-
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
19-
from tests.test_base import SemanticRouterTestBase
18+
# Import test base from same directory
19+
from test_base import SemanticRouterTestBase
2020

2121
# Constants
2222
ENVOY_URL = "http://localhost:8801"
2323
OPENAI_ENDPOINT = "/v1/chat/completions"
24-
DEFAULT_MODEL = "qwen2.5:32b" # Changed from gemma3:27b to match make test-prompt
24+
DEFAULT_MODEL = "gemma3:27b" # Use configured model
2525

2626

2727
class EnvoyExtProcTest(SemanticRouterTestBase):

e2e-tests/02-router-classification-test.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,19 +10,19 @@
1010
import os
1111
import sys
1212
import time
13+
import unittest
1314
from collections import defaultdict
1415

1516
import requests
1617

17-
# Add parent directory to path to allow importing common test utilities
18-
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
19-
from tests.test_base import SemanticRouterTestBase
18+
# Import test base from same directory
19+
from test_base import SemanticRouterTestBase
2020

2121
# Constants
2222
ENVOY_URL = "http://localhost:8801"
2323
OPENAI_ENDPOINT = "/v1/chat/completions"
2424
ROUTER_METRICS_URL = "http://localhost:9190/metrics"
25-
DEFAULT_MODEL = "qwen2.5:32b" # Changed from gemma3:27b to match make test-prompt
25+
DEFAULT_MODEL = "gemma3:27b" # Use configured model
2626

2727
# Category test cases - each designed to trigger a specific classifier category
2828
CATEGORY_TEST_CASES = [

0 commit comments

Comments
 (0)