Commit 6d3a624

docs: add local install guide
Signed-off-by: bitliu <[email protected]>
1 parent 47b83d2 commit 6d3a624

11 files changed, +488 -1216 lines changed

CONTRIBUTING.md

Lines changed: 3 additions & 4 deletions
@@ -21,10 +21,9 @@ Before you begin, ensure you have the following installed:

 - **Rust** (latest stable version)
 - **Go** 1.24.1 or later
-- **Python** 3.8+ (for training and testing)
-- **Envoy Proxy**
 - **Hugging Face CLI** (`pip install huggingface_hub`)
 - **Make** (for build automation)
+- **Python** 3.8+ (Optiona: for training and testing)

 ### Initial Setup

@@ -40,7 +39,7 @@ Before you begin, ensure you have the following installed:
 ```
 This downloads the pre-trained classification models from Hugging Face.

-3. **Install Python dependencies:**
+3. **Install Python dependencies(Optional):**
 ```bash
 # For training and development
 pip install -r requirements.txt
@@ -245,7 +244,7 @@ The test suite includes:

 ## Getting Help

-- Check the [documentation](https://llm-semantic-router.readthedocs.io/en/latest/)
+- Check the [documentation](https://vllm-semantic-router.com/)
 - Review existing issues and pull requests
 - Ask questions in discussions or create a new issue

Makefile

Lines changed: 32 additions & 9 deletions
@@ -20,9 +20,11 @@ build-router: rust
 @mkdir -p bin
 @cd src/semantic-router && go build -o ../../bin/router cmd/main.go

-# Run the router
+
+
+# Run the router with Classification API
 run-router: build-router
-@echo "Running router..."
+@echo "Running router with Classification API..."
 @export LD_LIBRARY_PATH=${PWD}/candle-binding/target/release && \
 ./bin/router -config=config/config.yaml

@@ -79,20 +81,41 @@ clean:
7981
cd candle-binding && cargo clean
8082
rm -f bin/router
8183

84+
# e9197711aa400477d30fe1ff07679e
85+
# sk-Hm8vN3qR7wX2pK9sL4tY6Z
86+
8287
# Test the Envoy extproc
83-
test-prompt:
88+
test-auto-prompt-reasoning:
8489
@echo "Testing Envoy extproc with curl (Math)..."
8590
curl -X POST http://localhost:8801/v1/chat/completions \
91+
-H "Authorization: Bearer e9197711aa400477d30fe1ff07679e" \
8692
-H "Content-Type: application/json" \
87-
-d '{"model": "auto", "messages": [{"role": "assistant", "content": "You are a professional math teacher. Explain math concepts clearly and show step-by-step solutions to problems."}, {"role": "user", "content": "What is the derivative of f(x) = x^3 + 2x^2 - 5x + 7?"}], "temperature": 0.7}'
88-
@echo "Testing Envoy extproc with curl (Creative Writing)..."
93+
-d '{"model": "auto", "messages": [{"role": "system", "content": "You are a professional math teacher. Explain math concepts clearly and show step-by-step solutions to problems."}, {"role": "user", "content": "What is the derivative of f(x) = x^3 + 2x^2 - 5x + 7?"}]}'
94+
95+
# Test the Envoy extproc
96+
test-auto-prompt-no-reasoning:
97+
@echo "Testing Envoy extproc with curl (Math)..."
8998
curl -X POST http://localhost:8801/v1/chat/completions \
99+
-H "Authorization: Bearer e9197711aa400477d30fe1ff07679e" \
90100
-H "Content-Type: application/json" \
91-
-d '{"model": "auto", "messages": [{"role": "assistant", "content": "You are a story writer. Create interesting stories with good characters and settings."}, {"role": "user", "content": "Write a short story about a space cat."}], "temperature": 0.7}'
92-
@echo "Testing Envoy extproc with curl (Default/General)..."
93-
curl -X POST http://localhost:8801/v1/chat/completions \
101+
-d '{"model": "auto", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who are you?"}]}'
102+
103+
# Test the Envoy extproc
104+
test-prompt-deepseekv3-thinking:
105+
@echo "Testing Envoy extproc with curl (Math)..."
106+
curl -i -v -X POST http://localhost:8801/v1/chat/completions \
107+
-H "Authorization: Bearer e9197711aa400477d30fe1ff07679e" \
94108
-H "Content-Type: application/json" \
95-
-d '{"model": "auto", "messages": [{"role": "assistant", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is the capital of France?"}], "temperature": 0.7}'
109+
-d '{"model": "deepseek-v31", "chat_template_kwargs": { "thinking": true }, "messages": [{"role": "system", "content": "You are a professional math teacher. Explain math concepts clearly and show step-by-step solutions to problems."}, {"role": "user", "content": "What is the derivative of f(x) = x^3 + 2x^2 - 5x + 7?"}]}'
110+
111+
# Test the Envoy extproc
112+
test-prompt-deepseekv3-no-thinking:
113+
@echo "Testing Envoy extproc with curl (Math)..."
114+
curl -i -v -X POST http://localhost:8801/v1/chat/completions \
115+
-H "Authorization: Bearer e9197711aa400477d30fe1ff07679e" \
116+
-H "Content-Type: application/json" \
117+
-d '{"model": "deepseek-v31", "chat_template_kwargs": { "thinking": false }, "messages": [{"role": "system", "content": "You are a professional math teacher. Explain math concepts clearly and show step-by-step solutions to problems."}, {"role": "user", "content": "What is the derivative of f(x) = x^3 + 2x^2 - 5x + 7?"}]}'
118+
96119
# Test prompts that contain PII
97120
test-pii:
98121
@echo "Testing Envoy extproc with curl (Credit card number)..."
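
Review note: the new test targets above all send the same request shape, with a `system` message followed by a `user` message and, for the `deepseek-v31` targets, a `chat_template_kwargs` object with a `thinking` flag. A minimal Python sketch of that payload is given here for readers who want to script the same checks. The helper names `build_chat_request` and `build_headers` and the `ROUTER_API_KEY` environment variable are hypothetical and do not appear in the commit; only the field names and model names come from the Makefile.

```python
import json
import os
from typing import Optional


def build_chat_request(model: str, system_prompt: str, user_prompt: str,
                       thinking: Optional[bool] = None) -> dict:
    """Build a chat-completions payload shaped like the Makefile test targets.

    `thinking` mirrors the `chat_template_kwargs.thinking` flag used by the
    deepseek targets; leave it as None to omit the field, as the `auto`
    targets do.
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }
    if thinking is not None:
        payload["chat_template_kwargs"] = {"thinking": thinking}
    return payload


def build_headers(api_key: str) -> dict:
    """Return the two headers the curl commands send."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # ROUTER_API_KEY is an assumed variable name, not something the repo defines.
    headers = build_headers(os.environ.get("ROUTER_API_KEY", ""))
    body = build_chat_request("auto", "You are a helpful assistant.", "Who are you?")
    print(json.dumps(body, indent=2))
```

The printed JSON can be passed to `curl -d` or to any HTTP client pointed at the `/v1/chat/completions` endpoint used in the targets.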

README.md

Lines changed: 8 additions & 8 deletions
@@ -2,12 +2,12 @@

 <img src="website/static/img/repo.png" alt="vLLM Semantic Router"/>

-[![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue)](https://llm-semantic-router.readthedocs.io/en/latest/)
+[![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue)](https://vllm-semantic-router.com)
 [![Hugging Face](https://img.shields.io/badge/🤗%20Hugging%20Face-Community-yellow)](https://huggingface.co/LLM-Semantic-Router)
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
 [![Crates.io](https://img.shields.io/crates/v/candle-semantic-router.svg)](https://crates.io/crates/candle-semantic-router)

-**📚 [Complete Documentation](https://llm-semantic-router.readthedocs.io/en/latest/) | 🚀 [Quick Start](https://llm-semantic-router.readthedocs.io/en/latest/getting-started/quick-start/) | 🏗️ [Architecture](https://llm-semantic-router.readthedocs.io/en/latest/architecture/system-architecture/) | 📖 [API Reference](https://llm-semantic-router.readthedocs.io/en/latest/api/router/)**
+**📚 [Complete Documentation](https://vllm-semantic-router.com) | 🚀 [Quick Start](https://vllm-semantic-router.com/docs/getting-started/installation) | 🏗️ [Architecture](https://vllm-semantic-router.com/docs/architecture/system-architecture/) | 📖 [API Reference](https://vllm-semantic-router.com/docs/api/router/)**

 ![](./website/static/img/code.png)

@@ -53,11 +53,11 @@ Cache the semantic representation of the prompt so as to reduce the number of pr

 For comprehensive documentation including detailed setup instructions, architecture guides, and API references, visit:

-**👉 [Complete Documentation at Read the Docs](https://llm-semantic-router.readthedocs.io/en/latest/)**
+**👉 [Complete Documentation at Read the Docs](https://vllm-semantic-router.com/)**

 The documentation includes:
-- **[Installation Guide](https://llm-semantic-router.readthedocs.io/en/latest/getting-started/installation/)** - Complete setup instructions
-- **[Quick Start](https://llm-semantic-router.readthedocs.io/en/latest/getting-started/quick-start/)** - Get running in 5 minutes
-- **[System Architecture](https://llm-semantic-router.readthedocs.io/en/latest/architecture/system-architecture/)** - Technical deep dive
-- **[Model Training](https://llm-semantic-router.readthedocs.io/en/latest/training/training-overview/)** - How classification models work
-- **[API Reference](https://llm-semantic-router.readthedocs.io/en/latest/api/router/)** - Complete API documentation
+- **[Installation Guide](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Complete setup instructions
+- **[Quick Start](https://vllm-semantic-router.com/docs/getting-started/quick-start/)** - Get running in 5 minutes
+- **[System Architecture](https://vllm-semantic-router.com/docs/architecture/system-architecture/)** - Technical deep dive
+- **[Model Training](https://vllm-semantic-router.com/docs/training/training-overview/)** - How classification models work
+- **[API Reference](https://vllm-semantic-router.com/docs/api/router/)** - Complete API documentation

website/.docusaurus/client-manifest.json

Lines changed: 21 additions & 21 deletions
@@ -375,9 +375,9 @@
 "1164": {
 "js": [
 {
-"file": "assets/js/2f3f46e5.f86819b2.js",
-"hash": "5a77916e940717e3",
-"publicPath": "/assets/js/2f3f46e5.f86819b2.js"
+"file": "assets/js/2f3f46e5.6d1dae68.js",
+"hash": "919c825a6bd8feb4",
+"publicPath": "/assets/js/2f3f46e5.6d1dae68.js"
 }
 ]
 },
@@ -492,9 +492,9 @@
 "2634": {
 "js": [
 {
-"file": "assets/js/c4f5d8e4.b7348ab3.js",
-"hash": "cb6219784beae8ad",
-"publicPath": "/assets/js/c4f5d8e4.b7348ab3.js"
+"file": "assets/js/c4f5d8e4.f8ace742.js",
+"hash": "bc380a5416bb30a6",
+"publicPath": "/assets/js/c4f5d8e4.f8ace742.js"
 }
 ]
 },
@@ -528,9 +528,9 @@
 "3253": {
 "js": [
 {
-"file": "assets/js/ca183480.bfb760f1.js",
-"hash": "dfb7903bee6e8fb1",
-"publicPath": "/assets/js/ca183480.bfb760f1.js"
+"file": "assets/js/ca183480.627344bc.js",
+"hash": "15d8b888bf16b004",
+"publicPath": "/assets/js/ca183480.627344bc.js"
 }
 ]
 },
@@ -591,9 +591,9 @@
 "4324": {
 "js": [
 {
-"file": "assets/js/588bd741.01c8b76d.js",
-"hash": "407bf0a9be50fd70",
-"publicPath": "/assets/js/588bd741.01c8b76d.js"
+"file": "assets/js/588bd741.fecec436.js",
+"hash": "556b5e775305a95c",
+"publicPath": "/assets/js/588bd741.fecec436.js"
 }
 ]
 },
@@ -645,9 +645,9 @@
 "5024": {
 "js": [
 {
-"file": "assets/js/9aaa5480.8b1e2834.js",
-"hash": "684034423c6a370c",
-"publicPath": "/assets/js/9aaa5480.8b1e2834.js"
+"file": "assets/js/9aaa5480.38156aea.js",
+"hash": "192d409d40414aed",
+"publicPath": "/assets/js/9aaa5480.38156aea.js"
 }
 ]
 },
@@ -663,9 +663,9 @@
 "5354": {
 "js": [
 {
-"file": "assets/js/runtime~main.9ae70a89.js",
-"hash": "fd9cd34fb19d0d1d",
-"publicPath": "/assets/js/runtime~main.9ae70a89.js"
+"file": "assets/js/runtime~main.46ee5a72.js",
+"hash": "1c4d34f445433a42",
+"publicPath": "/assets/js/runtime~main.46ee5a72.js"
 }
 ]
 },
@@ -888,9 +888,9 @@
 "8792": {
 "js": [
 {
-"file": "assets/js/main.8144efe4.js",
-"hash": "654f0b440b68cbfa",
-"publicPath": "/assets/js/main.8144efe4.js"
+"file": "assets/js/main.74135e71.js",
+"hash": "80981465b58ce032",
+"publicPath": "/assets/js/main.74135e71.js"
 }
 ]
 },

website/.docusaurus/docusaurus.config.mjs

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 */
 export default {
 "title": "vLLM Semantic Router",
-"tagline": "Intelligent Mixture-of-Models Router for Efficient LLM Inference",
+"tagline": "Intelligent Auto Reasoning Router for Efficient LLM Inference on Mixture-of-Models",
 "favicon": "img/vllm.png",
 "url": "https://your-docusaurus-test-site.com",
 "baseUrl": "/",

website/docs/api/router.md

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@ The router provides health check endpoints for monitoring:

 ### Router Health

-**Endpoint:** `GET http://localhost:50051/health`
+**Endpoint:** `GET http://localhost:8080/health`

 ```json
 {
