Skip to content

Commit bee0c19

Browse files
committed
project: add citation and slack channel
Signed-off-by: bitliu <[email protected]>
1 parent cbd5eb5 commit bee0c19

File tree

6 files changed

+33
-1810
lines changed

6 files changed

+33
-1810
lines changed

README.md

Lines changed: 28 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -13,11 +13,13 @@
1313

1414
</div>
1515

16-
## Overview
16+
## Innovations ✨
1717

1818
![](./website/static/img/architecture.png)
1919

20-
### Auto-Reasoning and Auto-Selection of Models
20+
### Intelligent Routing 🧠
21+
22+
#### Auto-Reasoning and Auto-Selection of Models
2123

2224
An **Mixture-of-Models** (MoM) router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on **Semantic Understanding** of the request's intent (Complexity, Task, Tools).
2325

@@ -33,31 +35,49 @@ The screenshot below shows the LLM Router dashboard in Grafana.
3335

3436
The router is implemented in two ways: Golang (with Rust FFI based on Candle) and Python. Benchmarking will be conducted to determine the best implementation.
3537

36-
### Auto-Selection of Tools
38+
#### Auto-Selection of Tools
3739

3840
Select the tools to use based on the prompt, avoiding the use of tools that are not relevant to the prompt so as to reduce the number of prompt tokens and improve tool selection accuracy by the LLM.
3941

40-
### PII detection
42+
### Enterprise Security 🔒
43+
44+
#### PII detection
4145

4246
Detect PII in the prompt, avoiding sending PII to the LLM so as to protect the privacy of the user.
4347

44-
### Prompt guard
48+
#### Prompt guard
4549

4650
Detect if the prompt is a jailbreak prompt, avoiding sending jailbreak prompts to the LLM so as to prevent the LLM from misbehaving.
4751

48-
### Semantic Caching
52+
### Similarity Caching ⚡️
4953

5054
Cache the semantic representation of the prompt so as to reduce the number of prompt tokens and improve the overall inference latency.
5155

52-
## 📖 Documentation
56+
## Documentation 📖
5357

5458
For comprehensive documentation including detailed setup instructions, architecture guides, and API references, visit:
5559

5660
**👉 [Complete Documentation at Read the Docs](https://vllm-semantic-router.com/)**
5761

5862
The documentation includes:
5963
- **[Installation Guide](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Complete setup instructions
60-
- **[Quick Start](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Get running in 5 minutes
6164
- **[System Architecture](https://vllm-semantic-router.com/docs/architecture/system-architecture/)** - Technical deep dive
6265
- **[Model Training](https://vllm-semantic-router.com/docs/training/training-overview/)** - How classification models work
6366
- **[API Reference](https://vllm-semantic-router.com/docs/api/router/)** - Complete API documentation
67+
68+
## Community 👋
69+
70+
For questions, feedback, or to contribute, please join `#semantic-router` channel in vLLM Slack.
71+
72+
## Citation
73+
74+
If you find Semantic Router helpful in your research or projects, please consider citing it:
75+
76+
```
77+
@misc{semanticrouter2025,
78+
title={vLLM Semantic Router},
79+
author={vLLM Semantic Router Team},
80+
year={2025},
81+
howpublished={\url{https://github.com/vllm-project/semantic-router}},
82+
}
83+
```

docker/README.md

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ This Docker Compose configuration allows you to quickly run Semantic Router + En
55
## Prerequisites
66

77
- Docker and Docker Compose
8-
- Ensure ports 8801, 50051, 19000, and 60000 are not in use
8+
- Ensure ports 8801, 50051, 19000 are not in use
99

1010
## Install in Docker Compose
1111

@@ -40,7 +40,6 @@ This Docker Compose configuration allows you to quickly run Semantic Router + En
4040
- Semantic Router: http://localhost:50051 (gRPC service)
4141
- Envoy Proxy: http://localhost:8801 (main endpoint)
4242
- Envoy Admin: http://localhost:19000 (admin interface)
43-
- Mock vLLM (testing): http://localhost:60000 (if using testing profile)
4443

4544
## Quick Start
4645

website/docs/getting-started/docker-compose.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -13,15 +13,15 @@ This guide shows you how to quickly set up and run Semantic Router with Envoy us
1313
### 1. Clone the Repository
1414

1515
```bash
16-
git clone https://github.com/your-org/semantic-router.git
16+
git clone https://github.com/vllm-project/semantic-router.git
1717
cd semantic-router
1818
```
1919

2020
### 2. Download Models (Optional but Recommended)
2121

2222
```bash
23-
# Install HuggingFace CLI if not already installed
24-
pip install huggingface_hub
23+
# Install Packages
24+
pip install -r requirements.txt
2525

2626
# Download pre-trained models
2727
make download-models

website/docs/getting-started/installation.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ This guide will help you set up and install the Semantic Router on your system.
1717
### 1. Clone the Repository
1818

1919
```bash
20-
git clone https://github.com/your-org/semantic-router.git
20+
git clone https://github.com/vllm-project/semantic-router.git
2121
cd semantic-router
2222
```
2323

0 commit comments

Comments
 (0)