Commit a9c0090

project: add citation and slack channel
Signed-off-by: bitliu <[email protected]>
1 parent cbd5eb5 commit a9c0090

4 files changed: +29, -1806 lines

README.md

Lines changed: 28 additions & 8 deletions
@@ -13,11 +13,13 @@
 
 </div>
 
-## Overview
+## Innovations ✨
 
 ![](./website/static/img/architecture.png)
 
-### Auto-Reasoning and Auto-Selection of Models
+### Intelligent Routing 🧠
+
+#### Auto-Reasoning and Auto-Selection of Models
 
 A **Mixture-of-Models** (MoM) router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on **Semantic Understanding** of the request's intent (Complexity, Task, Tools).
 
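To make the MoM routing above concrete, here is a minimal Go sketch. It is not the project's actual API: the category labels, model names, and the keyword heuristic standing in for the real semantic classifier are all illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// Category is a hypothetical label produced by the semantic classifier.
type Category string

const (
	CodeGen   Category = "code-generation"
	MathHeavy Category = "math"
	Chitchat  Category = "chitchat"
)

// pool maps each category to a preferred model, mirroring the idea of a
// defined model pool. The model names are placeholders.
var pool = map[Category]string{
	CodeGen:   "qwen2.5-coder",
	MathHeavy: "deepseek-r1",
	Chitchat:  "llama-3.1-8b-instruct",
}

// classify stands in for the real semantic classifier: a trivial keyword
// heuristic replaces the actual model-based intent detection.
func classify(prompt string) Category {
	p := strings.ToLower(prompt)
	switch {
	case strings.Contains(p, "func ") || strings.Contains(p, "refactor"):
		return CodeGen
	case strings.Contains(p, "prove") || strings.Contains(p, "integral"):
		return MathHeavy
	default:
		return Chitchat
	}
}

// Route returns the model an OpenAI-style request should be rewritten to use.
func Route(prompt string) string {
	return pool[classify(prompt)]
}

func main() {
	fmt.Println(Route("Write a func that reverses a slice")) // qwen2.5-coder
}
```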

@@ -33,31 +35,49 @@ The screenshot below shows the LLM Router dashboard in Grafana.
 
 The router is implemented in two ways: Golang (with Rust FFI based on Candle) and Python. Benchmarking will be conducted to determine the best implementation.
 
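Because the Golang implementation reaches Rust (Candle) through FFI, the binding plausibly looks like a cgo wrapper such as the sketch below. The library name, exported symbol, and signature are assumptions for illustration, not the project's actual binding.

```go
package classifier

/*
#cgo LDFLAGS: -L${SRCDIR}/../target/release -lcandle_binding
#include <stdlib.h>

// Hypothetical symbol exported by the Rust/Candle side; the real
// binding's name and signature may differ.
extern int classify_text(const char* text);
*/
import "C"

import "unsafe"

// Classify passes the prompt across the FFI boundary and returns the
// predicted category index from the Candle-based model.
func Classify(text string) int {
	ctext := C.CString(text)
	defer C.free(unsafe.Pointer(ctext))
	return int(C.classify_text(ctext))
}
```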

-### Auto-Selection of Tools
+#### Auto-Selection of Tools
 
 Select the tools to use based on the prompt, avoiding tools that are not relevant to it; this reduces the number of prompt tokens and improves the LLM's tool-selection accuracy.
 
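One way to picture the tool auto-selection described above: embed the prompt, compare it against precomputed embeddings of each tool's description, and keep only the close matches. A minimal sketch follows; the types, field names, and threshold are hypothetical.

```go
package tools

import "math"

// Tool pairs an OpenAI-style tool with a precomputed embedding of its
// description. The field names are illustrative.
type Tool struct {
	Name    string
	DescVec []float32
}

// cosine computes cosine similarity; it assumes equal-length vectors.
func cosine(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return float32(dot / (math.Sqrt(na)*math.Sqrt(nb) + 1e-9))
}

// SelectTools keeps only the tools whose description embedding is close
// to the prompt embedding, shrinking the tool list (and the prompt
// tokens) before the request is sent upstream.
func SelectTools(promptVec []float32, candidates []Tool, threshold float32) []Tool {
	var kept []Tool
	for _, t := range candidates {
		if cosine(promptVec, t.DescVec) >= threshold {
			kept = append(kept, t)
		}
	}
	return kept
}
```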

-### PII detection
+### Enterprise Security 🔒
+
+#### PII detection
 
 Detect PII in the prompt, avoiding sending PII to the LLM so as to protect the user's privacy.
 
-### Prompt guard
+#### Prompt guard
 
 Detect whether the prompt is a jailbreak attempt, avoiding sending jailbreak prompts to the LLM so as to prevent it from misbehaving.
 
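A minimal sketch of how the two checks above could gate a request before it is forwarded upstream. The regex and keyword detectors are toy stand-ins for the project's actual PII and jailbreak classifiers.

```go
package guard

import (
	"errors"
	"regexp"
	"strings"
)

// emailRe is a toy PII detector that only spots email-like strings; the
// real classifier covers far more categories.
var emailRe = regexp.MustCompile(`[\w.+-]+@[\w-]+\.[\w.]+`)

func containsPII(prompt string) bool {
	return emailRe.MatchString(prompt)
}

// isJailbreak is a toy stand-in for the real jailbreak classifier.
func isJailbreak(prompt string) bool {
	return strings.Contains(strings.ToLower(prompt), "ignore all previous instructions")
}

// Check refuses to forward a flagged prompt to the upstream LLM.
func Check(prompt string) error {
	if containsPII(prompt) {
		return errors.New("prompt rejected: PII detected")
	}
	if isJailbreak(prompt) {
		return errors.New("prompt rejected: jailbreak attempt detected")
	}
	return nil
}
```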

-### Semantic Caching
+### Similarity Caching ⚡️
 
 Cache the semantic representation of the prompt so as to reduce the number of prompt tokens and improve the overall inference latency.
 
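The similarity cache can be pictured as a lookup keyed by embedding closeness rather than exact text: if a new prompt is close enough to one seen before, the stored response is returned and the LLM call is skipped. A minimal sketch, assuming a cosine threshold and a linear scan (a production router would use an ANN index and eviction):

```go
package cache

import "math"

// entry pairs a prompt embedding with the completion it produced.
type entry struct {
	vec      []float32
	response string
}

// SemanticCache returns a stored response when a new prompt embedding is
// close enough to a previously seen one.
type SemanticCache struct {
	entries   []entry
	threshold float32
}

// cosine computes cosine similarity; it assumes equal-length vectors.
func cosine(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return float32(dot / (math.Sqrt(na)*math.Sqrt(nb) + 1e-9))
}

// Get scans for a sufficiently similar prior prompt.
func (c *SemanticCache) Get(vec []float32) (string, bool) {
	for _, e := range c.entries {
		if cosine(vec, e.vec) >= c.threshold {
			return e.response, true
		}
	}
	return "", false
}

// Put records a prompt embedding and its response for future hits.
func (c *SemanticCache) Put(vec []float32, response string) {
	c.entries = append(c.entries, entry{vec: vec, response: response})
}
```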

-## 📖 Documentation
+## Documentation 📖
 
 For comprehensive documentation including detailed setup instructions, architecture guides, and API references, visit:
 
 **👉 [Complete Documentation at Read the Docs](https://vllm-semantic-router.com/)**
 
 The documentation includes:
 - **[Installation Guide](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Complete setup instructions
-- **[Quick Start](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Get running in 5 minutes
 - **[System Architecture](https://vllm-semantic-router.com/docs/architecture/system-architecture/)** - Technical deep dive
 - **[Model Training](https://vllm-semantic-router.com/docs/training/training-overview/)** - How classification models work
 - **[API Reference](https://vllm-semantic-router.com/docs/api/router/)** - Complete API documentation
+
+## Community 👋
+
+For questions, feedback, or to contribute, please join the `#semantic-router` channel in the vLLM Slack.
+
+## Citation
+
+If you find Semantic Router helpful in your research or projects, please consider citing it:
+
+```
+@misc{semanticrouter2025,
+  title={vLLM Semantic Router},
+  author={vLLM Semantic Router Team},
+  year={2025},
+  howpublished={\url{https://github.com/vllm-project/semantic-router}},
+}
+```

docker/README.md

Lines changed: 1 addition & 2 deletions
@@ -5,7 +5,7 @@ This Docker Compose configuration allows you to quickly run Semantic Router + En
 ## Prerequisites
 
 - Docker and Docker Compose
-- Ensure ports 8801, 50051, 19000, and 60000 are not in use
+- Ensure ports 8801, 50051, and 19000 are not in use
 
 ## Install in Docker Compose
 
@@ -40,7 +40,6 @@ This Docker Compose configuration allows you to quickly run Semantic Router + En
 - Semantic Router: http://localhost:50051 (gRPC service)
 - Envoy Proxy: http://localhost:8801 (main endpoint)
 - Envoy Admin: http://localhost:19000 (admin interface)
-- Mock vLLM (testing): http://localhost:60000 (if using testing profile)
 
 ## Quick Start
 
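Once the stack is up, clients talk to Envoy on port 8801. Below is a hedged Go example of sending an OpenAI-style chat completion through it; the /v1/chat/completions path and the "auto" model name are assumptions for illustration, so check the documentation for the exact contract.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Envoy is the main entry point on port 8801 (see the endpoints above).
	body := []byte(`{
		"model": "auto",
		"messages": [{"role": "user", "content": "What is semantic routing?"}]
	}`)

	resp, err := http.Post(
		"http://localhost:8801/v1/chat/completions", // assumed OpenAI-compatible path
		"application/json",
		bytes.NewReader(body),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```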
