Commit d3bfc03 (parent: ff181e9)

project: add news section in readme (#460)

Signed-off-by: bitliu <[email protected]>

File tree

1 file changed: +29 additions, −34 deletions


README.md

Lines changed: 29 additions & 34 deletions
@@ -1,6 +1,6 @@
 <div align="center">
 
-<img src="website/static/img/repo.png" alt="vLLM Semantic Router" width="80%"/>
+<img src="website/static/img/repo.png" alt="vLLM Semantic Router" width="60%"/>
 
 [![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue)](https://vllm-semantic-router.com)
 [![Hugging Face](https://img.shields.io/badge/🤗%20Hugging%20Face-Community-yellow)](https://huggingface.co/LLM-Semantic-Router)
@@ -14,6 +14,27 @@
 
 </div>
 
+---
+
+*Latest News* 🔥
+
+- [2025/10/16] We launched the [vLLM Semantic Router YouTube Channel](https://www.youtube.com/@vLLMSemanticRouter) ✨.
+- [2025/10/15] We announced the [vLLM Semantic Router Dashboard](https://www.youtube.com/watch?v=E2IirN8PsFw) 🚀.
+- [2025/10/12] Our paper [When to Reason: Semantic Router for vLLM](https://arxiv.org/abs/2510.08731) was accepted at NeurIPS 2025 MLForSys 🧠.
+- [2025/10/08] We announced the integration with the [vLLM Production Stack](https://github.com/vllm-project/production-stack) team 👋.
+- [2025/10/01] We added support for deployment on [Kubernetes](https://vllm-semantic-router.com/docs/installation/kubernetes/) 🌊.
+- [2025/09/15] We reached 1000 stars on GitHub! 🔥
+- [2025/09/01] We officially released the project: [vLLM Semantic Router: Next Phase in LLM inference](https://blog.vllm.ai/2025/09/11/semantic-router.html) 🚀.
+
+<details>
+<summary>Previous News 🔥</summary>
+
+-
+
+</details>
+
+---
+
 ## Innovations ✨
 
 ![architecture](./website/static/img/architecture.png)
@@ -66,11 +87,15 @@ Cache the semantic representation of the prompt so as to reduce the number of pr
 
 Comprehensive observability with OpenTelemetry distributed tracing provides fine-grained visibility into the request processing pipeline.
 
-### Open WebUI Integration 💬
+### vLLM Semantic Router Dashboard 💬
 
-To view the ***Chain-Of-Thought*** of the vLLM-SR's decision-making process, we have integrated with Open WebUI.
+Watch the quick demo of the dashboard below:
 
-![code](./website/static/img/chat.png)
+<div align="center">
+<a href="https://www.youtube.com/watch?v=E2IirN8PsFw">
+<img src="https://img.youtube.com/vi/E2IirN8PsFw/maxresdefault.jpg" alt="vLLM Semantic Router Dashboard" width="90%">
+</a>
+</div>
 
 ## Quick Start 🚀
 
@@ -91,36 +116,6 @@ This command will:
 
 For detailed installation and configuration instructions, see the [Complete Documentation](https://vllm-semantic-router.com/docs/installation/).
 
-### What This Starts By Default
-
-`make docker-compose-up` now launches the full stack including a lightweight local OpenAI-compatible model server powered by **llm-katan** (serving the small model `Qwen/Qwen3-0.6B` under the alias `qwen3`). The semantic router is configured to route classification & default generations to this local endpoint out-of-the-box. This gives you an entirely self-contained experience (no external API keys required) while still letting you add remote / larger models later.
-
-### Core Mode (Without Local Model)
-
-If you only want the core semantic-router + Envoy + observability stack (and will point to external OpenAI-compatible endpoints yourself):
-
-```bash
-make docker-compose-up-core
-```
-
-### Prerequisite Model Download (Speeds Up First Run)
-
-The existing model bootstrap targets now also pre-download the small llm-katan model so the first `docker-compose-up` avoids an on-demand Hugging Face fetch.
-
-Minimal set (fast):
-
-```bash
-make models-download-minimal
-```
-
-Full set:
-
-```bash
-make models-download
-```
-
-Both create a stamp file once `Qwen/Qwen3-0.6B` is present to keep subsequent runs idempotent.
-
 ## Documentation 📖
 
 For comprehensive documentation including detailed setup instructions, architecture guides, and API references, visit:
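The section removed by this hunk mentions a stamp file that keeps repeated runs of the model-download targets idempotent. As a hedged sketch of that pattern (the stamp path, function name, and placeholder fetch below are illustrative, not the repository's actual Makefile logic):

```shell
# Idempotent download guard via a stamp file (illustrative paths and commands).
STAMP="models/.qwen3-0.6b.downloaded"

download_model() {
  if [ -f "${STAMP}" ]; then
    # Stamp present: a previous run already fetched the model, so do nothing.
    echo "model already present, skipping download"
    return 0
  fi
  # Placeholder for the real fetch, e.g. downloading Qwen/Qwen3-0.6B from Hugging Face.
  mkdir -p "$(dirname "${STAMP}")"
  touch "${STAMP}"
  echo "model downloaded"
}

download_model   # first run: performs the download and writes the stamp
download_model   # second run: no-op thanks to the stamp file
```

Running the function twice performs the fetch only once, which is the idempotency the removed paragraph describes.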
