Commit 2f757dd

update: resolve feedbacks
Signed-off-by: bitliu <[email protected]>
1 parent 6b879b5 commit 2f757dd

1 file changed: +1 -7 lines changed

_posts/2025-09-01-semantic-router.md

Lines changed: 1 addition & 7 deletions
@@ -64,15 +64,13 @@ Experimental results show:
 
 In knowledge-intensive areas such as business and economics, accuracy improvements can exceed **20%**.
 
----
-
 ## Project Background
 
 The Semantic Router is not the isolated result of a single paper but a collaborative outcome of sustained community contributions:
 
 * Originally proposed by **[Dr. Chen Huamin](https://www.linkedin.com/in/huaminchen)**, Distinguished Engineer at **Red Hat**, in early **2025** across multiple open-source communities.
 * Iterated and further developed by **[Xunzhuo Liu](https://www.linkedin.com/in/bitliu)** at **Tencent**, later contributed to the vLLM community.
-* **[Dr. Wang Chen](https://www.linkedin.com/in/chenw615)** from **IBM Research** and **Dr. Chen Huamin** will present the project at **KubeCon North America 2025**.
+* **[Dr. Wang Chen](https://www.linkedin.com/in/chenw615)** from **IBM Research** and **Dr. Chen Huamin** will present the project at **[KubeCon North America 2025](https://kccncna2025.sched.com/event/27FaI/intelligent-llm-routing-a-new-paradigm-for-multi-model-ai-orchestration-in-kubernetes-chen-wang-ibm-research-huamin-chen-red-hat?iframe=no&w=100%&sidebar=yes&bg=no)**.
 
 The mission is clear: to serve as an **inference accelerator** for open-source large models:
 
@@ -84,8 +82,6 @@ The vLLM Semantic Router is therefore not just a research milestone but an **ess
 
 You can start exploring the project at [GitHub](https://github.com/vllm-project/semantic-router). We're currently working on the [v0.1 Roadmap](https://github.com/vllm-project/semantic-router/issues/14) and have established a [Work Group](https://github.com/vllm-project/semantic-router/issues/15). We welcome your thoughts and invite you to join us!
 
----
-
 ## Future Trends: Cost-Effective, Just-in-Time Inference
 
 The central industry question has shifted from *"Can we perform inference?"* to *"When and how should inference be performed?"*
@@ -101,8 +97,6 @@ The new competitive focus will be less about model scale and more about:
 
 The next frontier is **intelligent, self-adjusting inference mechanisms** — systems that autonomously determine when to "think deeply" and when to respond directly, without explicit user toggles or hardcoded rules.
 
----
-
 ## One-Sentence Summary
 
 * **GPT-5**: Business-driven routing → broad intelligence.
