Commit a9c0090

project: add citation and slack channel
Signed-off-by: bitliu <[email protected]>
1 parent cbd5eb5 commit a9c0090

4 files changed: +29, -1806 lines

README.md

Lines changed: 28 additions & 8 deletions
@@ -13,11 +13,13 @@
 
 </div>
 
-## Overview
+## Innovations ✨
 
 ![](./website/static/img/architecture.png)
 
-### Auto-Reasoning and Auto-Selection of Models
+### Intelligent Routing 🧠
+
+#### Auto-Reasoning and Auto-Selection of Models
 
 A **Mixture-of-Models** (MoM) router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on **Semantic Understanding** of the request's intent (Complexity, Task, Tools).
 
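To make the MoM routing above concrete, here is a minimal Go sketch. It is not the project's actual API: the category labels, model names, and the keyword heuristic standing in for the real semantic classifier are all illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// Category is a hypothetical label produced by the semantic classifier.
type Category string

const (
	CodeGen   Category = "code-generation"
	MathHeavy Category = "math"
	Chitchat  Category = "chitchat"
)

// pool maps each category to a preferred model, mirroring the idea of a
// defined model pool. The model names are placeholders.
var pool = map[Category]string{
	CodeGen:   "qwen2.5-coder",
	MathHeavy: "deepseek-r1",
	Chitchat:  "llama-3.1-8b-instruct",
}

// classify stands in for the real semantic classifier: a trivial keyword
// heuristic replaces the actual model-based intent detection.
func classify(prompt string) Category {
	p := strings.ToLower(prompt)
	switch {
	case strings.Contains(p, "func ") || strings.Contains(p, "refactor"):
		return CodeGen
	case strings.Contains(p, "prove") || strings.Contains(p, "integral"):
		return MathHeavy
	default:
		return Chitchat
	}
}

// Route returns the model an OpenAI-style request should be rewritten to use.
func Route(prompt string) string {
	return pool[classify(prompt)]
}

func main() {
	fmt.Println(Route("Write a func that reverses a slice")) // qwen2.5-coder
}
```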

@@ -33,31 +35,49 @@ The screenshot below shows the LLM Router dashboard in Grafana.
 
 The router is implemented in two ways: Golang (with Rust FFI based on Candle) and Python. Benchmarking will be conducted to determine the best implementation.
 
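Because the Golang implementation reaches Rust (Candle) through FFI, the binding plausibly looks like a cgo wrapper such as the sketch below. The library name, exported symbol, and signature are assumptions for illustration, not the project's actual binding.

```go
package classifier

/*
#cgo LDFLAGS: -L${SRCDIR}/../target/release -lcandle_binding
#include <stdlib.h>

// Hypothetical symbol exported by the Rust/Candle side; the real
// binding's name and signature may differ.
extern int classify_text(const char* text);
*/
import "C"

import "unsafe"

// Classify passes the prompt across the FFI boundary and returns the
// predicted category index from the Candle-based model.
func Classify(text string) int {
	ctext := C.CString(text)
	defer C.free(unsafe.Pointer(ctext))
	return int(C.classify_text(ctext))
}
```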

-### Auto-Selection of Tools
+#### Auto-Selection of Tools
 
 Select the tools to use based on the prompt, avoiding tools that are not relevant to it; this reduces the number of prompt tokens and improves the LLM's tool-selection accuracy.
 
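One way to picture the tool auto-selection described above: embed the prompt, compare it against precomputed embeddings of each tool's description, and keep only the close matches. A minimal sketch follows; the types, field names, and threshold are hypothetical.

```go
package tools

import "math"

// Tool pairs an OpenAI-style tool with a precomputed embedding of its
// description. The field names are illustrative.
type Tool struct {
	Name    string
	DescVec []float32
}

// cosine computes cosine similarity; it assumes equal-length vectors.
func cosine(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return float32(dot / (math.Sqrt(na)*math.Sqrt(nb) + 1e-9))
}

// SelectTools keeps only the tools whose description embedding is close
// to the prompt embedding, shrinking the tool list (and the prompt
// tokens) before the request is sent upstream.
func SelectTools(promptVec []float32, candidates []Tool, threshold float32) []Tool {
	var kept []Tool
	for _, t := range candidates {
		if cosine(promptVec, t.DescVec) >= threshold {
			kept = append(kept, t)
		}
	}
	return kept
}
```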

-### PII detection
+### Enterprise Security 🔒
+
+#### PII detection
 
 Detect PII in the prompt, avoiding sending PII to the LLM so as to protect the user's privacy.
 
-### Prompt guard
+#### Prompt guard
 
 Detect whether the prompt is a jailbreak attempt, avoiding sending jailbreak prompts to the LLM so as to prevent it from misbehaving.
 
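A minimal sketch of how the two checks above could gate a request before it is forwarded upstream. The regex and keyword detectors are toy stand-ins for the project's actual PII and jailbreak classifiers.

```go
package guard

import (
	"errors"
	"regexp"
	"strings"
)

// emailRe is a toy PII detector that only spots email-like strings; the
// real classifier covers far more categories.
var emailRe = regexp.MustCompile(`[\w.+-]+@[\w-]+\.[\w.]+`)

func containsPII(prompt string) bool {
	return emailRe.MatchString(prompt)
}

// isJailbreak is a toy stand-in for the real jailbreak classifier.
func isJailbreak(prompt string) bool {
	return strings.Contains(strings.ToLower(prompt), "ignore all previous instructions")
}

// Check refuses to forward a flagged prompt to the upstream LLM.
func Check(prompt string) error {
	if containsPII(prompt) {
		return errors.New("prompt rejected: PII detected")
	}
	if isJailbreak(prompt) {
		return errors.New("prompt rejected: jailbreak attempt detected")
	}
	return nil
}
```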

-### Semantic Caching
+### Similarity Caching ⚡️
 
 Cache the semantic representation of the prompt so as to reduce the number of prompt tokens and improve the overall inference latency.
 
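The similarity cache can be pictured as a lookup keyed by embedding closeness rather than exact text: if a new prompt is close enough to one seen before, the stored response is returned and the LLM call is skipped. A minimal sketch, assuming a cosine threshold and a linear scan (a production router would use an ANN index and eviction):

```go
package cache

import "math"

// entry pairs a prompt embedding with the completion it produced.
type entry struct {
	vec      []float32
	response string
}

// SemanticCache returns a stored response when a new prompt embedding is
// close enough to a previously seen one.
type SemanticCache struct {
	entries   []entry
	threshold float32
}

// cosine computes cosine similarity; it assumes equal-length vectors.
func cosine(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return float32(dot / (math.Sqrt(na)*math.Sqrt(nb) + 1e-9))
}

// Get scans for a sufficiently similar prior prompt.
func (c *SemanticCache) Get(vec []float32) (string, bool) {
	for _, e := range c.entries {
		if cosine(vec, e.vec) >= c.threshold {
			return e.response, true
		}
	}
	return "", false
}

// Put records a prompt embedding and its response for future hits.
func (c *SemanticCache) Put(vec []float32, response string) {
	c.entries = append(c.entries, entry{vec: vec, response: response})
}
```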

-## 📖 Documentation
+## Documentation 📖
 
 For comprehensive documentation including detailed setup instructions, architecture guides, and API references, visit:
 
 **👉 [Complete Documentation at Read the Docs](https://vllm-semantic-router.com/)**
 
 The documentation includes:
 - **[Installation Guide](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Complete setup instructions
-- **[Quick Start](https://vllm-semantic-router.com/docs/getting-started/installation/)** - Get running in 5 minutes
 - **[System Architecture](https://vllm-semantic-router.com/docs/architecture/system-architecture/)** - Technical deep dive
 - **[Model Training](https://vllm-semantic-router.com/docs/training/training-overview/)** - How classification models work
 - **[API Reference](https://vllm-semantic-router.com/docs/api/router/)** - Complete API documentation
+
+## Community 👋
+
+For questions, feedback, or to contribute, please join the `#semantic-router` channel in the vLLM Slack.
+
+## Citation
+
+If you find Semantic Router helpful in your research or projects, please consider citing it:
+
+```
+@misc{semanticrouter2025,
+  title={vLLM Semantic Router},
+  author={vLLM Semantic Router Team},
+  year={2025},
+  howpublished={\url{https://github.com/vllm-project/semantic-router}},
+}
+```

docker/README.md

Lines changed: 1 addition & 2 deletions
@@ -5,7 +5,7 @@ This Docker Compose configuration allows you to quickly run Semantic Router + En
 ## Prerequisites
 
 - Docker and Docker Compose
-- Ensure ports 8801, 50051, 19000, and 60000 are not in use
+- Ensure ports 8801, 50051, and 19000 are not in use
 
 ## Install in Docker Compose
 
@@ -40,7 +40,6 @@ This Docker Compose configuration allows you to quickly run Semantic Router + En
 - Semantic Router: http://localhost:50051 (gRPC service)
 - Envoy Proxy: http://localhost:8801 (main endpoint)
 - Envoy Admin: http://localhost:19000 (admin interface)
-- Mock vLLM (testing): http://localhost:60000 (if using testing profile)
 
 ## Quick Start
 
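Once the stack is up, clients talk to Envoy on port 8801. Below is a hedged Go example of sending an OpenAI-style chat completion through it; the /v1/chat/completions path and the "auto" model name are assumptions for illustration, so check the documentation for the exact contract.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Envoy is the main entry point on port 8801 (see the endpoints above).
	body := []byte(`{
		"model": "auto",
		"messages": [{"role": "user", "content": "What is semantic routing?"}]
	}`)

	resp, err := http.Post(
		"http://localhost:8801/v1/chat/completions", // assumed OpenAI-compatible path
		"application/json",
		bytes.NewReader(body),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```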
