Commit 5fdb082

docs: update docs style and layout
Signed-off-by: bitliu <[email protected]>
1 parent 3029480 commit 5fdb082

15 files changed (+152 -86 lines changed)

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
-# Contributing to LLM Semantic Router
+# Contributing to vLLM Semantic Router
 
-Thank you for your interest in contributing to the LLM Semantic Router project! This guide will help you get started with development and contributing to the project.
+Thank you for your interest in contributing to the vLLM Semantic Router project! This guide will help you get started with development and contributing to the project.
 
 ## Table of Contents
 

README.md

Lines changed: 4 additions & 32 deletions
@@ -1,6 +1,6 @@
 <div align="center">
 
-<img src="website/static/img/repo.png" alt="LLM Semantic Router"/>
+<img src="website/static/img/repo.png" alt="vLLM Semantic Router"/>
 
 [![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue)](https://llm-semantic-router.readthedocs.io/en/latest/)
 [![Hugging Face](https://img.shields.io/badge/🤗%20Hugging%20Face-Community-yellow)](https://huggingface.co/LLM-Semantic-Router)
@@ -15,37 +15,9 @@
 
 ## Overview
 
-```mermaid
-graph TB
-Client[Client Request] --> Router[vLLM Semantic Router]
-
-subgraph "Intent Understanding"
-direction LR
-PII[PII Detector]
-Jailbreak[Jailbreak Guard]
-Category[Category Classifier]
-Cache[Semantic Cache]
-end
-
-Router --> PII
-Router --> Jailbreak
-Router --> Category
-Router --> Cache
-
-PII --> Decision{Security Check}
-Jailbreak --> Decision
-Decision -->|Block| Block[Block Request]
-Decision -->|Pass| Category
-Category --> Models[Route to Specialized Model]
-Cache -->|Hit| FastResponse[Return Cached Response]
-
-Models --> Math[Math Model]
-Models --> Creative[Creative Model]
-Models --> Code[Code Model]
-Models --> General[General Model]
-```
-
-### Auto-Selection of Models
+![](./website/static/img/architecture.png)
+
+### Auto-Reasoning and Auto-Selection of Models
 
 An **Mixture-of-Models** (MoM) router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on **Semantic Understanding** of the request's intent (Complexity, Task, Tools).
 
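
The README paragraph above describes routing a request to a model chosen from a pool according to the request's intent. A minimal Go sketch of that idea follows. It is not the project's implementation: plain keyword matching stands in for the semantic classifier, and the names `modelPool`, `categoryKeywords`, `classify`, `route`, and every model name are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// modelPool maps a category to a model name.
// Both the categories and the model names are hypothetical.
var modelPool = map[string]string{
	"math":     "math-model",
	"code":     "code-model",
	"creative": "creative-model",
}

// defaultModel is used when no category matches (hypothetical name).
const defaultModel = "general-model"

// categoryKeywords is an ordered list of categories and trigger words.
// Assumption: keyword matching replaces a trained classifier here.
var categoryKeywords = []struct {
	category string
	keywords []string
}{
	{"math", []string{"integral", "equation", "prove"}},
	{"code", []string{"function", "compile", "bug"}},
	{"creative", []string{"poem", "story"}},
}

// classify returns the first category with a keyword found in the
// prompt, or the empty string when nothing matches.
func classify(prompt string) string {
	lower := strings.ToLower(prompt)
	for _, c := range categoryKeywords {
		for _, kw := range c.keywords {
			if strings.Contains(lower, kw) {
				return c.category
			}
		}
	}
	return ""
}

// route picks a model for the prompt and falls back to defaultModel.
func route(prompt string) string {
	if model, ok := modelPool[classify(prompt)]; ok {
		return model
	}
	return defaultModel
}

func main() {
	prompts := []string{
		"Solve the integral of x^2",
		"Fix the bug in this Go function",
		"Write a poem about autumn",
		"What is the capital of France?",
	}
	for _, p := range prompts {
		fmt.Printf("%q -> %s\n", p, route(p))
	}
}
```

An ordered slice is used for the keyword table so that the first matching category wins deterministically; iterating a Go map would give an unspecified order.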

deploy/kubernetes/README.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ The deployment consists of:
 
 ## Ports
 
-- **50051**: gRPC API (LLM Semantic Router ExtProc)
+- **50051**: gRPC API (vLLM Semantic Router ExtProc)
 - **9190**: Prometheus metrics
 
 ## Deployment

src/semantic-router/cmd/main.go

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ func main() {
 		log.Fatalf("Failed to create ExtProc server: %v", err)
 	}
 
-	log.Printf("Starting LLM Semantic Router ExtProc with config: %s", *configPath)
+	log.Printf("Starting vLLM Semantic Router ExtProc with config: %s", *configPath)
 
 	// Start Classification API server if enabled
 	if *enableAPI {

website/.docusaurus/client-manifest.json

Lines changed: 18 additions & 18 deletions
@@ -366,9 +366,9 @@
 "849": {
 "js": [
 {
-"file": "assets/js/0058b4c6.33f169dd.js",
-"hash": "842afc77d0aa620b",
-"publicPath": "/assets/js/0058b4c6.33f169dd.js"
+"file": "assets/js/0058b4c6.5774ef6d.js",
+"hash": "72d4499e535070c8",
+"publicPath": "/assets/js/0058b4c6.5774ef6d.js"
 }
 ]
 },
@@ -429,9 +429,9 @@
 "1869": {
 "css": [
 {
-"file": "assets/css/styles.f55e26d4.css",
-"hash": "3f6d30ecd8d89ed0",
-"publicPath": "/assets/css/styles.f55e26d4.css"
+"file": "assets/css/styles.267b8a8e.css",
+"hash": "8a94587058cfc753",
+"publicPath": "/assets/css/styles.267b8a8e.css"
 }
 ]
 },
@@ -492,9 +492,9 @@
 "2634": {
 "js": [
 {
-"file": "assets/js/c4f5d8e4.f45b1ce6.js",
-"hash": "80e68f3177a7ce28",
-"publicPath": "/assets/js/c4f5d8e4.f45b1ce6.js"
+"file": "assets/js/c4f5d8e4.b7348ab3.js",
+"hash": "cb6219784beae8ad",
+"publicPath": "/assets/js/c4f5d8e4.b7348ab3.js"
 }
 ]
 },
@@ -555,9 +555,9 @@
 "3976": {
 "js": [
 {
-"file": "assets/js/0e384e19.f8f3d3f3.js",
-"hash": "8c33224767770a06",
-"publicPath": "/assets/js/0e384e19.f8f3d3f3.js"
+"file": "assets/js/0e384e19.07a9307d.js",
+"hash": "f883753ab784d216",
+"publicPath": "/assets/js/0e384e19.07a9307d.js"
 }
 ]
 },
@@ -663,9 +663,9 @@
 "5354": {
 "js": [
 {
-"file": "assets/js/runtime~main.71ea62a3.js",
-"hash": "e2dce0d6e0f4c1f2",
-"publicPath": "/assets/js/runtime~main.71ea62a3.js"
+"file": "assets/js/runtime~main.9ae70a89.js",
+"hash": "fd9cd34fb19d0d1d",
+"publicPath": "/assets/js/runtime~main.9ae70a89.js"
 }
 ]
 },
@@ -753,9 +753,9 @@
 "7082": {
 "js": [
 {
-"file": "assets/js/4bf05604.88ac84d0.js",
-"hash": "4ec1afea60e64ac0",
-"publicPath": "/assets/js/4bf05604.88ac84d0.js"
+"file": "assets/js/4bf05604.e4033055.js",
+"hash": "e142232da15bc602",
+"publicPath": "/assets/js/4bf05604.e4033055.js"
 }
 ]
 },
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
{"version":{"pluginId":"default","version":"current","label":"Next","banner":null,"badge":false,"noIndex":false,"className":"docs-version-current","isLast":true,"docsSidebars":{"tutorialSidebar":[{"type":"link","label":"LLM Semantic Router","href":"/docs/intro","docId":"intro","unlisted":false},{"type":"category","label":"Overview","items":[{"type":"link","label":"Semantic Router Overview","href":"/docs/overview/semantic-router-overview","docId":"overview/semantic-router-overview","unlisted":false},{"type":"link","label":"Why Mixture of Models?","href":"/docs/overview/mixture-of-models","docId":"overview/mixture-of-models","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"Architecture","items":[{"type":"link","label":"System Architecture","href":"/docs/architecture/system-architecture","docId":"architecture/system-architecture","unlisted":false},{"type":"link","label":"Envoy ExtProc Integration","href":"/docs/architecture/envoy-extproc","docId":"architecture/envoy-extproc","unlisted":false},{"type":"link","label":"Router Implementation Details","href":"/docs/architecture/router-implementation","docId":"architecture/router-implementation","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"Model Training","items":[{"type":"link","label":"Model Training Overview","href":"/docs/training/training-overview","docId":"training/training-overview","unlisted":false},{"type":"link","label":"Classification Models","href":"/docs/training/classification-models","docId":"training/classification-models","unlisted":false},{"type":"link","label":"Datasets and Purposes","href":"/docs/training/datasets","docId":"training/datasets","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"Getting Started","items":[{"type":"link","label":"Installation Guide","href":"/docs/getting-started/installation","docId":"getting-started/installation","unlisted":false},{"type":"link","label":"Quick Start 
Guide","href":"/docs/getting-started/quick-start","docId":"getting-started/quick-start","unlisted":false},{"type":"link","label":"Configuration Guide","href":"/docs/getting-started/configuration","docId":"getting-started/configuration","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"API Reference","items":[{"type":"link","label":"Router API Reference","href":"/docs/api/router","docId":"api/router","unlisted":false},{"type":"link","label":"Classification API Reference","href":"/docs/api/classification","docId":"api/classification","unlisted":false}],"collapsed":true,"collapsible":true}]},"docs":{"api/classification":{"id":"api/classification","title":"Classification API Reference","description":"The Classification API provides direct access to the Semantic Router's classification models for intent detection, PII identification, and security analysis. This API is useful for testing, debugging, and standalone classification tasks.","sidebar":"tutorialSidebar"},"api/router":{"id":"api/router","title":"Router API Reference","description":"The Semantic Router provides a gRPC-based API that integrates seamlessly with Envoy's External Processing (ExtProc) protocol. This document covers the API endpoints, request/response formats, and integration patterns.","sidebar":"tutorialSidebar"},"architecture/envoy-extproc":{"id":"architecture/envoy-extproc","title":"Envoy ExtProc Integration","description":"The Semantic Router leverages Envoy's External Processing (ExtProc) filter to implement intelligent routing decisions. 
This integration provides a clean separation between traffic management (Envoy) and business logic (Semantic Router), enabling sophisticated routing capabilities while maintaining high performance.","sidebar":"tutorialSidebar"},"architecture/router-implementation":{"id":"architecture/router-implementation","title":"Router Implementation Details","description":"This document provides detailed insights into the core routing algorithms, classification logic, and implementation specifics of the Semantic Router.","sidebar":"tutorialSidebar"},"architecture/system-architecture":{"id":"architecture/system-architecture","title":"System Architecture","description":"The Semantic Router implements a sophisticated Mixture-of-Models (MoM) architecture using Envoy Proxy as the foundation, with an External Processor (ExtProc) service that provides intelligent routing capabilities. This design ensures high performance, scalability, and maintainability for production LLM deployments.","sidebar":"tutorialSidebar"},"getting-started/configuration":{"id":"getting-started/configuration","title":"Configuration Guide","description":"This guide covers all configuration options available in the Semantic Router, from basic setup to advanced customization for production deployments.","sidebar":"tutorialSidebar"},"getting-started/installation":{"id":"getting-started/installation","title":"Installation Guide","description":"This guide will help you set up and install the Semantic Router on your system. The installation process includes setting up dependencies, downloading models, and configuring the routing system.","sidebar":"tutorialSidebar"},"getting-started/quick-start":{"id":"getting-started/quick-start","title":"Quick Start Guide","description":"This guide will get you up and running with the Semantic Router in just a few minutes. 
Follow these steps to see the router in action with intelligent model selection.","sidebar":"tutorialSidebar"},"intro":{"id":"intro","title":"LLM Semantic Router","description":"License","sidebar":"tutorialSidebar"},"overview/mixture-of-models":{"id":"overview/mixture-of-models","title":"Why Mixture of Models?","description":"The Mixture of Models (MoM) approach represents a fundamental shift from traditional single-model deployment to a more intelligent, cost-effective, and performance-optimized architecture. This section explores the compelling reasons why MoM has become the preferred approach for production LLM deployments.","sidebar":"tutorialSidebar"},"overview/semantic-router-overview":{"id":"overview/semantic-router-overview","title":"Semantic Router Overview","description":"Semantic routers represent a paradigm shift in how we deploy and utilize large language models at scale. By intelligently routing queries to the most appropriate model based on semantic understanding, these systems optimize the critical balance between performance, cost, and quality.","sidebar":"tutorialSidebar"},"training/classification-models":{"id":"training/classification-models","title":"Classification Models","description":"This document provides in-depth technical details about each classification model used in the Semantic Router, including architecture specifics, training procedures, and performance characteristics.","sidebar":"tutorialSidebar"},"training/datasets":{"id":"training/datasets","title":"Datasets and Purposes","description":"This document provides comprehensive details about the datasets used to train each classification model in the Semantic Router, including data sources, preprocessing methods, and the specific purposes each dataset serves in the routing pipeline.","sidebar":"tutorialSidebar"},"training/training-overview":{"id":"training/training-overview","title":"Model Training Overview","description":"The Semantic Router relies on multiple specialized 
classification models to make intelligent routing decisions. This section provides a comprehensive overview of the training process, datasets used, and the purpose of each model in the routing pipeline.","sidebar":"tutorialSidebar"}}}}
1+
{"version":{"pluginId":"default","version":"current","label":"Next","banner":null,"badge":false,"noIndex":false,"className":"docs-version-current","isLast":true,"docsSidebars":{"tutorialSidebar":[{"type":"link","label":"vLLM Semantic Router","href":"/docs/intro","docId":"intro","unlisted":false},{"type":"category","label":"Overview","items":[{"type":"link","label":"Semantic Router Overview","href":"/docs/overview/semantic-router-overview","docId":"overview/semantic-router-overview","unlisted":false},{"type":"link","label":"Why Mixture of Models?","href":"/docs/overview/mixture-of-models","docId":"overview/mixture-of-models","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"Architecture","items":[{"type":"link","label":"System Architecture","href":"/docs/architecture/system-architecture","docId":"architecture/system-architecture","unlisted":false},{"type":"link","label":"Envoy ExtProc Integration","href":"/docs/architecture/envoy-extproc","docId":"architecture/envoy-extproc","unlisted":false},{"type":"link","label":"Router Implementation Details","href":"/docs/architecture/router-implementation","docId":"architecture/router-implementation","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"Model Training","items":[{"type":"link","label":"Model Training Overview","href":"/docs/training/training-overview","docId":"training/training-overview","unlisted":false},{"type":"link","label":"Classification Models","href":"/docs/training/classification-models","docId":"training/classification-models","unlisted":false},{"type":"link","label":"Datasets and Purposes","href":"/docs/training/datasets","docId":"training/datasets","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"Getting Started","items":[{"type":"link","label":"Installation Guide","href":"/docs/getting-started/installation","docId":"getting-started/installation","unlisted":false},{"type":"link","label":"Quick Start 
Guide","href":"/docs/getting-started/quick-start","docId":"getting-started/quick-start","unlisted":false},{"type":"link","label":"Configuration Guide","href":"/docs/getting-started/configuration","docId":"getting-started/configuration","unlisted":false}],"collapsed":true,"collapsible":true},{"type":"category","label":"API Reference","items":[{"type":"link","label":"Router API Reference","href":"/docs/api/router","docId":"api/router","unlisted":false},{"type":"link","label":"Classification API Reference","href":"/docs/api/classification","docId":"api/classification","unlisted":false}],"collapsed":true,"collapsible":true}]},"docs":{"api/classification":{"id":"api/classification","title":"Classification API Reference","description":"The Classification API provides direct access to the Semantic Router's classification models for intent detection, PII identification, and security analysis. This API is useful for testing, debugging, and standalone classification tasks.","sidebar":"tutorialSidebar"},"api/router":{"id":"api/router","title":"Router API Reference","description":"The Semantic Router provides a gRPC-based API that integrates seamlessly with Envoy's External Processing (ExtProc) protocol. This document covers the API endpoints, request/response formats, and integration patterns.","sidebar":"tutorialSidebar"},"architecture/envoy-extproc":{"id":"architecture/envoy-extproc","title":"Envoy ExtProc Integration","description":"The Semantic Router leverages Envoy's External Processing (ExtProc) filter to implement intelligent routing decisions. 
This integration provides a clean separation between traffic management (Envoy) and business logic (Semantic Router), enabling sophisticated routing capabilities while maintaining high performance.","sidebar":"tutorialSidebar"},"architecture/router-implementation":{"id":"architecture/router-implementation","title":"Router Implementation Details","description":"This document provides detailed insights into the core routing algorithms, classification logic, and implementation specifics of the Semantic Router.","sidebar":"tutorialSidebar"},"architecture/system-architecture":{"id":"architecture/system-architecture","title":"System Architecture","description":"The Semantic Router implements a sophisticated Mixture-of-Models (MoM) architecture using Envoy Proxy as the foundation, with an External Processor (ExtProc) service that provides intelligent routing capabilities. This design ensures high performance, scalability, and maintainability for production LLM deployments.","sidebar":"tutorialSidebar"},"getting-started/configuration":{"id":"getting-started/configuration","title":"Configuration Guide","description":"This guide covers all configuration options available in the Semantic Router, from basic setup to advanced customization for production deployments.","sidebar":"tutorialSidebar"},"getting-started/installation":{"id":"getting-started/installation","title":"Installation Guide","description":"This guide will help you set up and install the Semantic Router on your system. The installation process includes setting up dependencies, downloading models, and configuring the routing system.","sidebar":"tutorialSidebar"},"getting-started/quick-start":{"id":"getting-started/quick-start","title":"Quick Start Guide","description":"This guide will get you up and running with the Semantic Router in just a few minutes. 
Follow these steps to see the router in action with intelligent model selection.","sidebar":"tutorialSidebar"},"intro":{"id":"intro","title":"vLLM Semantic Router","description":"License","sidebar":"tutorialSidebar"},"overview/mixture-of-models":{"id":"overview/mixture-of-models","title":"Why Mixture of Models?","description":"The Mixture of Models (MoM) approach represents a fundamental shift from traditional single-model deployment to a more intelligent, cost-effective, and performance-optimized architecture. This section explores the compelling reasons why MoM has become the preferred approach for production LLM deployments.","sidebar":"tutorialSidebar"},"overview/semantic-router-overview":{"id":"overview/semantic-router-overview","title":"Semantic Router Overview","description":"Semantic routers represent a paradigm shift in how we deploy and utilize large language models at scale. By intelligently routing queries to the most appropriate model based on semantic understanding, these systems optimize the critical balance between performance, cost, and quality.","sidebar":"tutorialSidebar"},"training/classification-models":{"id":"training/classification-models","title":"Classification Models","description":"This document provides in-depth technical details about each classification model used in the Semantic Router, including architecture specifics, training procedures, and performance characteristics.","sidebar":"tutorialSidebar"},"training/datasets":{"id":"training/datasets","title":"Datasets and Purposes","description":"This document provides comprehensive details about the datasets used to train each classification model in the Semantic Router, including data sources, preprocessing methods, and the specific purposes each dataset serves in the routing pipeline.","sidebar":"tutorialSidebar"},"training/training-overview":{"id":"training/training-overview","title":"Model Training Overview","description":"The Semantic Router relies on multiple specialized 
classification models to make intelligent routing decisions. This section provides a comprehensive overview of the training process, datasets used, and the purpose of each model in the routing pipeline.","sidebar":"tutorialSidebar"}}}}
