diff --git a/website/docs/intro.md b/website/docs/intro.md
index 552c2435..89255714 100644
--- a/website/docs/intro.md
+++ b/website/docs/intro.md
@@ -46,8 +46,10 @@ Our testing shows significant improvements in model accuracy through specialized
 
 ## 🛠️ Architecture Overview
 
-```mermaid
-graph TB
+import ZoomableMermaid from '@site/src/components/ZoomableMermaid';
+
+
+{`graph TB
     Client[Client Request] --> Envoy[Envoy Proxy]
     Envoy --> Router[Semantic Router ExtProc]
 
@@ -74,8 +76,8 @@ graph TB
     Models --> Math[Math Model]
     Models --> Creative[Creative Model]
     Models --> Code[Code Model]
-    Models --> General[General Model]
-```
+    Models --> General[General Model]`}
+
 
 ## 🎯 Use Cases
 
diff --git a/website/docs/overview/mixture-of-models.md b/website/docs/overview/mixture-of-models.md
index fc130fee..6e9dbb3e 100644
--- a/website/docs/overview/mixture-of-models.md
+++ b/website/docs/overview/mixture-of-models.md
@@ -100,8 +100,10 @@ Different models excel at different tasks. MoM leverages this specialization:
 
 #### 3. **Improved System Reliability**
 
-```mermaid
-graph TB
+import ZoomableMermaid from '@site/src/components/ZoomableMermaid';
+
+
+{`graph TB
     subgraph "Single Model Risk"
         SingleQuery[Query] --> SingleModel[GPT-4]
         SingleModel -->|Failure| SingleFailure[Complete System Down]
@@ -114,8 +116,8 @@ graph TB
         Router --> Model3[Model C]
         Model1 -->|Failure| Fallback[Automatic Fallback]
         Fallback --> Model2
-    end
-```
+    end`}
+
 
 **Reliability Benefits:**
 
@@ -276,8 +278,8 @@ subject_routing = {
 
 MoM architecture supports various deployment strategies:
 
-```mermaid
-graph TB
+
+{`graph TB
     subgraph "Cloud Deployment"
         CloudQueries[Queries] --> CloudRouter[Cloud Router]
         CloudRouter --> OpenAI[OpenAI GPT]
@@ -295,8 +297,8 @@ graph TB
         OnPremQueries[Queries] --> OnPremRouter[On-Prem Router]
         OnPremRouter --> LocalLLaMA[Local LLaMA Models]
         OnPremRouter --> FineTuned[Fine-tuned Specialized Models]
-    end
-```
+    end`}
+
 
 ### 2. **A/B Testing and Gradual Rollouts**
 
diff --git a/website/docs/overview/semantic-router-overview.md b/website/docs/overview/semantic-router-overview.md
index 958f04d6..7cd92054 100644
--- a/website/docs/overview/semantic-router-overview.md
+++ b/website/docs/overview/semantic-router-overview.md
@@ -151,8 +151,10 @@ GPT-5 introduces a revolutionary **router-as-coordinator** architecture:
 
 **Operational Flow:**
 
-```mermaid
-sequenceDiagram
+import ZoomableMermaid from '@site/src/components/ZoomableMermaid';
+
+
+{`sequenceDiagram
     participant User
     participant Router as GPT-5 Router
     participant Math as Math Specialist
@@ -164,8 +166,8 @@ sequenceDiagram
     Router->>Router: Analyze query intent
     Router->>Math: Route to math specialist
     Math->>Router: Mathematical solution
-    Router->>User: Optimized response
-```
+    Router->>User: Optimized response`}
+
 
 **Business Impact:**
 
diff --git a/website/docs/proposals/nvidia-dynamo-integration.md b/website/docs/proposals/nvidia-dynamo-integration.md
index ea44f635..a6933ec3 100644
--- a/website/docs/proposals/nvidia-dynamo-integration.md
+++ b/website/docs/proposals/nvidia-dynamo-integration.md
@@ -528,8 +528,11 @@ prompt_guard:
 
 ### 4.3 System Architecture
 
-```mermaid
-graph TB
+import ZoomableMermaid from '@site/src/components/ZoomableMermaid';
+
+
+
+{`graph TB
     Client[LLM Application<br/>OpenAI SDK]
 
     subgraph Main["Main Processing Flow"]
@@ -628,8 +631,8 @@ graph TB
     style DynamoRouter fill:#c8e6c9
     style SemanticCache fill:#fff9c4
     style KVBM fill:#fff9c4
-    style SL fill:#f5f5f5
-```
+    style SL fill:#f5f5f5`}
+
 
 **Architecture Layers:**
 
diff --git a/website/docs/proposals/prompt-classification-routing.md b/website/docs/proposals/prompt-classification-routing.md
index 653d8b59..9d683904 100644
--- a/website/docs/proposals/prompt-classification-routing.md
+++ b/website/docs/proposals/prompt-classification-routing.md
@@ -121,8 +121,10 @@ embedding_similarity:
 
 ### High-Level System Design
 
-```mermaid
-graph TD
+import ZoomableMermaid from '@site/src/components/ZoomableMermaid';
+
+
+{`graph TD
     A[Envoy External Processor<br/>semantic-router ExtProc] --> B[Request Handler<br/>handleModelRouting]
     B --> C{Execution Path}
 
@@ -166,8 +168,8 @@ graph TD
     style E2 fill:#fff9c4
     style F fill:#c8e6c9
     style H fill:#ffcdd2
-    style M fill:#c8e6c9
-```
+    style M fill:#c8e6c9`}
+
 
 ### Component Breakdown