# Reasoning Routing Quickstart
This short guide shows how to enable and verify “reasoning routing” in the Semantic Router:
4
+
4
5
- Minimal config.yaml fields you need
- Example request/response (OpenAI-compatible)
- A comprehensive evaluation command you can run
Prerequisites
- A running OpenAI-compatible backend for your models (e.g., vLLM). It must be reachable at the address:port pairs you configure under vllm_endpoints; a quick reachability check is shown after this list.
- Envoy + the router (see the Start the router section below)
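
Before starting the router, you can confirm the backend is live with a standard OpenAI-compatible call (adjust the address and port to match your vllm_endpoints entry):

```bash
# Any OpenAI-compatible server (vLLM included) exposes /v1/models;
# a JSON model list in the response confirms the backend is reachable.
curl -s http://127.0.0.1:8000/v1/models
```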
1) Minimal configuration
Put this in config/config.yaml (or merge into your existing config). It defines:
- Categories that require reasoning (e.g., math)
- Reasoning families for model syntax differences (DeepSeek/Qwen3 use chat_template_kwargs; GPT-OSS/GPT use reasoning_effort)
- Which concrete models use which reasoning family
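
The full example config is longer than what this guide needs; below is a minimal sketch of just these pieces. Key names and nesting are illustrative reconstructions from the fields referenced in this guide, so treat the project's sample config as authoritative:

```yaml
vllm_endpoints:                    # where your OpenAI-compatible backends live
  - name: local-vllm               # illustrative entry
    address: 127.0.0.1
    port: 8000

reasoning_families:                # per-family syntax for toggling reasoning
  qwen3:
    type: chat_template_kwargs     # DeepSeek/Qwen3-style syntax
  gpt-oss:
    type: reasoning_effort         # GPT-OSS/GPT-style syntax

model_config:
  qwen3-30b:
    reasoning_family: qwen3        # without this, no reasoning fields are injected

categories:
  - name: math
    use_reasoning: true            # math requests get reasoning enabled
    reasoning_effort: high         # optional per-category override

default_model: qwen3-30b
default_reasoning_effort: medium   # global fallback effort
```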
Notes
- Reasoning is controlled by categories.use_reasoning and optionally categories.reasoning_effort.
- A model only gets reasoning fields if it has a model_config.<MODEL>.reasoning_family that maps to a reasoning_families entry.
- DeepSeek/Qwen3 (chat_template_kwargs): the router injects chat_template_kwargs only when reasoning is enabled. When disabled, no chat_template_kwargs are added.
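
For example, a request routed to a Qwen3-family model with reasoning enabled might be rewritten upstream along these lines (the exact kwarg, enable_thinking here, depends on the model's chat template and is an assumption):

```json
{
  "model": "qwen3-30b",
  "messages": [{"role": "user", "content": "Prove that 2 + 2 = 4."}],
  "chat_template_kwargs": {"enable_thinking": true}
}
```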
2) Start the router
Option A: Local build + Envoy
- Download classifier models and mappings (required)
  - `make download-models`
- Build and run the router (a consolidated sketch follows this list)
- Run Envoy
  - `func-e run --config-path config/envoy.yaml --component-log-level "ext_proc:trace,router:trace,http:trace"`
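
The same steps as one runnable sequence. This is a sketch only: the build/run target names are hypothetical (check the repo's Makefile), while the other commands come from the list above:

```bash
# Fetch classifier models and mappings (required).
make download-models

# Hypothetical target names; verify against the repo's Makefile.
make build
make run-router

# In a second terminal: run Envoy with verbose ext_proc/router logging.
func-e run --config-path config/envoy.yaml \
  --component-log-level "ext_proc:trace,router:trace,http:trace"
```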
Option B: Docker Compose
- `docker compose up -d`
- Exposes Envoy at http://localhost:8801 (proxying /v1/* to backends via the router)
Note: Ensure your OpenAI-compatible backend is running and reachable (e.g., http://127.0.0.1:8000) so that vllm_endpoints address:port matches a live server. Without a running backend, routing will fail at the Envoy hop.
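
Once Envoy is up (either option), you can send a plain OpenAI-compatible request through it. A sketch, assuming the Docker Compose port above and the model name from the config in section 1; a math prompt like this should be classified into the math category and get reasoning enabled:

```bash
curl -s http://localhost:8801/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-30b",
    "messages": [
      {"role": "user", "content": "What is the derivative of x^3 + 2x?"}
    ]
  }'
```

The response is standard OpenAI chat-completion JSON; if the classifier put the prompt in the math category, the upstream request also carried the reasoning fields described in the Notes above.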

3) Comprehensive evaluation

You can benchmark the router against a direct vLLM endpoint across categories using the included script, which runs a ReasoningBench based on MMLU-Pro and produces summaries and plots.

Troubleshooting

- If your math request doesn’t enable reasoning, confirm the classifier assigns the "math" category with sufficient confidence (see classifier.category_model.threshold) and that the target model has a reasoning_family.
- For models without a reasoning_family, the router will not inject reasoning fields even when the category requires reasoning (this is by design to avoid invalid requests).
- You can override the effort per category via categories.reasoning_effort or set a global default via default_reasoning_effort (see the snippet after this list).
- Ensure your OpenAI-compatible backend is reachable at the configured vllm_endpoints (address:port). If it’s not running, routing will fail even though the router and Envoy are up.
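
As a quick reference for the knobs mentioned in this list, a sketch with illustrative values (the nesting is assumed from the key paths named above):

```yaml
classifier:
  category_model:
    threshold: 0.6                  # minimum classifier confidence; illustrative value

categories:
  - name: math
    use_reasoning: true
    reasoning_effort: high          # per-category override

default_reasoning_effort: medium    # global default when no category override
```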