# vLLM Semantic Router + Milvus: How Semantic Routing and Caching Build Scalable AI Systems the Smart Way
Most AI apps rely on a single model for every request. But that approach quickly runs into limits. Large models are powerful yet expensive, even when they're used for simple queries. Smaller models are cheaper and faster but can't handle complex reasoning. When traffic surges—say your AI app suddenly goes viral with ten million users overnight—the inefficiency of this one-model-for-all setup becomes painfully apparent. Latency spikes, GPU bills explode, and the model that ran fine yesterday starts gasping for air.
<!-- truncate -->
And my friend, you, the engineer behind this app, have to fix it—fast.
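The first lever is routing: classify each request and send it to the cheapest model that can handle it. As a toy illustration only — the model names and the keyword heuristic below are made up, and the actual Semantic Router classifies intent with embeddings rather than keywords:

```python
def pick_model(query: str) -> str:
    # Toy complexity heuristic. The real router embeds the query and
    # classifies its intent; keywords stand in here for illustration only.
    hard_signals = ("prove", "derive", "analyze", "step by step", "why")
    if any(s in query.lower() for s in hard_signals) or len(query.split()) > 40:
        return "large-model"   # hypothetical model name
    return "small-model"       # hypothetical model name

print(pick_model("What time is it in Tokyo?"))
print(pick_model("Analyze the tradeoffs between sharding and replication"))
```

Even this crude split captures the economics: the simple query never touches the expensive model.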
Enterprise queries tend to repeat over time—policy lookups, compliance references, product FAQs. With Milvus as the semantic cache layer, frequently asked questions and their answers can be stored and retrieved efficiently. This minimizes redundant computation while keeping responses consistent across departments and regions.
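The pattern can be sketched in a few lines. Here a toy bag-of-words vector and an in-memory list stand in for a real embedding model and a Milvus collection, and the similarity threshold is illustrative:

```python
import math
from collections import Counter
from typing import Optional

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words vector.
    # A production deployment would embed with a model and store the
    # vectors in a Milvus collection instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """In-memory stand-in for the Milvus-backed semantic cache layer."""

    def __init__(self, threshold: float = 0.8):  # threshold is illustrative
        self.threshold = threshold
        self.entries = []  # list of (query vector, cached answer)

    def lookup(self, query: str) -> Optional[str]:
        qv = embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            return best[1]  # semantic hit: skip model generation entirely
        return None

    def store(self, query: str, answer: str) -> None:
        self.entries.append((embed(query), answer))

cache = SemanticCache()
cache.store("what is the refund policy", "Refunds are issued within 30 days.")
print(cache.lookup("what is the refund policy please"))  # near-duplicate: hit
print(cache.lookup("how do I reset my password"))        # distinct: miss
```

The key design point is the threshold: too low and users get stale or mismatched answers, too high and the cache never fires. Milvus makes the nearest-neighbor lookup fast at scale, which is what this toy list cannot do.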
Under the hood, the Semantic Router + Milvus pipeline is implemented in Go and Rust for high performance and low latency. Integrated at the gateway layer, it continuously monitors key metrics—like hit rates, routing latency, and model performance—to fine-tune routing strategies in real time.
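The hit-rate and latency signals that drive such tuning can be tracked with a simple rolling window. This is a hypothetical sketch of the idea, not the router's actual Go/Rust implementation:

```python
from collections import deque

class CacheMetrics:
    """Rolling window of cache outcomes — the kind of signal a gateway
    layer could use to tune routing and thresholds in near real time."""

    def __init__(self, window: int = 1000):
        self.outcomes = deque(maxlen=window)      # True = hit, False = miss
        self.latencies_ms = deque(maxlen=window)

    def record(self, hit: bool, latency_ms: float) -> None:
        self.outcomes.append(hit)
        self.latencies_ms.append(latency_ms)

    def hit_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def avg_latency_ms(self) -> float:
        return sum(self.latencies_ms) / len(self.latencies_ms) if self.latencies_ms else 0.0

m = CacheMetrics()
m.record(True, 3.2)     # cache hit: answered in a few milliseconds
m.record(False, 850.0)  # miss: full model generation
print(f"hit rate: {m.hit_rate():.0%}, avg latency: {m.avg_latency_ms():.1f} ms")
```

A falling hit rate or rising latency is exactly the kind of signal that would prompt loosening the similarity threshold or re-weighting routes.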
## How to Quickly Test the Semantic Caching in the Semantic Router
Before deploying semantic caching at scale, it's useful to validate how it behaves in a controlled setup. In this section, we'll walk through a quick local test that shows how the Semantic Router uses Milvus as its semantic cache. You'll see how similar queries hit the cache instantly while new or distinct ones trigger model generation—proving the caching logic in action.
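The behavior you should expect can be previewed in-process before touching a live endpoint. In this sketch, `normalize` crudely stands in for embedding similarity and `time.sleep` simulates model generation; every name here is hypothetical, not the router's API:

```python
import time

CACHE = {}  # exact-match stand-in for the Milvus semantic index

def normalize(q: str) -> str:
    # Crude stand-in for embedding similarity: lowercase and strip
    # punctuation, so trivially rephrased queries collapse to one key.
    return "".join(c for c in q.lower() if c.isalnum() or c.isspace()).strip()

def answer(query: str):
    key = normalize(query)
    if key in CACHE:
        return CACHE[key], True   # cache hit: no model call
    time.sleep(0.2)               # simulate model generation latency
    result = f"generated answer for: {key}"
    CACHE[key] = result
    return result, False

for q in ["What is Milvus?", "what is Milvus"]:
    start = time.perf_counter()
    _, hit = answer(q)
    print(f"{q!r}: hit={hit}, took {time.perf_counter() - start:.3f}s")
```

The first query pays the generation cost; the rephrased second one returns almost instantly — the same hit/miss split you should see in the real test below, with embedding search in place of string normalization.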
Start the Milvus service.

```bash
docker-compose up -d
```

Then check that the containers are running.

```bash
docker-compose ps -a
```
### 2. Clone the project
```bash
git clone https://github.com/vllm-project/semantic-router.git
```
In short, you get smarter scaling—less brute force, more brains.
---
If you'd like to explore this further, join the conversation in our Milvus Discord or open an issue on GitHub. You can also book a 20-minute Milvus Office Hours session for one-on-one guidance, insights, and technical deep dives from the team behind Milvus.