Commit be3eb61

Author: yinmin
Commit message: delete image
1 parent 40913d7 commit be3eb61

1 file changed: +1 -2 lines changed

website/blog/2025-10-30-milvus.md

Lines changed: 1 addition & 2 deletions
@@ -258,15 +258,14 @@ sys 0m0.021s
 
 This test demonstrates Semantic Router's semantic caching in action. By leveraging Milvus as the vector database, it efficiently matches semantically similar queries, improving response times when users ask the same or similar questions.
 
-![performance-comparison](/img/performance-comparison.png)
+
 
 ## Conclusion
 
 As AI workloads grow and cost optimization becomes essential, the combination of vLLM Semantic Router and Milvus provides a practical way to scale intelligently. By routing each query to the right model and caching semantically similar results with a distributed vector database, this setup cuts compute overhead while keeping responses fast and consistent across use cases.
 
 In short, you get smarter scaling—less brute force, more brains.
 
-![smart-scaling](/img/smart-scaling.png)
 
 ---
 
