
Commit 1077607

test image tag

1 parent cf68f98 commit 1077607

2 files changed: +2 −0 lines

_config.yml (1 addition, 0 deletions)

```diff
@@ -3,6 +3,7 @@ author:
 name: © 2024. vLLM Team. All rights reserved.
 github: https://github.com/vllm-project/vllm
 google_analytics: G-9C5R3JR3QS
+url: blog.vllm.ai

 # The `>` after `description:` means to ignore line-breaks until next key.
 # If you want to omit the line-break after the end of text, use `>-` instead.
```

_posts/2024-10-23-vllm-serving-amd.md (1 addition, 0 deletions)

```diff
@@ -2,6 +2,7 @@
 layout: post
 title: "Serving LLMs on AMD MI300X: Best Practices"
 author: "Guest Post by Embedded LLM and Hot Aisle Inc."
+image: /assets/figures/vllm-serving-amd/405b1.png
 ---

 **TL;DR:** vLLM unlocks incredible performance on the AMD MI300X, achieving 1.5x higher throughput and 1.7x faster time-to-first-token (TTFT) than Text Generation Inference (TGI) for Llama 3.1 405B. It also achieves 1.8x higher throughput and 5.1x faster TTFT than TGI for Llama 3.1 70B. This guide explores 8 key vLLM settings to maximize efficiency, showing you how to leverage the power of open-source LLM inference on AMD. If you just want to see the optimal parameters, jump to the [Quick Start Guide](#quick-start-guide).
```
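The `image:` key added in this commit is a Jekyll front-matter field that themes and plugins such as jekyll-seo-tag commonly read to populate the social-preview (Open Graph / Twitter card) image for a post. After the commit, the post's front matter would read roughly as follows; this is a sketch reconstructed from the diff above under the assumption that no other front-matter keys exist in the file:

```yaml
# Front matter of _posts/2024-10-23-vllm-serving-amd.md after this commit
# (reconstructed from the hunk; the real file may contain additional keys).
layout: post
title: "Serving LLMs on AMD MI300X: Best Practices"
author: "Guest Post by Embedded LLM and Hot Aisle Inc."
image: /assets/figures/vllm-serving-amd/405b1.png
```

With a theme that supports it, this typically renders as a `<meta property="og:image" …>` tag in the post's `<head>`, which is consistent with the commit message "test image tag".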
