
Commit 1077607

test image tag

1 parent cf68f98 commit 1077607

2 files changed: +2 −0 lines

_config.yml (1 addition, 0 deletions)

```diff
@@ -3,6 +3,7 @@ author:
 name: © 2024. vLLM Team. All rights reserved.
 github: https://github.com/vllm-project/vllm
 google_analytics: G-9C5R3JR3QS
+url: blog.vllm.ai

 # The `>` after `description:` means to ignore line-breaks until next key.
 # If you want to omit the line-break after the end of text, use `>-` instead.
```

_posts/2024-10-23-vllm-serving-amd.md (1 addition, 0 deletions)

```diff
@@ -2,6 +2,7 @@
 layout: post
 title: "Serving LLMs on AMD MI300X: Best Practices"
 author: "Guest Post by Embedded LLM and Hot Aisle Inc."
+image: /assets/figures/vllm-serving-amd/405b1.png
 ---

 **TL;DR:** vLLM unlocks incredible performance on the AMD MI300X, achieving 1.5x higher throughput and 1.7x faster time-to-first-token (TTFT) than Text Generation Inference (TGI) for Llama 3.1 405B. It also achieves 1.8x higher throughput and 5.1x faster TTFT than TGI for Llama 3.1 70B. This guide explores 8 key vLLM settings to maximize efficiency, showing you how to leverage the power of open-source LLM inference on AMD. If you just want to see the optimal parameters, jump to the [Quick Start Guide](#quick-start-guide).
```
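The `image:` key added in this commit is a Jekyll front-matter field that themes and plugins such as jekyll-seo-tag commonly read to populate the social-preview (Open Graph / Twitter card) image for a post. After the commit, the post's front matter would read roughly as follows; this is a sketch reconstructed from the diff above under the assumption that no other front-matter keys exist in the file:

```yaml
# Front matter of _posts/2024-10-23-vllm-serving-amd.md after this commit
# (reconstructed from the hunk; the real file may contain additional keys).
layout: post
title: "Serving LLMs on AMD MI300X: Best Practices"
author: "Guest Post by Embedded LLM and Hot Aisle Inc."
image: /assets/figures/vllm-serving-amd/405b1.png
```

With a theme that supports it, this typically renders as a `<meta property="og:image" …>` tag in the post's `<head>`, which is consistent with the commit message "test image tag".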
