[core] MLA performance boost for AMD GPUs and tuned MoE config for MI… by qli88 · Pull Request #13439 · vllm-project/vllm

qli88 · 2025-02-18T00:19:01Z

Parameter tweaks in the MLA kernel for AMD GPUs to improve performance;
tuned MoE config for DS v3/r1 on MI300X.

github-actions · 2025-02-18T00:19:13Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

mgoin

LGTM with a small fix needed, cc @LucasWilkinson

vllm/attention/ops/triton_decode_attention.py


hongxiayang · 2025-02-21T01:28:12Z

cc @houseroad

hongxiayang · 2025-02-21T14:28:22Z

vllm/attention/ops/triton_decode_attention.py

    BLOCK = 64
+    if is_hip_:
+        BLOCK = 8
+


Suggested change

-    BLOCK = 64
-    if is_hip_:
-        BLOCK = 8
+    BLOCK = 64 if not is_hip_ else 8
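The original `if` override and the suggested one-line conditional select the same value. A minimal sketch of that equivalence follows, assuming two hypothetical helpers (`pick_block_if`, `pick_block_expr`) that take the platform flag as an argument. In the PR the flag is the existing `is_hip_` variable, and how it is set is not shown in this thread.

```python
def pick_block_if(is_hip: bool) -> int:
    """Form from the original diff: set a default, then override on HIP."""
    block = 64
    if is_hip:
        block = 8
    return block


def pick_block_expr(is_hip: bool) -> int:
    """Form from the suggested change: a single conditional expression."""
    return 64 if not is_hip else 8


if __name__ == "__main__":
    for flag in (False, True):
        # Both forms must agree for every value of the flag.
        assert pick_block_if(flag) == pick_block_expr(flag)
        print(f"is_hip={flag}: BLOCK={pick_block_expr(flag)}")
```

The values 64 and 8 are taken from the diff above; the helper names and the boolean parameter are illustrative only.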

houseroad · 2025-02-23T06:15:12Z

why do we close the PR?

qli88 · 2025-02-23T06:21:48Z

> why do we close the PR?

@houseroad I created a new PR to adapt to the commit that landed yesterday (#12639). Please take a look at that one (#13718).

qli88 force-pushed the mla_perf_boost_for_amd branch from 7e481ee to 7a76f70 on February 18, 2025 00:22

hongxiayang added the rocm (Related to AMD ROCm) label Feb 18, 2025

mgoin reviewed Feb 19, 2025

vllm/attention/ops/triton_decode_attention.py (outdated, resolved)

qli88 force-pushed the mla_perf_boost_for_amd branch 2 times, most recently from 70a9795 to 4f2422a, on February 19, 2025 20:00

hongxiayang reviewed Feb 20, 2025

vllm/attention/ops/triton_decode_attention.py (outdated, resolved)

qli88 and others added 3 commits February 20, 2025 16:31

[core] MLA performance boost for AMD GPUs and tuned MoE config for MI…

c3c4997

…300X Signed-off-by: qli88 <qiang.li2@amd.com>

remove the redundant line

a77294e

Signed-off-by: qli88 <qiang.li2@amd.com>

Update vllm/attention/ops/triton_decode_attention.py

148e877

Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com> Signed-off-by: qli88 <qiang.li2@amd.com>

qli88 force-pushed the mla_perf_boost_for_amd branch from c131305 to 148e877 on February 20, 2025 16:32

hongxiayang reviewed Feb 21, 2025


hongxiayang approved these changes Feb 21, 2025


qli88 closed this Feb 23, 2025

qli88 wants to merge 3 commits into vllm-project:main from ROCm:mla_perf_boost_for_amd

4 participants
