
Conversation

@k50112113 commented Sep 29, 2025

Previously, VLLM_ROCM_USE_AITER_TRITON_FUSED_ROPE_ZEROS_KV_CACHE would be disabled whenever VLLM_ROCM_USE_AITER_MHA was turned on.

This PR allows VLLM_ROCM_USE_AITER_MHA and VLLM_ROCM_USE_AITER_TRITON_FUSED_ROPE_ZEROS_KV_CACHE to be turned on at the same time, as sketched below.

This change affects Llama and GPT-OSS models.
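
A minimal sketch of enabling both flags together after this change, assuming vLLM reads these variables from the process environment at import time; the parent VLLM_ROCM_USE_AITER switch and the model name below are illustrative assumptions, not taken from this PR:

```python
import os

# Set the flags before importing vLLM, since vLLM snapshots its
# environment-variable config at import/engine-init time.
os.environ["VLLM_ROCM_USE_AITER"] = "1"  # assumption: parent AITER switch
os.environ["VLLM_ROCM_USE_AITER_MHA"] = "1"
os.environ["VLLM_ROCM_USE_AITER_TRITON_FUSED_ROPE_ZEROS_KV_CACHE"] = "1"

from vllm import LLM

# Illustrative model choice; any Llama-family checkpoint would exercise
# the fused-RoPE + AITER MHA path described in this PR.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
outputs = llm.generate("Hello, world")
for out in outputs:
    print(out.outputs[0].text)
```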
