Conversation

lauri9 (Owner) commented Oct 24, 2025

What does this PR do?

AITER is AMD's centralized repository of high-performance AI operators, such as attention kernels, for AMD ROCm-enabled accelerators. This PR adds support for FlashAttention through AITER by introducing a new attention backend.

Test code for Flux inference is below. It requires aiter>=0.15.0 and a supported ROCm-enabled accelerator.

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, attention_backend

model_id = "black-forest-labs/FLUX.1-dev"

# Load the transformer in bf16 and switch its attention implementation to the AITER backend.
transformer = FluxTransformer2DModel.from_pretrained(model_id, subfolder="transformer", torch_dtype=torch.bfloat16, device_map="cuda")
transformer.set_attention_backend("aiter")

# Build the pipeline around the patched transformer and move the remaining modules to the accelerator.
pipe = FluxPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=torch.bfloat16)
pipe.text_encoder.to("cuda")
pipe.text_encoder_2.to("cuda")
pipe.vae.to("cuda")

prompt = "A cat holding a sign that says 'hello world'"

image = pipe(prompt, num_inference_steps=28, guidance_scale=4.0).images[0]
image.save("output.png")
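
The attention_backend context manager imported above can also be used to enable the backend only for the duration of a call instead of setting it on the model. A minimal sketch, reusing the "aiter" registry name from the example:

# Scope the AITER backend to a single pipeline call.
with attention_backend("aiter"):
    image = pipe(prompt, num_inference_steps=28, guidance_scale=4.0).images[0]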

We are interested in following up on this PR by eventually also enabling support for context parallelism across multiple devices.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

lauri9 changed the title from "add aiter attention backend" to "Add support for AITER attention backend" on Oct 24, 2025

@_AttentionBackendRegistry.register(
AttentionBackendName.AITER,
constraints=[_check_device, _check_qkv_dtype_bf16_or_fp16, _check_shape],
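
For context, a minimal sketch of the kind of function such a registration would wrap, dispatching to AITER's FlashAttention kernel. The function name, signature, and the aiter.flash_attn_func entry point below are assumptions for illustration, not the exact code in this PR:

# Hypothetical sketch; the real implementation lives in diffusers' attention dispatch module.
def _aiter_flash_attention(query, key, value, dropout_p=0.0, is_causal=False, scale=None):
    import aiter  # assumed to expose a FlashAttention-style API on ROCm

    # aiter.flash_attn_func is assumed to accept (batch, seq_len, num_heads, head_dim)
    # tensors, mirroring the flash-attn package's interface.
    return aiter.flash_attn_func(
        query,
        key,
        value,
        dropout_p=dropout_p,
        softmax_scale=scale,
        causal=is_causal,
    )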
Reviewer commented:

Should we have _check_device_rocm? Not sure if one can accidentally install aiter on NV, but AFAIK nothing here would then prevent trying to run it.

lauri9 (Owner, Author) replied:

Hmm, perhaps we should change this to _check_device_cuda instead to guarantee that the tensors are on an accelerator device. CPU is not meaningful I suppose.

I tried installing aiter in an nvcr container, but it gives an error:

No ROCm runtime is found, using ROCM_HOME='None'
Traceback (most recent call last):
  File "/diffusers_aiter_backend/aiter/aiter/jit/utils/cpp_extension.py", line 90, in get_hip_version
    hipconfig = executable_path("hipconfig")
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/diffusers_aiter_backend/aiter/aiter/jit/utils/cpp_extension.py", line 83, in executable_path
    path is not None
AssertionError: Could not find hipconfig in PATH or ROCM_HOME(None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/diffusers_aiter_backend/aiter/setup.py", line 16, in <module>
    from jit import core
  File "/diffusers_aiter_backend/aiter/aiter/jit/core.py", line 23, in <module>
    from chip_info import get_gfx
  File "/diffusers_aiter_backend/aiter/aiter/jit/utils/chip_info.py", line 9, in <module>
    from cpp_extension import executable_path
  File "/diffusers_aiter_backend/aiter/aiter/jit/utils/cpp_extension.py", line 173, in <module>
    HIP_VERSION = get_hip_version()
                  ^^^^^^^^^^^^^^^^^
  File "/diffusers_aiter_backend/aiter/aiter/jit/utils/cpp_extension.py", line 94, in get_hip_version
    raise RuntimeError("ROCm version file not found")
RuntimeError: ROCm version file not found

IMO, this is a corner case that is probably not worth adding a separate function to check for - a user that has managed to install ROCm and CUDA on the same platform but only has one type of accelerator has already taken many wrong turns 😁
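
For reference, a minimal sketch of what a _check_device_cuda-style constraint could look like; the signature and error type are assumptions for illustration, not the actual diffusers helpers:

# Hypothetical sketch of a constraint that rejects CPU tensors before dispatch.
# Note that ROCm builds of PyTorch also report the device type as "cuda",
# so this single check would cover both NVIDIA and AMD accelerators.
def _check_device_cuda(query, key, value, **kwargs):
    for name, tensor in (("query", query), ("key", key), ("value", value)):
        if tensor.device.type != "cuda":
            raise ValueError(f"{name} must be on an accelerator device, got {tensor.device.type}.")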

Reviewer replied:

Yeah, makes sense 😆

),
(
"aiter",
torch.tensor([0.0781, 0.0820, 0.0879, 0.0957, 0.0898, 0.0938, 0.0957, 0.0957, 0.2285, 0.2363, 0.2461, 0.2637, 0.2695, 0.2617, 0.2617, 0.2891], dtype=torch.bfloat16),
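
For reference, a minimal sketch of the kind of comparison these parametrized expected slices typically feed; the helper name, slicing convention, and tolerances are assumptions, not the actual test code:

# Hypothetical sketch: compare a fixed slice of the backend's output against the recorded values.
output = run_pipeline_with_backend("aiter")  # assumed helper returning a bf16 output tensor
flat = output.flatten()
actual_slice = torch.cat([flat[:8], flat[-8:]]).cpu()
assert torch.allclose(actual_slice, expected_slice, atol=1e-3, rtol=1e-3)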
Reviewer commented:

Is there some way to not run these on NV devices? I don't see any flags here that would skip them.

lauri9 (Owner, Author) replied:

Running without aiter installed leads to xfail on these tests. Trying in an MI355X environment without aiter xfails everything except native (which fails, probably due to the brittleness of tests that compare numerical values) and cudnn (which fails because the kernel is not available).

=============================================================================================================================== short test summary info ================================================================================================================================
XFAIL tests/others/test_attention_backends.py::test_forward[flash_hub] - Backend 'flash_hub' not supported in this environment.
XFAIL tests/others/test_attention_backends.py::test_forward[_flash_3_hub] - Backend '_flash_3_hub' not supported in this environment.
XFAIL tests/others/test_attention_backends.py::test_forward[aiter] - Backend 'aiter' not supported in this environment.
XFAIL tests/others/test_attention_backends.py::test_forward_with_compile[flash_hub] - Backend 'flash_hub' not supported in this environment.
XFAIL tests/others/test_attention_backends.py::test_forward_with_compile[_flash_3_hub] - Backend '_flash_3_hub' not supported in this environment.
XFAIL tests/others/test_attention_backends.py::test_forward_with_compile[aiter] - Backend 'aiter' not supported in this environment.
FAILED tests/others/test_attention_backends.py::test_forward[native] - assert False
FAILED tests/others/test_attention_backends.py::test_forward[_native_cudnn] - RuntimeError: No available kernel. Aborting execution.
FAILED tests/others/test_attention_backends.py::test_forward_with_compile[native] - assert False
FAILED tests/others/test_attention_backends.py::test_forward_with_compile[_native_cudnn] - torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in function scaled_dot_product_attention>(*(), **{'query': FakeTensor(..., device='cuda:0', size=(1, 24, 384, 128), dtype=torch.bfloat16), 'key': FakeTensor(..., device=...
====================================================================================================================== 4 failed, 6 xfailed, 17 warnings in 45.12s ======================================================================================================================
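
For reference, a minimal sketch of how such environment-dependent xfails can be expressed in pytest; the availability helper is hypothetical, not the actual diffusers test code:

import pytest

@pytest.mark.parametrize("backend", ["native", "_native_cudnn", "flash_hub", "aiter"])
def test_forward(backend):
    # Hypothetical availability check; unsupported backends are marked as expected failures.
    if not backend_is_supported(backend):
        pytest.xfail(f"Backend {backend!r} not supported in this environment.")
    ...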

Reviewer replied:

Ah, I see, then that's not an issue :)

avjves commented Oct 24, 2025

Left a couple of questions, but overall LGTM!

lauri9 force-pushed the add-aiter-backend branch from 7482105 to 89903c3 on October 27, 2025 at 09:52
josephrocca and others added 3 commits on October 27, 2025 at 16:25
…/Chroma1-HD` (huggingface#12508)

* [Fix] Move attention mask padding after T5 embedding

* [Fix] Move attention mask padding after T5 embedding

* Clean up whitespace in pipeline_chroma.py

Removed unnecessary blank lines for cleaner code.

* Fix

* Fix

* Update model to final Chroma1-HD checkpoint

* Update to Chroma1-HD

* Update model to Chroma1-HD

* Update model to Chroma1-HD

* Update Chroma model links to Chroma1-HD

* Add comment about padding/masking

* Fix checkpoint/repo references

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <[email protected]>