
Conversation

@SahilCarterr (Contributor) commented on Oct 15, 2024

What does this PR do?

Fixes #9637 by resolving the attention mask padding issue for compatibility with xFormers.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed.
@sayakpaul @yiyixuxu


Code from Issue

from diffusers.models.attention_processor import Attention, XFormersAttnProcessor
import torch

# Initialize the attention processor
attn_processor = XFormersAttnProcessor()

# Create the Attention module
attn = Attention(
    query_dim=256,
    heads=8,
    dim_head=64,
    processor=attn_processor,
).to(device="cuda", dtype=torch.bfloat16)

# Create dummy inputs; the key/value sequence length (700) is deliberately not a
# multiple of 8, which is what triggers the xFormers failure with a bfloat16 mask
q = torch.zeros((2, 350, 256), device="cuda", dtype=torch.bfloat16)
kv = torch.zeros((2, 700, 256), device="cuda", dtype=torch.bfloat16)
attn_mask = torch.zeros((2, 1, 700), device="cuda", dtype=torch.bfloat16)

# Perform the attention operation
out = attn(q, kv, attn_mask)

# Print the output shape
print(out.shape)

Output

torch.Size([2, 350, 256])

Hardware Information

  • GPU: NVIDIA A100
  • Environment: Google Colab
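
For context, the failure in #9637 occurs when the attention mask is bfloat16 and its last (key) dimension is not a multiple of 8, e.g. 700 above. Below is a minimal sketch of the padding idea this PR pursues: zero-pad the key dimension up to the next multiple of 8 and copy the original values into the new mask. The helper name is illustrative, not code from the diff.

import math
import torch


def pad_attention_mask(attention_mask: torch.Tensor) -> torch.Tensor:
    """Pad the last (key) dimension of a bfloat16 attention mask to a multiple of 8.

    Illustrative helper mirroring the padding idea in this PR, not the exact diff.
    """
    if attention_mask.dtype != torch.bfloat16 or attention_mask.shape[-1] % 8 == 0:
        return attention_mask

    padded_length = math.ceil(attention_mask.shape[-1] / 8) * 8
    padded = torch.zeros(
        (*attention_mask.shape[:-1], padded_length),
        device=attention_mask.device,
        dtype=attention_mask.dtype,
    )
    # Copy the original mask values; the extra columns stay zero.
    padded[..., : attention_mask.shape[-1]] = attention_mask
    return padded


# Example: the (2, 1, 700) mask from the reproduction above becomes (2, 1, 704),
# since 704 is the next multiple of 8, and its first 700 columns are preserved.
mask = torch.zeros((2, 1, 700), dtype=torch.bfloat16)
print(pad_attention_mask(mask).shape)  # torch.Size([2, 1, 704])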

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul (Member) left a comment


Thanks for the PR! I left some comments; let me know if they make sense.

Comment on lines 417 to 435
attention_mask_shape_before = attention_mask.shape[-1]
original_attention_mask = attention_mask
if attention_mask.dtype == torch.bfloat16 and attention_mask.shape[-1] % 8 != 0:
    padded_length = math.ceil(attention_mask.shape[-1] / 8) * 8
    mask = torch.zeros(
        (attention_mask.shape[0], attention_mask.shape[1], padded_length),
        device=attention_mask.device,
        dtype=attention_mask.dtype,
    )
    mask[:, :, : attention_mask.shape[-1]] = attention_mask
    attention_mask = mask

assert attention_mask.shape[-1] % 8 == 0, "Attention mask not padded to a multiple of 8"
assert attention_mask[:, :, :attention_mask_shape_before].equal(
    original_attention_mask[:, :, :attention_mask_shape_before]
), "Original values in attention mask are not preserved"

expanded_attention_mask = attention_mask.expand(-1, query_tokens, -1)

assert expanded_attention_mask.shape[1] == query_tokens, "Attention mask expansion for query tokens failed"

I think in this test we want a check for functional correctness instead of re-applying the same logic that we apply inside the attention processor class.

So we could first enable xFormers attention on the UNet, do a forward pass, and then design our tests accordingly.
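
Along those lines, here is a rough sketch of what such a functional test could look like. The tiny UNet config, sequence length, and tolerance below are illustrative assumptions rather than the test in this PR, and it requires a CUDA GPU with xformers installed.

import torch
from diffusers import UNet2DConditionModel

# Illustrative tiny UNet config (values chosen for speed, not taken from the PR).
unet = UNet2DConditionModel(
    sample_size=32,
    in_channels=4,
    out_channels=4,
    down_block_types=("CrossAttnDownBlock2D", "DownBlock2D"),
    up_block_types=("UpBlock2D", "CrossAttnUpBlock2D"),
    block_out_channels=(32, 64),
    layers_per_block=1,
    cross_attention_dim=32,
    attention_head_dim=8,
).to("cuda", dtype=torch.bfloat16)
unet.eval()

sample = torch.randn(2, 4, 32, 32, device="cuda", dtype=torch.bfloat16)
timestep = torch.tensor([1], device="cuda")
# 77 text tokens: the key sequence length is deliberately not a multiple of 8.
encoder_hidden_states = torch.randn(2, 77, 32, device="cuda", dtype=torch.bfloat16)
encoder_attention_mask = torch.ones(2, 77, device="cuda", dtype=torch.bool)

# Reference output with the default attention processor.
with torch.no_grad():
    ref = unet(
        sample, timestep, encoder_hidden_states,
        encoder_attention_mask=encoder_attention_mask,
    ).sample

# Same forward pass with xFormers enabled; before the padding fix this is where
# the bfloat16 / non-multiple-of-8 mask error surfaces.
unet.enable_xformers_memory_efficient_attention()
with torch.no_grad():
    out = unet(
        sample, timestep, encoder_hidden_states,
        encoder_attention_mask=encoder_attention_mask,
    ).sample

# Functional check: both processors should agree within a loose bf16 tolerance.
assert torch.allclose(ref.float(), out.float(), atol=1e-2)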

@SahilCarterr (Contributor, Author) commented

I have updated the test. Can you help me fix this error when I run the test script? @sayakpaul

RuntimeError: expand(CUDABFloat16Type{[16, 1, 1, 278]}, size=[16, 1, 278]): the number of sizes provided (3) must be greater or equal to the number of dimensions in the tensor (4)
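
As a side note, that RuntimeError is plain torch.Tensor.expand semantics: the mask at that point is 4-D ([16, 1, 1, 278] per the message) while expand was given only three sizes, and expand needs at least one size per existing dimension. The snippet below only illustrates the tensor mechanics with an assumed query_tokens value; it is not necessarily the right fix for the processor.

import torch

query_tokens = 350  # illustrative value
mask = torch.zeros(16, 1, 1, 278)

# This reproduces the error: 3 sizes given for a 4-D tensor.
# mask.expand(16, 1, 278)  # RuntimeError

# Either provide one size per existing dimension...
expanded_4d = mask.expand(-1, -1, query_tokens, -1)          # -> [16, 1, 350, 278]

# ...or drop the extra singleton dimension first.
expanded_3d = mask.squeeze(1).expand(-1, query_tokens, -1)   # -> [16, 350, 278]

print(expanded_4d.shape, expanded_3d.shape)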

@SahilCarterr closed this by deleting the head repository on Oct 23, 2024.


Development

Successfully merging this pull request may close these issues.

XFormer fails when passing attention mask while using bfloat and key's sequence length not being a multiple of 8
