
Conversation


@dxqb dxqb commented Dec 26, 2025

Brings back attention backend selection, which was removed in 482333f when xformers was no longer used.

It can be used to select flash attention, which is faster on Windows because torch SDPA does not support its internal flash kernel there.
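A quick way to verify this on a given machine (a minimal sketch, assuming PyTorch >= 2.3 and a CUDA device; on torch builds without the flash kernel, such as on Windows, the restricted call is expected to fail):

```python
# Minimal sketch: restrict torch SDPA to the flash backend only and
# try a tiny attention call. If the flash kernel is not compiled into
# this torch build (e.g. on Windows), SDPA raises a RuntimeError.
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 1, 8, 64, dtype=torch.float16, device="cuda")
try:
    with sdpa_kernel([SDPBackend.FLASH_ATTENTION]):
        torch.nn.functional.scaled_dot_product_attention(q, k, v)
    print("torch SDPA can use its internal flash kernel")
except RuntimeError as exc:
    print(f"flash kernel unavailable via torch SDPA: {exc}")
```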


The backend can be selected, but flash-attn itself must be installed manually, e.g. from here: https://github.com/zzlol63/flash-attention-prebuild-wheels/releases
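After installing a wheel, a one-line sanity check (a sketch; `flash_attn` is the package's import name):

```python
# Sanity check after installing a prebuilt flash-attn wheel:
# the import only succeeds if the wheel matches your environment.
import flash_attn
print(flash_attn.__version__)
```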

Thanks to @zzlol63 for the investigation: #1090

Uses huggingface/diffusers#12892 to raise an error if flash attention cannot be used because of attention masks.
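For reference, this is roughly how a backend gets selected through diffusers' attention dispatcher (a sketch, not this PR's exact code; it assumes a recent diffusers release that ships `set_attention_backend`, and the pipeline/checkpoint names are illustrative):

```python
# Sketch of diffusers' attention backend selection (assumes a recent
# diffusers release with the attention dispatcher; the model name is
# illustrative). "flash" routes attention through the flash-attn
# package, so it must be installed.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.transformer.set_attention_backend("flash")

# With huggingface/diffusers#12892 merged, diffusers raises an error
# if flash attention cannot be used because of attention masks,
# rather than failing silently.
```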

@dxqb dxqb linked an issue Dec 26, 2025 that may be closed by this pull request

dxqb commented Jan 5, 2026

Pre-built flash-attn wheels can also be downloaded from https://github.com/mjun0812/flash-attention-prebuild-wheels, which might be the better source: the build process there is automated, so it is more likely to stay up to date.

The (presumably automated) versioning there is a bit strange, though; not every release updates all platforms. This seems to be the latest release for Windows:
https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.19
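Those wheels are built per Python/torch/CUDA combination, so the wheel filename has to match the local environment. A quick way to print what to look for:

```python
# Print the versions a prebuilt flash-attn wheel must match
# (wheel filenames on those release pages encode the flash-attn,
# torch, CUDA and Python versions plus the platform).
import sys
import torch

print(f"python : {sys.version_info.major}.{sys.version_info.minor}")
print(f"torch  : {torch.__version__}")
print(f"cuda   : {torch.version.cuda}")
```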

@Calamdor FYI


dxqb commented Jan 7, 2026

diffusers has merged the PR that this PR depends on.

@dxqb dxqb marked this pull request as ready for review January 7, 2026 18:22
Updated diffusers package to a specific commit.


Development

Successfully merging this pull request may close these issues.

[Feat]: Attention backend selection for Diffusers
