Description
Feature Idea
After multiple months, Sage Attention 3 has been released to the public at https://github.com/thu-ml/SageAttention/tree/main/sageattention3_blackwell.
It employs the hardware-level FP4 cores in Blackwell-series GPUs, making attention roughly 5x faster than the fastest FlashAttention2 on these GPUs, and roughly a 2x speedup over SageAttention2.
The caveat is that the reduced precision causes artifacts, especially in the MoE Wan2.2 model (where caching methods such as MagCache and EasyCache also give poor results). This likely happens because of the MoE model's sensitivity to timesteps, so an "Attention Type steps" node will probably be an essential addition; a rough sketch of the idea follows.
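A minimal sketch of the selection logic such a node could expose, assuming a simple per-step policy; the backend labels and the `high_precision_steps` parameter are illustrative, not an existing ComfyUI API:

```python
# Illustrative policy for a hypothetical "Attention Type steps" node.
# Backend labels and parameter names are assumptions, not ComfyUI APIs.
def attention_backend_for_step(step: int, total_steps: int,
                               high_precision_steps: int = 1) -> str:
    """Keep the most timestep-sensitive steps (first and last) on a
    higher-precision kernel and use the FP4 path everywhere else."""
    if step < high_precision_steps or step >= total_steps - high_precision_steps:
        return "sageattn2"  # higher precision, fewer artifacts
    return "sageattn3"      # FP4 kernel on Blackwell, fastest
```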
Existing Solutions
There is an existing pull request, #9047, although it is dated and also needs its imports renamed. Additionally, Kijai has already implemented Sage Attention 3 in his ComfyUI-WanVideoWrapper nodes (https://github.com/kijai/ComfyUI-WanVideoWrapper/), with vanilla Sage Attention 2 running on the first and last steps.
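For reference, a rough sketch of that schedule, assuming the SageAttention 2 `sageattn` entry point; the SageAttention 3 import and call are placeholders, since its exact API is not confirmed here:

```python
# Rough sketch of the mixed schedule described above: SageAttention 2 on the
# first and last steps, the FP4 SageAttention 3 kernel on the rest.
from sageattention import sageattn  # SageAttention 2

try:
    # Hypothetical import: the real SageAttention 3 module/function name may differ.
    from sageattention3 import sageattn3
except ImportError:
    sageattn3 = sageattn  # fall back to SageAttention 2 if SA3 is unavailable

def mixed_attention(q, k, v, step: int, total_steps: int):
    if step == 0 or step == total_steps - 1:
        return sageattn(q, k, v)   # higher precision on the sensitive steps
    return sageattn3(q, k, v)      # FP4 path on the remaining steps
```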
Other
No response