[TUTORIAL] Remove block pointer from fused attention + run in CI (#6839)
This unifies all architectures on the `_tma` variant of the fused attention tutorial (the suffix is now dropped). On hardware where TMA is not natively supported, we use device-side tensor descriptors, which fall back to ordinary pointer-based loads.
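The dispatch can be sketched as a capability check at launch time: Hopper (compute capability 9.0) and newer expose TMA in hardware, while older architectures get the descriptor-emulation path. The function name below is hypothetical and only illustrates the selection logic, not the tutorial's actual helpers.

```python
def choose_load_path(compute_capability: int) -> str:
    """Illustrative sketch: pick the load strategy for the fused-attention kernel.

    TMA is a Hopper feature (sm_90+); on earlier architectures the same
    tensor-descriptor API is emulated with plain pointer-based loads,
    so a single `_tma`-style kernel can serve all architectures.
    """
    if compute_capability >= 90:
        return "native_tma"          # hardware TMA loads
    return "descriptor_fallback"     # device-side descriptors over pointers
```

Because the descriptor API is the same on both paths, the tutorial no longer needs a separate non-TMA kernel variant.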
I also removed the test `cuda/test_flashattention.py`, which appears to be a clone of the tutorial, and instead run the tutorial itself in CI.
Also fixes #6242