[Backend] Check if wait_barrier is a constant true (#7508)
When the `wait_barrier` predicate is a constant true, whether generated
by Gluon or produced by some other optimization, generate the
non-predicated version of the instruction, which is slightly more
efficient because it has one fewer branch. I observed a speedup of
roughly 3 TFLOPS on the attention kernel from this change. The result
was stable and reproducible, not noise.
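
The idea can be sketched roughly as follows. This is a minimal Python
illustration, not the actual backend lowering: the function name
`emit_wait_barrier`, the labels, and the PTX-like strings are assumptions
made for the example. It only shows that a predicate known to be true at
compile time lets the emitter drop the guarding branch and its skip label.

```python
from typing import List


def emit_wait_barrier(pred_is_const_true: bool) -> List[str]:
    """Return PTX-like lines for a barrier wait (hypothetical emitter).

    pred_is_const_true: True when the predicate folded to a compile-time
    constant true; False when it is only known at run time.
    """
    # Spin loop shared by both variants (illustrative instruction text).
    wait_loop = [
        "waitLoop:",
        "mbarrier.try_wait.parity.shared.b64 complete, [$0], $1;",
        "@!complete bra.uni waitLoop;",
    ]
    if pred_is_const_true:
        # Non-predicated form: no guard branch and no skip label.
        return wait_loop
    # Predicated form: jump over the wait when the predicate ($2) is false.
    return ["@!$2 bra.uni skipWait;"] + wait_loop + ["skipWait:"]


if __name__ == "__main__":
    print("\n".join(emit_wait_barrier(True)))
    print("---")
    print("\n".join(emit_wait_barrier(False)))
```

In this sketch the non-predicated form is exactly the predicated form
minus the guard branch and the `skipWait` label, which is the "one fewer
branch" saving described above.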