You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[AMD] Always swap operands of mfma and use mfma.transposed layout (#4767)
This helps to improve writeout to use `global_store_dwordx2`.
Along the way this PR
- Fixed the issue with getOrder for mfma layout
- Fixed the issue with reduceOp when dealing with mfma.transposed layout
In general, getOrder and getThreadOrder can return different values, and
this is the case for mfma.transposed layout. Therefore, we shouldn't
assume order and threadOrder are always the same.
0 commit comments