[ConSan] Fixes for Warp Specialization support (#8265)
This PR introduces two functional fixes and one performance improvement.
1) ConSan currently identifies whether an allocation is used as a
multibuffer through IR tracking: we check if the allocation is being
used by `subview_index`. There was a missing case: we were not
tracking values into the WarpSpecializeOp, so multibuffered
allocations were interpreted as regular ones. This PR fixes this issue.
2) For tcgen5 mma with a barrier (which performs the commit implicitly)
we were emitting `track_visible_reads/writes` only for tensor core
buffers, so ConSan was only tracking the status of TC buffers accessed
by the mma op. This meant, for example, that if partition A waited for
an mma issued by partition B, A still couldn't legally write to the
mma's operands, even though waiting for the mma to finish should be
enough to make that write legal. This is fixed now.
3) We were emitting `track_visible_reads/writes` after the checks of
every operand of the mma op. This is expensive, as it consists of a
number of global memory accesses. This is now rewritten to emit these
ops just once, after all the operands are checked.
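
The change in item 3 can be illustrated with a toy model. This is only a
sketch: the tuples below stand in for emitted instrumentation, and
`instrument_per_operand`, `instrument_batched` and `count` are
hypothetical names, not ConSan or Triton APIs.

```python
# Toy model of the emission order described in item 3.
# All names here are hypothetical; nothing below is a ConSan or Triton API.

def instrument_per_operand(operands):
    """Old shape: a tracking op is emitted after the check of every operand."""
    ops = []
    for buf in operands:
        ops.append(("check", buf))
        ops.append(("track_visible",))  # repeated once per operand
    return ops


def instrument_batched(operands):
    """New shape: check every operand first, then emit tracking once."""
    ops = [("check", buf) for buf in operands]
    ops.append(("track_visible",))  # emitted a single time
    return ops


def count(ops, kind):
    """Count emitted ops of the given kind."""
    return sum(1 for op in ops if op[0] == kind)


operands = ["a", "b", "d"]  # placeholder operand names
old = instrument_per_operand(operands)
new = instrument_batched(operands)
```

Both variants perform the same number of checks; only the number of
tracking ops, and with it the number of global memory accesses in this
model, goes down.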
With these changes ConSan shows no false positives in the
test_warp_specialization.py tests.
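
The missing case in item 1 can be sketched on a toy IR. This is a
simplified illustration under assumptions: `Value`, `Op`, `region_args`
and `is_multibuffer` are hypothetical, and the real pass works on MLIR
values rather than these classes.

```python
# Toy IR: a value records the ops that use it; an op with regions exposes
# each captured operand as a matching region argument.
# All classes and helpers here are hypothetical, not the ConSan code.

class Value:
    def __init__(self, name):
        self.name = name
        self.uses = []  # ops that consume this value


class Op:
    def __init__(self, kind, operands, region_args=()):
        self.kind = kind
        self.operands = list(operands)
        self.region_args = list(region_args)  # one per operand, if any
        for v in self.operands:
            v.uses.append(self)


def is_multibuffer(value, follow_warp_specialize=True):
    """Return True if `value` reaches a `subview_index` use."""
    for op in value.uses:
        if op.kind == "subview_index":
            return True
        if op.kind == "warp_specialize" and follow_warp_specialize:
            # Continue tracking through the value seen inside the region.
            for operand, inner in zip(op.operands, op.region_args):
                if operand is value and is_multibuffer(inner, True):
                    return True
    return False


alloc = Value("alloc")
inner = Value("alloc_in_partition")
Op("warp_specialize", [alloc], region_args=[inner])
Op("subview_index", [inner])

missed = is_multibuffer(alloc, follow_warp_specialize=False)
found = is_multibuffer(alloc)
```

With `follow_warp_specialize=False` the allocation looks like a regular
one, which mirrors the missing case; following the captured value finds
the `subview_index` use inside the partition.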