[TritonIntelGPUToLLVM] Detect basic sub-group shuffle convert_layout cases #2531
…cases

Detect basic shuffles and lower to `gpu.shuffle` operations. Basically, support cases in which we go from each work-item having a single tensor element to having `sub_group_size` tensor elements, such that element `i` corresponds to the element originally held by work-item `i` in the sub-group.

Upstream MLIR pass should handle all integer and floating point types. Drop code handling type legalization for such types when done. Pointer type should still be done in this project.

Code should be extended to support other kinds of shuffles.

Multi-warp case not yet implemented.

Signed-off-by: victor-eds <[email protected]>
Part of #2266.
Is it urgent for the OKR performance?
I agree this should be upstreamed. However, it isn't generic enough IMO. I would like to work more on this before upstreaming. I think I would rather have this merged here and have a generic version upstreamed. WDYT?
Makes sense. We can make it general gradually downstream first.
Detect basic shuffles and lower to `gpu.shuffle` operations. Basically, support cases in which we go from each work-item having a single tensor element to having `sub_group_size` tensor elements, such that element `i` corresponds to the element originally held by work-item `i` in the sub-group.

Upstream MLIR pass should handle all integer and floating point types. Drop code handling type legalization for such types when done. Pointer type should still be done in this project.

Code should be extended to support other kinds of shuffles.

Multi-sub-group case not yet implemented.
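
For readers unfamiliar with the pattern, the data movement described above can be modeled in plain Python. This is only an illustrative sketch, not the lowering code from this PR: `shuffle_idx` and `convert_layout_via_shuffles` are hypothetical names, a sub-group is modeled as a list with one value per work-item, and an indexed shuffle is assumed to make every work-item read the value held by a given source lane.

```python
def shuffle_idx(lane_values, src_lane):
    """Model an indexed sub-group shuffle (hypothetical helper).

    Every work-item in the sub-group receives the value held by `src_lane`.
    """
    return [lane_values[src_lane] for _ in lane_values]


def convert_layout_via_shuffles(lane_values):
    """Model the basic convert_layout case as one shuffle per source lane.

    Input:  each work-item holds a single element (lane_values[lane]).
    Output: each work-item holds sub_group_size elements, where element i
            is the value originally held by work-item i.
    """
    sub_group_size = len(lane_values)
    per_lane = [[] for _ in range(sub_group_size)]
    for i in range(sub_group_size):
        # One shuffle per source lane i: all work-items read lane i's value.
        received = shuffle_idx(lane_values, i)
        for lane in range(sub_group_size):
            per_lane[lane].append(received[lane])
    return per_lane


# Sub-group of 4 work-items, each initially holding one element.
result = convert_layout_via_shuffles([10, 11, 12, 13])
```

In this model every work-item ends up with the full list `[10, 11, 12, 13]`, which matches the "element `i` comes from work-item `i`" description; a real lowering would emit `sub_group_size` shuffle operations per converted value rather than loop over lanes.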