[TritonIntelGPUToLLVM] Detect more sub-group shuffle convert_layout
#2573
int32_t registerInDimSize = conversion->getInDimSize(kRegister);
int32_t laneInDimSize = conversion->getInDimSize(kLane);
return conversion->getBases().lookup(kRegister) ==
           buildSubGroupTransposeRegisterBases(registerInDimSize,
                                               laneInDimSize) &&
       conversion->getBases().lookup(kLane) ==
           buildSubGroupTransposeLaneBases(laneInDimSize);
Refactored and protected against some illegal layout creation.
LinearLayout comp = dstLayout.invertAndCompose(srcLayout);
std::optional<LinearLayout> conversion = comp.divideRight(
    LinearLayout::zeros1D(comp.getInDimSize(kLane), kLane, kLane) *
Should always be 0 for this case
if (!conversion)
  return false;
May be None if the 0 check fails.
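The guard discussed in these comments can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyLayout`, `divideOutZeroLanes`, and `isCandidate` are hypothetical names, not part of Triton's `LinearLayout` API, and the sketch assumes that the division succeeds only when every lane basis is 0.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Hypothetical stand-in for a layout: one integer basis per lane bit and per
// register bit. This is not Triton's LinearLayout.
struct ToyLayout {
  std::vector<int> laneBases;
  std::vector<int> registerBases;
};

// Hypothetical analogue of dividing out an all-zero lane component.
// Assumption: the division is only defined when every lane basis is 0;
// otherwise there is no result and std::nullopt is returned.
std::optional<ToyLayout> divideOutZeroLanes(const ToyLayout &layout) {
  bool allZero = std::all_of(layout.laneBases.begin(), layout.laneBases.end(),
                             [](int basis) { return basis == 0; });
  if (!allZero)
    return std::nullopt;
  return ToyLayout{{}, layout.registerBases};
}

// Mirrors the shape of the guard in the diff: check the optional before
// dereferencing it, and reject the layout when the division failed.
bool isCandidate(const ToyLayout &layout) {
  std::optional<ToyLayout> conversion = divideOutZeroLanes(layout);
  if (!conversion)
    return false;
  return !conversion->registerBases.empty();
}
```

In this sketch, a layout with lane bases `{0, 0}` passes the guard, while one with lane bases `{0, 1}` is rejected before `conversion` is ever dereferenced.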
Detect sub-group shuffle `convert_layout` cases of more than one element per thread. Signed-off-by: victor-eds <[email protected]>
Force-pushed from cb1d9f5 to a7277b1
conversion->getBases().lookup(kRegister) ==
    buildSubGroupShuffleRegisterBases(registerInDimSize,
                                      laneOutDimSize);
I will investigate whether there is a better way to express this in the future when generalizing. Same for transpose case.
mfrancepillois left a comment
I'm missing some background on subgroup shuffle, but PR LGTM.
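For background on the primitive itself, a sub-group shuffle lets every lane of a sub-group read a value held by another lane. The CPU model below is a minimal sketch of that idea only. The helper names `shuffleFromLane` and `gatherAllLanes` are hypothetical, and the gather pattern is an assumed example of a conversion that could be built from shuffles, not the lowering implemented in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU model of a sub-group: index i holds the value owned by lane i.
// After the shuffle, every lane holds the value that lane srcLane held.
std::vector<int> shuffleFromLane(const std::vector<int> &laneValues,
                                 std::size_t srcLane) {
  assert(srcLane < laneValues.size() && "source lane out of range");
  return std::vector<int>(laneValues.size(), laneValues[srcLane]);
}

// Assumed example: one shuffle per source lane, so that each lane ends up
// with a full copy of the data in its registers:
// result[lane][reg] == laneValues[reg].
std::vector<std::vector<int>>
gatherAllLanes(const std::vector<int> &laneValues) {
  std::size_t numLanes = laneValues.size();
  std::vector<std::vector<int>> registers(numLanes,
                                          std::vector<int>(numLanes));
  for (std::size_t reg = 0; reg < numLanes; ++reg) {
    std::vector<int> broadcast = shuffleFromLane(laneValues, reg);
    for (std::size_t lane = 0; lane < numLanes; ++lane)
      registers[lane][reg] = broadcast[lane];
  }
  return registers;
}
```

With four lanes holding `{10, 20, 30, 40}`, every lane in this model ends with registers `{10, 20, 30, 40}` after four shuffles.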
Detect sub-group shuffle `convert_layout` cases of more than one element per thread.