-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[AMDGPU] Add commute for some VOP3 inst #121326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 10 commits
f0877bb
ae5f1e8
df6903c
ada83d6
c599af0
3a83ae5
5c9d065
d3f00dc
0e60298
707474f
79219d1
288ead2
4becc58
53b370a
0b746a8
785921f
6955a86
5e15e72
004a82d
dc2739f
378c02c
49ae569
161a2b9
4d569ae
1689c1e
328e566
9faf423
19b8ad4
cc3a125
0a89dc9
d8e6cb7
3c7bd89
bf1da57
789092d
6c35280
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,11 @@ | ||
| # RUN: llc -mtriple=amdgcn -mcpu=gfx1030 -run-pass=machine-cse -verify-machineinstrs %s -o - 2>&1 | FileCheck --check-prefix=GCN %s | ||
|
|
||
| # GCN-LABEL: name: test_machine_cse_op_sel | ||
| # GCN: %2:vgpr_32 = V_ADD_NC_U16_e64 0, %0, 0, %1, 1, 0, implicit $mode, implicit $exec | ||
| # GCN: %3:vgpr_32 = V_ADD_NC_U16_e64 0, %1, 0, %0, 1, 0, implicit $mode, implicit $exec | ||
| # GCN: DS_WRITE2_B32_gfx9 undef %4:vgpr_32, %2, %3, 0, 1, 0, implicit $exec | ||
| --- | ||
| name: test_machine_cse_op_sel | ||
| body: | | ||
| ; GCN-LABEL: name: test_machine_cse_op_sel | ||
| ; GCN: %2:vgpr_32 = V_ADD_NC_U16_e64 0, %0, 0, %1, 1, 0, implicit $mode, implicit $exec | ||
| ; GCN-NEXT: DS_WRITE2_B32_gfx9 undef %4:vgpr_32, %2, %2, 0, 1, 0, implicit $exec | ||
| bb.0: | ||
| %0:vgpr_32 = IMPLICIT_DEF | ||
| %1:vgpr_32 = IMPLICIT_DEF | ||
|
|
@@ -15,3 +14,14 @@ body: | | |
| DS_WRITE2_B32_gfx9 undef %4:vgpr_32, %2, %3, 0, 1, 0, implicit $exec | ||
| ... | ||
|
|
||
| --- | ||
| name: test_machine_cse_op_inline_const | ||
| body: | | ||
| ; GCN-LABEL: name: test_machine_cse_op_inline_const | ||
| ; GCN: %0:vgpr_32 = V_ADD_NC_U16_e64 0, 64, 0, -3, 1, 0, implicit $mode, implicit $exec | ||
| ; GCN-NEXT: DS_WRITE2_B32_gfx9 undef %2:vgpr_32, %0, %0, 0, 1, 0, implicit $exec | ||
| bb.0: | ||
| %1:vgpr_32 = V_ADD_NC_U16_e64 0, 64, 0, -3, 1, 0, implicit $mode, implicit $exec | ||
| %2:vgpr_32 = V_ADD_NC_U16_e64 0, -3, 0, 64, 1, 0, implicit $mode, implicit $exec | ||
| DS_WRITE2_B32_gfx9 undef %4:vgpr_32, %1, %2, 0, 1, 0, implicit $exec | ||
| ... | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There should be a dedicated commute test for every changed opcode like this. What about the _e32 forms?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi, for |
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't explaining why to do this, but this is also an API flaw that's always been there. We need an isOperandLegal that doesn't account for the context of the other operands
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather, one that takes the full set of operands that need to be considered for the result instruction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @arsenm
isOperandLegal also check the literal and constant bus (literal or SGPR) limit, I think maybe it is better to separate them from the function. It will also cause some in-consistency with ISA (e.g. reject instructions with 1 literal and 1 imm) if the OpIdx and MO's index are different
We can do that, but to create the result instruction, we actually swapped the operands on the original instruction too. So if I do that and failed the isOperandLegal check, I would need to swap it back
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @arsenm , I tried to check the swapped instruction with isOperandLegal, it fails 100+ cases. The following is the change:
The reason for that is the VALU instructions with literal in the first input operands.
The
commuteInstructionfunction was called during the shrink instruction pass, and mismatched OpIdx and MO (first operand) will be the parameter of isOperandLegal. It return false because MO was counted for 2 times of literal limit (explained in the other reply).Now if using the swapped instruction, the mismatch no longer exists, then it will move all literal constant to the second input operand for all VALU instructions.
Shall I fix all the cases in this PR?? or fix them in a new issue??