-
Notifications
You must be signed in to change notification settings - Fork 15.3k
AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 #102345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) ChangesSome pointer adds get turned into ors, and sometimes and is Full diff: https://github.com/llvm/llvm-project/pull/102345.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index e8c2cbd3dd671..76da1f0eb4f7d 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -2268,8 +2268,9 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
MI->eraseFromParent();
return true;
}
- case AMDGPU::S_ADD_I32: {
- // TODO: Handle s_or_b32, s_and_b32.
+ case AMDGPU::S_ADD_I32:
+ case AMDGPU::S_OR_B32:
+ case AMDGPU::S_AND_B32: {
MachineOperand &OtherOp = MI->getOperand(FIOperandNum == 1 ? 2 : 1);
assert(FrameReg || MFI->isBottomOfStack());
diff --git a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
index 1456bbc369b6a..f7f43b59085de 100644
--- a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
+++ b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
@@ -21,21 +21,21 @@ machineFunctionInfo:
body: |
bb.0:
; MUBUFW64-LABEL: name: s_or_b32__inline_imm__fi_offset0
- ; MUBUFW64: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 12, killed $sgpr4, implicit-def $scc
+ ; MUBUFW64: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 12, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_or_b32__inline_imm__fi_offset0
- ; MUBUFW32: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 12, killed $sgpr4, implicit-def $scc
+ ; MUBUFW32: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 12, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_or_b32__inline_imm__fi_offset0
- ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 12, $sgpr32, implicit-def $scc
+ ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 $sgpr32, 12, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_or_b32__inline_imm__fi_offset0
- ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 12, $sgpr32, implicit-def $scc
+ ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 $sgpr32, 12, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_OR_B32 12, %stack.0, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -55,25 +55,21 @@ machineFunctionInfo:
body: |
bb.0:
; MUBUFW64-LABEL: name: s_or_b32__literal__fi_offset96
- ; MUBUFW64: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
- ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 96, implicit-def $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+ ; MUBUFW64: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 164, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_or_b32__literal__fi_offset96
- ; MUBUFW32: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
- ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 96, implicit-def $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+ ; MUBUFW32: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 164, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_or_b32__literal__fi_offset96
- ; FLATSCRW64: $sgpr4 = S_ADD_I32 $sgpr32, 96, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 $sgpr32, 164, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_or_b32__literal__fi_offset96
- ; FLATSCRW32: $sgpr4 = S_ADD_I32 $sgpr32, 96, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 $sgpr32, 164, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_OR_B32 68, %stack.1, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -96,31 +92,31 @@ body: |
; MUBUFW64-LABEL: name: s_or_b32__sgpr__fi_literal_offset
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
- ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 80, implicit-def $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 80, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_or_b32__sgpr__fi_literal_offset
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
- ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 80, implicit-def $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 80, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_or_b32__sgpr__fi_literal_offset
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 80, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_or_b32__sgpr__fi_literal_offset
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 80, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_OR_B32 $sgpr8, %stack.1, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -143,31 +139,31 @@ body: |
; MUBUFW64-LABEL: name: s_or_b32__sgpr__fi_inlineimm_offset
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
- ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 32, implicit-def $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 32, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_or_b32__sgpr__fi_inlineimm_offset
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
- ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 32, implicit-def $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 32, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_or_b32__sgpr__fi_inlineimm_offset
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 32, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 32, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_or_b32__sgpr__fi_inlineimm_offset
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 32, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr7, 32, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_OR_B32 $sgpr8, %stack.1, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
@@ -190,31 +186,31 @@ body: |
; MUBUFW64-LABEL: name: s_and_b32__sgpr__fi_literal_offset
; MUBUFW64: liveins: $sgpr8
; MUBUFW64-NEXT: {{ $}}
- ; MUBUFW64-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
- ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 80, implicit-def $scc
- ; MUBUFW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; MUBUFW32-LABEL: name: s_and_b32__sgpr__fi_literal_offset
; MUBUFW32: liveins: $sgpr8
; MUBUFW32-NEXT: {{ $}}
- ; MUBUFW32-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
- ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 80, implicit-def $scc
- ; MUBUFW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+ ; MUBUFW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW64-LABEL: name: s_and_b32__sgpr__fi_literal_offset
; FLATSCRW64: liveins: $sgpr8
; FLATSCRW64-NEXT: {{ $}}
- ; FLATSCRW64-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc
- ; FLATSCRW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
;
; FLATSCRW32-LABEL: name: s_and_b32__sgpr__fi_literal_offset
; FLATSCRW32: liveins: $sgpr8
; FLATSCRW32-NEXT: {{ $}}
- ; FLATSCRW32-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc
- ; FLATSCRW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+ ; FLATSCRW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
renamable $sgpr7 = S_AND_B32 $sgpr8, %stack.1, implicit-def $scc
SI_RETURN implicit $sgpr7, implicit $scc
|
| ; FLATSCRW32-NEXT: {{ $}} | ||
| ; FLATSCRW32-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc | ||
| ; FLATSCRW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc | ||
| ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand this. The transformation is (s8 & (sp + 80)) ->((s8 + sp) & 80) does not look immediately obvious.
8726c2b to
b9a85ce
Compare
|
For the or case, I think we need to check the disjoint flag on the instruction (and also transform the final or into an add, if we can fold the offsets) |
Some pointer adds get turned into ors, and sometimes and is performed on pointers for masking.
ac17eed to
f811338
Compare
| case AMDGPU::S_ADD_U32: { | ||
| // TODO: Handle s_or_b32, s_and_b32. | ||
| case AMDGPU::S_OR_B32: { | ||
| // TODO: Handle s_and_b32 for ptrmask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not fully understand how can the Offset be folded into the other operand the same way as add:
OtherOp.setImm(OtherOp.getImm() + Offset);

Some pointer adds get turned into ors, and sometimes and is
performed on pointers for masking.