[AMDGPU] Prevent folding of FI with scale_offset on gfx1250 #149894
@llvm/pr-subscribers-backend-amdgpu
Author: Stanislav Mekhanoshin (rampitec)
Changes
SS forms of SCRATCH_LOAD_DWORD do not support SCALE_OFFSET,
Full diff: https://github.com/llvm/llvm-project/pull/149894.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index e172c0b63189b..e5d1eaad2b8f4 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -1209,18 +1209,24 @@ void SIFoldOperandsImpl::foldOperand(
return;
}
- // A frame index will resolve to a positive constant, so it should always be
- // safe to fold the addressing mode, even pre-GFX9.
- UseMI->getOperand(UseOpIdx).ChangeToFrameIndex(OpToFold.getFI());
-
const unsigned Opc = UseMI->getOpcode();
if (TII->isFLATScratch(*UseMI) &&
AMDGPU::hasNamedOperand(Opc, AMDGPU::OpName::vaddr) &&
!AMDGPU::hasNamedOperand(Opc, AMDGPU::OpName::saddr)) {
unsigned NewOpc = AMDGPU::getFlatScratchInstSSfromSV(Opc);
+ unsigned CPol =
+ TII->getNamedOperand(*UseMI, AMDGPU::OpName::cpol)->getImm();
+ if ((CPol & AMDGPU::CPol::SCAL) &&
+ !AMDGPU::supportsScaleOffset(*TII, NewOpc))
+ return;
+
UseMI->setDesc(TII->get(NewOpc));
}
+ // A frame index will resolve to a positive constant, so it should always be
+ // safe to fold the addressing mode, even pre-GFX9.
+ UseMI->getOperand(UseOpIdx).ChangeToFrameIndex(OpToFold.getFI());
+
return;
}
diff --git a/llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir b/llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir
new file mode 100644
index 0000000000000..e5955ad3d1d49
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir
@@ -0,0 +1,43 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1250 -start-before=si-fold-operands -stop-after=prologepilog -o - %s | FileCheck -check-prefix=GCN %s
+
+---
+name: test_fold_fi_scratch_load_vgpr
+tracksRegLiveness: true
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: $sgpr32
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 4, alignment: 4 }
+body: |
+ bb.0.entry:
+ ; GCN-LABEL: name: test_fold_fi_scratch_load_vgpr
+ ; GCN: renamable $vgpr0 = SCRATCH_LOAD_DWORD_SADDR $sgpr32, 4, 0, implicit $exec, implicit $flat_scr :: (load (s32) from %stack.0, addrspace 5)
+ ; GCN-NEXT: S_ENDPGM 0, implicit killed renamable $vgpr0
+ %0:vgpr_32 = V_MOV_B32_e32 %stack.0, implicit $exec
+ %1:vgpr_32 = SCRATCH_LOAD_DWORD %0:vgpr_32, 4, 0, implicit $exec, implicit $flat_scr :: (load 4 from %stack.0, addrspace 5)
+ S_ENDPGM 0, implicit %1
+
+...
+
+# SS form of the SCRATCH_LOAD_DWORD does not support offset scaling
+
+---
+name: test_no_fold_fi_scratch_load_vgpr_scale_offset
+tracksRegLiveness: true
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: $sgpr32
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 4, alignment: 4 }
+body: |
+ bb.0.entry:
+ ; GCN-LABEL: name: test_no_fold_fi_scratch_load_vgpr_scale_offset
+ ; GCN: renamable $vgpr0 = V_MOV_B32_e32 $sgpr32, implicit $exec
+ ; GCN-NEXT: renamable $vgpr0 = SCRATCH_LOAD_DWORD killed renamable $vgpr0, 4, 2048, implicit $exec, implicit $flat_scr :: (load (s32) from %stack.0, addrspace 5)
+ ; GCN-NEXT: S_ENDPGM 0, implicit killed renamable $vgpr0
+ %0:vgpr_32 = V_MOV_B32_e32 %stack.0, implicit $exec
+ %1:vgpr_32 = SCRATCH_LOAD_DWORD %0:vgpr_32, 4, 2048, implicit $exec, implicit $flat_scr :: (load 4 from %stack.0, addrspace 5)
+ S_ENDPGM 0, implicit %1
+
+...
SS forms of SCRATCH_LOAD_DWORD do not support SCALE_OFFSET, so if this bit is set, SCRATCH_LOAD_DWORD_SADDR cannot be formed. This should not normally happen because a frame index (FI) is not supposed to be scaled, but the check is added as a precaution.
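The guard described above can be modelled outside of LLVM as a small decision function. The following is a minimal sketch, not the code from the patch: `ScratchUse`, `FoldResult`, `foldFrameIndex`, and `kCPolScaleOffset` are hypothetical names introduced for illustration, and the value 2048 is simply the `cpol` immediate used by the `scale_offset` test case in this PR. The `ssFormSupportsScaleOffset` field stands in for the result of `AMDGPU::supportsScaleOffset(*TII, NewOpc)`.

```cpp
#include <cstdint>

// Hypothetical stand-in for AMDGPU::CPol::SCAL. The value matches the cpol
// immediate (2048) used in the scale_offset MIR test of this PR.
const std::uint32_t kCPolScaleOffset = 2048;

// Hypothetical outcome of trying to fold a frame index into a scratch access.
enum class FoldResult {
  NotFolded,        // instruction left untouched, vaddr form kept
  FoldedSameOpcode, // frame index folded, opcode unchanged
  FoldedToSSForm    // frame index folded and opcode switched to the SS form
};

// Hypothetical summary of the properties the real pass queries on UseMI.
struct ScratchUse {
  bool hasVaddr;                  // opcode has a named vaddr operand
  bool hasSaddr;                  // opcode has a named saddr operand
  std::uint32_t cpol;             // cache-policy immediate
  bool ssFormSupportsScaleOffset; // stand-in for supportsScaleOffset(NewOpc)
};

// Mirrors the control flow of the patched hunk: the scale_offset check runs
// before anything is modified, so an early exit leaves the use unchanged.
FoldResult foldFrameIndex(const ScratchUse &use) {
  if (use.hasVaddr && !use.hasSaddr) {
    // Converting SV -> SS is only legal if the SS form can honour the
    // scale_offset bit when that bit is set.
    if ((use.cpol & kCPolScaleOffset) && !use.ssFormSupportsScaleOffset)
      return FoldResult::NotFolded;
    return FoldResult::FoldedToSSForm;
  }
  // No opcode change is needed; only the operand becomes a frame index.
  return FoldResult::FoldedSameOpcode;
}
```

In this model, a use with `cpol == 0` folds to the SS form, while the same use with `cpol == 2048` and no SS-form support is left alone, which corresponds to the two MIR tests in the diff. The diff also moves the `ChangeToFrameIndex` call below the check, presumably so that the early `return` cannot leave an instruction with a frame-index operand but the old opcode.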