[AMDGPU] Prevent folding of FI with scale_offset on gfx1250 #149894

rampitec · 2025-07-21T20:17:29Z

SS forms of SCRATCH_LOAD_DWORD do not support SCALE_OFFSET,
so if this bit is used SCRATCH_LOAD_DWORD_SADDR cannot be formed.
This should not normally happen because a frame index (FI) is not
supposed to be scaled, but add the check as a precaution.
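
The guard described above can be modeled outside of LLVM. What follows is a minimal sketch under stated assumptions, not the code from SIFoldOperands.cpp: `ScratchInst`, `Form`, `tryFoldFrameIndex`, and `kScaleOffsetBit` are hypothetical stand-ins, and the value 2048 is taken only from the cpol operand used in this PR's MIR test. It illustrates the ordering that matters here: the scale_offset check runs before anything is mutated, so a rejected fold leaves the instruction untouched.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Assumption: 2048 is the cpol value the PR's MIR test uses to request
// scale_offset. It stands in for AMDGPU::CPol::SCAL in this model.
constexpr std::uint32_t kScaleOffsetBit = 2048;

// Hypothetical model of the two addressing forms of a scratch load.
enum class Form { VAddr, SAddr };

// Hypothetical, heavily simplified instruction record.
struct ScratchInst {
  Form form;
  std::uint32_t cpol;
  bool addrIsFrameIndex;
};

// Assumption from the PR description: the SADDR form cannot encode
// scale_offset. The VADDR form is assumed to support it.
constexpr bool supportsScaleOffset(Form f) { return f != Form::SAddr; }

// Tries to fold a frame index into the address operand.
// Returns false, leaving inst unchanged, when the fold would require an
// SADDR form while the scale_offset bit is set.
inline bool tryFoldFrameIndex(ScratchInst &inst) {
  if (inst.form == Form::VAddr) {
    const Form newForm = Form::SAddr;
    // Check first, mutate later: bail out before touching the instruction.
    if ((inst.cpol & kScaleOffsetBit) && !supportsScaleOffset(newForm))
      return false;
    inst.form = newForm;
  }
  inst.addrIsFrameIndex = true;
  return true;
}

// The two cases mirror the two functions in the PR's MIR test.
inline void demo() {
  ScratchInst plain{Form::VAddr, 0, false};
  const bool foldedPlain = tryFoldFrameIndex(plain);
  assert(foldedPlain && plain.form == Form::SAddr && plain.addrIsFrameIndex);
  (void)foldedPlain;

  ScratchInst scaled{Form::VAddr, kScaleOffsetBit, false};
  const bool foldedScaled = tryFoldFrameIndex(scaled);
  assert(!foldedScaled && scaled.form == Form::VAddr &&
         !scaled.addrIsFrameIndex);
  (void)foldedScaled;
}

} // namespace sketch
```

In this model, checking before mutating is what makes the early return safe. If the operand were changed to a frame index first, a rejected fold would leave a half-rewritten instruction behind.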

llvmbot · 2025-07-21T20:18:52Z

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

SS forms of SCRATCH_LOAD_DWORD do not support SCALE_OFFSET,
so if this bit is used SCRATCH_LOAD_DWORD_SADDR cannot be formed.
This should not normally happen because a frame index (FI) is not
supposed to be scaled, but add the check as a precaution.

Full diff: https://github.com/llvm/llvm-project/pull/149894.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+10-4)
(added) llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir (+43)

diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index e172c0b63189b..e5d1eaad2b8f4 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -1209,18 +1209,24 @@ void SIFoldOperandsImpl::foldOperand(
         return;
     }
 
-    // A frame index will resolve to a positive constant, so it should always be
-    // safe to fold the addressing mode, even pre-GFX9.
-    UseMI->getOperand(UseOpIdx).ChangeToFrameIndex(OpToFold.getFI());
-
     const unsigned Opc = UseMI->getOpcode();
     if (TII->isFLATScratch(*UseMI) &&
         AMDGPU::hasNamedOperand(Opc, AMDGPU::OpName::vaddr) &&
         !AMDGPU::hasNamedOperand(Opc, AMDGPU::OpName::saddr)) {
       unsigned NewOpc = AMDGPU::getFlatScratchInstSSfromSV(Opc);
+      unsigned CPol =
+          TII->getNamedOperand(*UseMI, AMDGPU::OpName::cpol)->getImm();
+      if ((CPol & AMDGPU::CPol::SCAL) &&
+          !AMDGPU::supportsScaleOffset(*TII, NewOpc))
+        return;
+
       UseMI->setDesc(TII->get(NewOpc));
     }
 
+    // A frame index will resolve to a positive constant, so it should always be
+    // safe to fold the addressing mode, even pre-GFX9.
+    UseMI->getOperand(UseOpIdx).ChangeToFrameIndex(OpToFold.getFI());
+
     return;
   }
 
diff --git a/llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir b/llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir
new file mode 100644
index 0000000000000..e5955ad3d1d49
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi-gfx1250.mir
@@ -0,0 +1,43 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1250 -start-before=si-fold-operands -stop-after=prologepilog -o - %s | FileCheck -check-prefix=GCN %s
+
+---
+name:            test_fold_fi_scratch_load_vgpr
+tracksRegLiveness: true
+machineFunctionInfo:
+  scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: $sgpr32
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 4, alignment: 4 }
+body:             |
+  bb.0.entry:
+    ; GCN-LABEL: name: test_fold_fi_scratch_load_vgpr
+    ; GCN: renamable $vgpr0 = SCRATCH_LOAD_DWORD_SADDR $sgpr32, 4, 0, implicit $exec, implicit $flat_scr :: (load (s32) from %stack.0, addrspace 5)
+    ; GCN-NEXT: S_ENDPGM 0, implicit killed renamable $vgpr0
+    %0:vgpr_32 = V_MOV_B32_e32 %stack.0, implicit $exec
+    %1:vgpr_32 = SCRATCH_LOAD_DWORD %0:vgpr_32, 4, 0, implicit $exec, implicit $flat_scr :: (load 4 from %stack.0, addrspace 5)
+    S_ENDPGM 0, implicit %1
+
+...
+
+# SS form of the SCRATCH_LOAD_DWORD does not support offset scaling
+
+---
+name:            test_no_fold_fi_scratch_load_vgpr_scale_offset
+tracksRegLiveness: true
+machineFunctionInfo:
+  scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: $sgpr32
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 4, alignment: 4 }
+body:             |
+  bb.0.entry:
+    ; GCN-LABEL: name: test_no_fold_fi_scratch_load_vgpr_scale_offset
+    ; GCN: renamable $vgpr0 = V_MOV_B32_e32 $sgpr32, implicit $exec
+    ; GCN-NEXT: renamable $vgpr0 = SCRATCH_LOAD_DWORD killed renamable $vgpr0, 4, 2048, implicit $exec, implicit $flat_scr :: (load (s32) from %stack.0, addrspace 5)
+    ; GCN-NEXT: S_ENDPGM 0, implicit killed renamable $vgpr0
+    %0:vgpr_32 = V_MOV_B32_e32 %stack.0, implicit $exec
+    %1:vgpr_32 = SCRATCH_LOAD_DWORD %0:vgpr_32, 4, 2048, implicit $exec, implicit $flat_scr :: (load 4 from %stack.0, addrspace 5)
+    S_ENDPGM 0, implicit %1
+
+...

This was referenced Jul 21, 2025

[AMDGPU] Verify asm VGPR alignment on gfx1250 #149880

Merged

[AMDGPU] MC support for gfx1250 scale_offset modifier #149881

Merged

rampitec requested review from changpeng and shiltian July 21, 2025 20:18

rampitec marked this pull request as ready for review July 21, 2025 20:18

llvmbot added the backend:AMDGPU label Jul 21, 2025

rampitec mentioned this pull request Jul 21, 2025

[AMDGPU] Prohibit load/store merge if scale_offset is set on gfx1250 #149895

Merged

shiltian approved these changes Jul 21, 2025

rampitec added 2 commits July 21, 2025 14:24

[AMDGPU] MC support for gfx1250 scale_offset modifier

d25d6ba

rampitec force-pushed the users/rampitec/07-21-_amdgpu_mc_support_for_gfx1250_scale_offset_modifier branch from 7b416c7 to d25d6ba Compare July 21, 2025 21:24

rampitec force-pushed the users/rampitec/07-21-_amdgpu_prevent_folding_of_fi_with_scale_offset_on_gfx1250 branch from 1e697f4 to 97a8444 Compare July 21, 2025 21:24

Base automatically changed from users/rampitec/07-21-_amdgpu_mc_support_for_gfx1250_scale_offset_modifier to main July 21, 2025 22:05

rampitec merged commit 006858c into main Jul 21, 2025
11 of 12 checks passed

rampitec deleted the users/rampitec/07-21-_amdgpu_prevent_folding_of_fi_with_scale_offset_on_gfx1250 branch July 21, 2025 22:05
