Skip to content

Conversation

@preames
Copy link
Collaborator

@preames preames commented Mar 26, 2025

In 236f938, I introduced a generic version of this routine. I believe that the SystemZ specific version of this is less general than the generic version, and is thus unrequired. I wasn't 100% given the difference in sub-register, multiple use and defs, but from the SystemZ code, it looks like those cases simply don't arise?

In 236f938, I introduced a generic version of this routine.  I
believe that the SystemZ specific version of this is less general
than the generic version, and is thus unrequired.  I wasn't 100%
given the difference in sub-register, multiple use and defs, but
from the SystemZ code, it looks like those cases simply don't
arrise?
@llvmbot
Copy link
Member

llvmbot commented Mar 26, 2025

@llvm/pr-subscribers-backend-systemz

Author: Philip Reames (preames)

Changes

In 236f938, I introduced a generic version of this routine. I believe that the SystemZ specific version of this is less general than the generic version, and is thus unrequired. I wasn't 100% given the difference in sub-register, multiple use and defs, but from the SystemZ code, it looks like those cases simply don't arise?


Full diff: https://github.com/llvm/llvm-project/pull/133123.diff

2 Files Affected:

  • (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp (-25)
  • (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.h (-4)
diff --git a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
index 519baec5989df..91a4aa9c73010 100644
--- a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
@@ -670,31 +670,6 @@ void SystemZInstrInfo::insertSelect(MachineBasicBlock &MBB,
     .addImm(CCValid).addImm(CCMask);
 }
 
-MachineInstr *SystemZInstrInfo::optimizeLoadInstr(MachineInstr &MI,
-                                                  const MachineRegisterInfo *MRI,
-                                                  Register &FoldAsLoadDefReg,
-                                                  MachineInstr *&DefMI) const {
-  // Check whether we can move the DefMI load, and that it only has one use.
-  DefMI = MRI->getVRegDef(FoldAsLoadDefReg);
-  assert(DefMI);
-  bool SawStore = false;
-  if (!DefMI->isSafeToMove(SawStore) || !MRI->hasOneNonDBGUse(FoldAsLoadDefReg))
-    return nullptr;
-
-  int UseOpIdx =
-      MI.findRegisterUseOperandIdx(FoldAsLoadDefReg, /*TRI=*/nullptr);
-  assert(UseOpIdx != -1 && "Expected FoldAsLoadDefReg to be used by MI.");
-
-  // Check whether we can fold the load.
-  if (MachineInstr *FoldMI =
-          foldMemoryOperand(MI, {((unsigned)UseOpIdx)}, *DefMI)) {
-    FoldAsLoadDefReg = 0;
-    return FoldMI;
-  }
-
-  return nullptr;
-}
-
 bool SystemZInstrInfo::foldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
                                      Register Reg,
                                      MachineRegisterInfo *MRI) const {
diff --git a/llvm/lib/Target/SystemZ/SystemZInstrInfo.h b/llvm/lib/Target/SystemZ/SystemZInstrInfo.h
index a8d282dd9e417..8b82af61e669a 100644
--- a/llvm/lib/Target/SystemZ/SystemZInstrInfo.h
+++ b/llvm/lib/Target/SystemZ/SystemZInstrInfo.h
@@ -258,10 +258,6 @@ class SystemZInstrInfo : public SystemZGenInstrInfo {
                     const DebugLoc &DL, Register DstReg,
                     ArrayRef<MachineOperand> Cond, Register TrueReg,
                     Register FalseReg) const override;
-  MachineInstr *optimizeLoadInstr(MachineInstr &MI,
-                                  const MachineRegisterInfo *MRI,
-                                  Register &FoldAsLoadDefReg,
-                                  MachineInstr *&DefMI) const override;
   bool foldImmediate(MachineInstr &UseMI, MachineInstr &DefMI, Register Reg,
                      MachineRegisterInfo *MRI) const override;
 

@uweigand
Copy link
Member

I wasn't 100% given the difference in sub-register, multiple use and defs, but from the SystemZ code, it looks like those cases simply don't arise?

Folding sub-register loads was not done by the SystemZ code, but adding that support should be beneficial, I think.

I do wonder about multiple uses - if a loaded value is used in multiple instructions, and some of those instructions are changed to fold in the load, this will increase the total number of memory accesses. That might not be a big deal (value will likely be in L1), but it's not immediately clear either how this is beneficial ... (However, folding to multiple uses within the same instruction likely is beneficial when possible, if there are no further uses outside that single instruction.)

In any case, this is not a platform-specific question, so I'm fine with the custom implementation going away.

@preames preames merged commit b8c86dc into llvm:main Mar 28, 2025
13 checks passed
@preames
Copy link
Collaborator Author

preames commented Mar 28, 2025

I do wonder about multiple uses - if a loaded value is used in multiple instructions, and some of those instructions are changed to fold in the load, this will increase the total number of memory accesses.

The target specific foldMemoryOperandImpl is still in play here, and can chose to avoid cases with multiple operands if desired.

Edit: Note that the previous target specific optimizeLoadInst would only pass the first operand if multiple were potentially foldable. This is suspect correctness wise. The target specific foldMemoryOperandImpl hook will bail out with two or more ops, so this is converting a possible correctness issue into a missed optimization at worst.

@uweigand
Copy link
Member

I do wonder about multiple uses - if a loaded value is used in multiple instructions, and some of those instructions are changed to fold in the load, this will increase the total number of memory accesses.

The target specific foldMemoryOperandImpl is still in play here, and can chose to avoid cases with multiple operands if desired.

Edit: Note that the previous target specific optimizeLoadInst would only pass the first operand if multiple were potentially foldable. This is suspect correctness wise. The target specific foldMemoryOperandImpl hook will bail out with two or more ops, so this is converting a possible correctness issue into a missed optimization at worst.

I agree with the case of multiple operands into the same instruction.

I was concerned about the case of a single load whose value is used by multiple different instructions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants