Skip to content

Conversation

@boomanaiden154
Copy link
Contributor

This patch makes it so that InstrPostProcess::postProcessInstruction takes in a reference to a mca::Instruction rather than a reference to a std::unique_ptr. Without this, InstrPostProcess cannot be used with MCA instruction recycling because it needs to be called on both newly created instructions and instructions that have been recycled. We only have access to a raw pointer for instructions that have been recycled rather than a reference to the std::unique_ptr that owns them.

This patch adds a call in the existing instruction recycling unit test to ensure the API remains compatible with this use case.

@boomanaiden154 boomanaiden154 changed the title [MCA] Use Bare Ference for InstrPostProcess [MCA] Use Bare Reference for InstrPostProcess Sep 23, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 23, 2025

@llvm/pr-subscribers-tools-llvm-mca
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-backend-amdgpu

Author: Aiden Grossman (boomanaiden154)

Changes

This patch makes it so that InstrPostProcess::postProcessInstruction takes in a reference to a mca::Instruction rather than a reference to a std::unique_ptr. Without this, InstrPostProcess cannot be used with MCA instruction recycling because it needs to be called on both newly created instructions and instructions that have been recycled. We only have access to a raw pointer for instructions that have been recycled rather than a reference to the std::unique_ptr that owns them.

This patch adds a call in the existing instruction recycling unit test to ensure the API remains compatible with this use case.


Full diff: https://github.com/llvm/llvm-project/pull/160229.diff

7 Files Affected:

  • (modified) llvm/include/llvm/MCA/CustomBehaviour.h (+1-2)
  • (modified) llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.cpp (+4-4)
  • (modified) llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.h (+2-3)
  • (modified) llvm/lib/Target/X86/MCA/X86CustomBehaviour.cpp (+11-13)
  • (modified) llvm/lib/Target/X86/MCA/X86CustomBehaviour.h (+3-4)
  • (modified) llvm/tools/llvm-mca/llvm-mca.cpp (+1-1)
  • (modified) llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp (+5)
diff --git a/llvm/include/llvm/MCA/CustomBehaviour.h b/llvm/include/llvm/MCA/CustomBehaviour.h
index 0ce3993be95ba..8ad674c4ecf13 100644
--- a/llvm/include/llvm/MCA/CustomBehaviour.h
+++ b/llvm/include/llvm/MCA/CustomBehaviour.h
@@ -49,8 +49,7 @@ class InstrPostProcess {
   /// object after it has been lowered from the MCInst.
   /// This is generally a less disruptive alternative to modifying the
   /// scheduling model.
-  virtual void postProcessInstruction(std::unique_ptr<Instruction> &Inst,
-                                      const MCInst &MCI) {}
+  virtual void postProcessInstruction(Instruction &Inst, const MCInst &MCI) {}
 
   // The resetState() method gets invoked at the beginning of each code region
   // so that targets that override this function can clear any state that they
diff --git a/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.cpp b/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.cpp
index b8f43c4550b7e..afaa19013bfc2 100644
--- a/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.cpp
+++ b/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.cpp
@@ -21,8 +21,8 @@
 
 namespace llvm::mca {
 
-void AMDGPUInstrPostProcess::postProcessInstruction(
-    std::unique_ptr<Instruction> &Inst, const MCInst &MCI) {
+void AMDGPUInstrPostProcess::postProcessInstruction(Instruction &Inst,
+                                                    const MCInst &MCI) {
   switch (MCI.getOpcode()) {
   case AMDGPU::S_WAITCNT:
   case AMDGPU::S_WAITCNT_soft:
@@ -44,7 +44,7 @@ void AMDGPUInstrPostProcess::postProcessInstruction(
 
 // s_waitcnt instructions encode important information as immediate operands
 // which are lost during the MCInst -> mca::Instruction lowering.
-void AMDGPUInstrPostProcess::processWaitCnt(std::unique_ptr<Instruction> &Inst,
+void AMDGPUInstrPostProcess::processWaitCnt(Instruction &Inst,
                                             const MCInst &MCI) {
   for (int Idx = 0, N = MCI.size(); Idx < N; Idx++) {
     MCAOperand Op;
@@ -55,7 +55,7 @@ void AMDGPUInstrPostProcess::processWaitCnt(std::unique_ptr<Instruction> &Inst,
       Op = MCAOperand::createImm(MCOp.getImm());
     }
     Op.setIndex(Idx);
-    Inst->addOperand(Op);
+    Inst.addOperand(Op);
   }
 }
 
diff --git a/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.h b/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.h
index 85b9c188b5d1a..cbc7427ce6cdf 100644
--- a/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.h
+++ b/llvm/lib/Target/AMDGPU/MCA/AMDGPUCustomBehaviour.h
@@ -26,7 +26,7 @@ namespace llvm {
 namespace mca {
 
 class AMDGPUInstrPostProcess : public InstrPostProcess {
-  void processWaitCnt(std::unique_ptr<Instruction> &Inst, const MCInst &MCI);
+  void processWaitCnt(Instruction &Inst, const MCInst &MCI);
 
 public:
   AMDGPUInstrPostProcess(const MCSubtargetInfo &STI, const MCInstrInfo &MCII)
@@ -34,8 +34,7 @@ class AMDGPUInstrPostProcess : public InstrPostProcess {
 
   ~AMDGPUInstrPostProcess() = default;
 
-  void postProcessInstruction(std::unique_ptr<Instruction> &Inst,
-                              const MCInst &MCI) override;
+  void postProcessInstruction(Instruction &Inst, const MCInst &MCI) override;
 };
 
 struct WaitCntInfo {
diff --git a/llvm/lib/Target/X86/MCA/X86CustomBehaviour.cpp b/llvm/lib/Target/X86/MCA/X86CustomBehaviour.cpp
index e2a1bbf383b3c..a69a781bf070b 100644
--- a/llvm/lib/Target/X86/MCA/X86CustomBehaviour.cpp
+++ b/llvm/lib/Target/X86/MCA/X86CustomBehaviour.cpp
@@ -20,24 +20,22 @@
 namespace llvm {
 namespace mca {
 
-void X86InstrPostProcess::setMemBarriers(std::unique_ptr<Instruction> &Inst,
-                                         const MCInst &MCI) {
+void X86InstrPostProcess::setMemBarriers(Instruction &Inst, const MCInst &MCI) {
   switch (MCI.getOpcode()) {
   case X86::MFENCE:
-    Inst->setLoadBarrier(true);
-    Inst->setStoreBarrier(true);
+    Inst.setLoadBarrier(true);
+    Inst.setStoreBarrier(true);
     break;
   case X86::LFENCE:
-    Inst->setLoadBarrier(true);
+    Inst.setLoadBarrier(true);
     break;
   case X86::SFENCE:
-    Inst->setStoreBarrier(true);
+    Inst.setStoreBarrier(true);
     break;
   }
 }
 
-void X86InstrPostProcess::useStackEngine(std::unique_ptr<Instruction> &Inst,
-                                         const MCInst &MCI) {
+void X86InstrPostProcess::useStackEngine(Instruction &Inst, const MCInst &MCI) {
   // TODO(boomanaiden154): We currently do not handle PUSHF/POPF because we
   // have not done the necessary benchmarking to see if they are also
   // optimized by the stack engine.
@@ -46,18 +44,18 @@ void X86InstrPostProcess::useStackEngine(std::unique_ptr<Instruction> &Inst,
   // delay subsequent rsp using non-stack instructions.
   if (X86::isPOP(MCI.getOpcode()) || X86::isPUSH(MCI.getOpcode())) {
     auto *StackRegisterDef =
-        llvm::find_if(Inst->getDefs(), [](const WriteState &State) {
+        llvm::find_if(Inst.getDefs(), [](const WriteState &State) {
           return State.getRegisterID() == X86::RSP;
         });
     assert(
-        StackRegisterDef != Inst->getDefs().end() &&
+        StackRegisterDef != Inst.getDefs().end() &&
         "Expected push instruction to implicitly use stack pointer register.");
-    Inst->getDefs().erase(StackRegisterDef);
+    Inst.getDefs().erase(StackRegisterDef);
   }
 }
 
-void X86InstrPostProcess::postProcessInstruction(
-    std::unique_ptr<Instruction> &Inst, const MCInst &MCI) {
+void X86InstrPostProcess::postProcessInstruction(Instruction &Inst,
+                                                 const MCInst &MCI) {
   // Set IsALoadBarrier and IsAStoreBarrier flags.
   setMemBarriers(Inst, MCI);
   useStackEngine(Inst, MCI);
diff --git a/llvm/lib/Target/X86/MCA/X86CustomBehaviour.h b/llvm/lib/Target/X86/MCA/X86CustomBehaviour.h
index c5459e42dfc9f..d6197f3344bbb 100644
--- a/llvm/lib/Target/X86/MCA/X86CustomBehaviour.h
+++ b/llvm/lib/Target/X86/MCA/X86CustomBehaviour.h
@@ -26,12 +26,12 @@ namespace mca {
 class X86InstrPostProcess : public InstrPostProcess {
   /// Called within X86InstrPostProcess to specify certain instructions
   /// as load and store barriers.
-  void setMemBarriers(std::unique_ptr<Instruction> &Inst, const MCInst &MCI);
+  void setMemBarriers(Instruction &Inst, const MCInst &MCI);
 
   /// Called within X86InstrPostPorcess to remove some rsp read operands
   /// on stack instructions to better simulate the stack engine. We currently
   /// do not model features of the stack engine like sync uops.
-  void useStackEngine(std::unique_ptr<Instruction> &Inst, const MCInst &MCI);
+  void useStackEngine(Instruction &Inst, const MCInst &MCI);
 
 public:
   X86InstrPostProcess(const MCSubtargetInfo &STI, const MCInstrInfo &MCII)
@@ -39,8 +39,7 @@ class X86InstrPostProcess : public InstrPostProcess {
 
   ~X86InstrPostProcess() = default;
 
-  void postProcessInstruction(std::unique_ptr<Instruction> &Inst,
-                              const MCInst &MCI) override;
+  void postProcessInstruction(Instruction &Inst, const MCInst &MCI) override;
 };
 
 } // namespace mca
diff --git a/llvm/tools/llvm-mca/llvm-mca.cpp b/llvm/tools/llvm-mca/llvm-mca.cpp
index a4194da4a7b63..a64539c09b81e 100644
--- a/llvm/tools/llvm-mca/llvm-mca.cpp
+++ b/llvm/tools/llvm-mca/llvm-mca.cpp
@@ -668,7 +668,7 @@ int main(int argc, char **argv) {
         return 1;
       }
 
-      IPP->postProcessInstruction(Inst.get(), MCI);
+      IPP->postProcessInstruction(*Inst.get(), MCI);
       InstToInstruments.insert({&MCI, Instruments});
       LoweredSequence.emplace_back(std::move(Inst.get()));
     }
diff --git a/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp b/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
index 17809e7beda95..e2d8148da5b62 100644
--- a/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
+++ b/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
@@ -130,6 +130,9 @@ TEST_F(X86TestBase, TestInstructionRecycling) {
   mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
   IB.setInstRecycleCallback(GetRecycledInst);
 
+  auto IPP = std::unique_ptr<InstrPostProcess>(
+      TheTarget->createInstrPostProcess(*STI, *MCII));
+
   const SmallVector<mca::Instrument *> Instruments;
   // Tile size = 7
   for (unsigned i = 0U, E = MCIs.size(); i < E;) {
@@ -147,8 +150,10 @@ TEST_F(X86TestBase, TestInstructionRecycling) {
                                        });
         ASSERT_FALSE(bool(RemainingE));
         ASSERT_TRUE(RecycledInst);
+        IPP->postProcessInstruction(*RecycledInst, MCIs[i]);
         ISM.addRecycledInst(RecycledInst);
       } else {
+        IPP->postProcessInstruction(*InstOrErr.get(), MCIs[i]);
         ISM.addInst(std::move(InstOrErr.get()));
       }
     }

This patch makes it so that InstrPostProcess::postProcessInstruction
takes in a reference to a mca::Instruction rather than a reference to a
std::unique_ptr. Without this, InstrPostProcess cannot be used with MCA
instruction recycling because it needs to be called on both newly
created instructions and instructions that have been recycled. We only
have access to a raw pointer for instructions that have been recycled
rather than a reference to the std::unique_ptr that owns them.

This patch adds a call in the existing instruction recycling unit test
to ensure the API remains compatible with this use case.
@boomanaiden154 boomanaiden154 force-pushed the instr-post-process-raw-pointer branch from 4b03216 to fac775a Compare September 23, 2025 13:26
@boomanaiden154 boomanaiden154 enabled auto-merge (squash) September 23, 2025 13:26
@boomanaiden154 boomanaiden154 merged commit a04d3ca into llvm:main Sep 23, 2025
9 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 23, 2025

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/31721

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
UNSUPPORTED: lldb-shell :: Process/Windows/process_load.cpp (3188 of 3197)
UNSUPPORTED: lldb-shell :: Driver/TestPageZeroRead.test (3189 of 3197)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/independent_state.test (3190 of 3197)
UNSUPPORTED: lldb-shell :: Host/TestCustomShell.test (3191 of 3197)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/interactive_crashlog_legacy.test (3192 of 3197)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/no_threadState.test (3193 of 3197)
PASS: lldb-api :: terminal/TestEditlineCompletions.py (3194 of 3197)
PASS: lldb-api :: commands/process/attach/TestProcessAttach.py (3195 of 3197)
PASS: lldb-api :: repl/clang/TestClangREPL.py (3196 of 3197)
TIMEOUT: lldb-api :: tools/lldb-dap/module/TestDAP_module.py (3197 of 3197)
******************** TEST 'lldb-api :: tools/lldb-dap/module/TestDAP_module.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib --cmake-build-type Release -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/module -p TestDAP_module.py
--
Exit Code: -9
Timeout: Reached timeout of 600 seconds

Command Output (stdout):
--
lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision a04d3cab293087ec940127401a738820b2fa960b)
  clang revision a04d3cab293087ec940127401a738820b2fa960b
  llvm revision a04d3cab293087ec940127401a738820b2fa960b

--
Command Output (stderr):
--
Change dir to: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/module
runCmd: settings clear --all

output: 

runCmd: settings set symbols.enable-external-lookup false

output: 

runCmd: settings set target.inherit-tcc true

output: 

runCmd: settings set target.disable-aslr false

output: 

runCmd: settings set target.detach-on-error false

output: 

runCmd: settings set target.auto-apply-fixits false

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants