Skip to content

Conversation

@rampitec
Copy link
Collaborator

Both methods are equally inaccurate, we need to switch to MCExpr
for better results in the future.

Copy link
Collaborator Author

rampitec commented Feb 14, 2025

@llvmbot
Copy link
Member

llvmbot commented Feb 14, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

Both methods are equally inaccurate, we need to switch to MCExpr
for better results in the future.


Full diff: https://github.com/llvm/llvm-project/pull/127246.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/SIProgramInfo.cpp (+4-22)
  • (modified) llvm/lib/Target/AMDGPU/SIProgramInfo.h (+1-1)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index baacb5d3d5455..08776ab525d6e 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -8978,7 +8978,7 @@ unsigned SIInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
     return getInlineAsmLength(AsmStr, *MF->getTarget().getMCAsmInfo(), &ST);
   }
   default:
-    if (MI.isMetaInstruction())
+    if (MI.isMetaInstruction() || MI.isDebugInstr())
       return 0;
     return DescSize;
   }
diff --git a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
index 5179288084010..7169eebf907ca 100644
--- a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
@@ -202,27 +202,9 @@ const MCExpr *SIProgramInfo::getPGMRSrc2(CallingConv::ID CC,
   return MCConstantExpr::create(0, Ctx);
 }
 
-uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) {
-  if (CodeSizeInBytes.has_value())
-    return *CodeSizeInBytes;
+uint64_t SIProgramInfo::getFunctionCodeSize(MachineFunction &MF) {
+  if (!CodeSizeInBytes.has_value())
+    CodeSizeInBytes = MF.estimateFunctionSizeInBytes();
 
-  const GCNSubtarget &STM = MF.getSubtarget<GCNSubtarget>();
-  const SIInstrInfo *TII = STM.getInstrInfo();
-
-  uint64_t CodeSize = 0;
-
-  for (const MachineBasicBlock &MBB : MF) {
-    for (const MachineInstr &MI : MBB) {
-      // TODO: CodeSize should account for multiple functions.
-
-      // TODO: Should we count size of debug info?
-      if (MI.isDebugInstr())
-        continue;
-
-      CodeSize += TII->getInstSizeInBytes(MI);
-    }
-  }
-
-  CodeSizeInBytes = CodeSize;
-  return CodeSize;
+  return *CodeSizeInBytes;
 }
diff --git a/llvm/lib/Target/AMDGPU/SIProgramInfo.h b/llvm/lib/Target/AMDGPU/SIProgramInfo.h
index d7087436ae758..65f8bee1c5118 100644
--- a/llvm/lib/Target/AMDGPU/SIProgramInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIProgramInfo.h
@@ -101,7 +101,7 @@ struct LLVM_EXTERNAL_VISIBILITY SIProgramInfo {
   void reset(const MachineFunction &MF);
 
   // Get function code size and cache the value.
-  uint64_t getFunctionCodeSize(const MachineFunction &MF);
+  uint64_t getFunctionCodeSize(MachineFunction &MF);
 
   /// Compute the value of the ComputePGMRsrc1 register.
   const MCExpr *getComputePGMRSrc1(const GCNSubtarget &ST,

}
default:
if (MI.isMetaInstruction())
if (MI.isMetaInstruction() || MI.isDebugInstr())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect DescSize to be correct for debug insts

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure downstream staging branch has adequate testing for this assumption.

Base automatically changed from users/rampitec/02-13-_amdgpu_move_into_siprograminfo_and_cache_getfunctioncodesize._nfci to main February 18, 2025 02:22
Both methods are equally inaccurate, we need to switch to MCExpr
for better results in the future.
@rampitec rampitec force-pushed the users/rampitec/02-14-_amdgpu_switch_to_mf.estimatefunctionsizeinbytes_ branch from 99b5a59 to a35a50c Compare February 18, 2025 21:44
# CHECK: .p2align 5
# CHECK: s_endpgm ; encoding: [0x00,0x00,0x81,0xbf]
# CHECK: ; codeLenInByte = 36
# CHECK: ; codeLenInByte = 64
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is rebased. Now it shows how this method can largely overestimate code size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants