Skip to content

Conversation

@teresajohnson
Copy link
Contributor

Previously we were attempting to remove the memprof-related metadata
when iterating through instructions in the LTO backend. However, we
missed some as there are a number of cases where we skip instructions,
or even entire functions. Simplify the cleanup and ensure all is removed
by doing a full sweep over all instructions after completing cloning.

This is largely NFC except with -memprof-report-hinted-sizes enabled,
because we were propagating and simplifying the metadata after inlining
in the LTO backend, which caused some stray messages as metadata was
re-converted to attributes.

Previously we were attempting to remove the memprof-related metadata
when iterating through instructions in the LTO backend. However, we
missed some as there are a number of cases where we skip instructions,
or even entire functions. Simplify the cleanup and ensure all is removed
by doing a full sweep over all instructions after completing cloning.

This is largely NFC except with -memprof-report-hinted-sizes enabled,
because we were propagating and simplifying the metadata after inlining
in the LTO backend, which caused some stray messages as metadata was
re-converted to attributes.
@llvmbot llvmbot added LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels Oct 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 19, 2024

@llvm/pr-subscribers-lto

@llvm/pr-subscribers-llvm-transforms

Author: Teresa Johnson (teresajohnson)

Changes

Previously we were attempting to remove the memprof-related metadata
when iterating through instructions in the LTO backend. However, we
missed some as there are a number of cases where we skip instructions,
or even entire functions. Simplify the cleanup and ensure all is removed
by doing a full sweep over all instructions after completing cloning.

This is largely NFC except with -memprof-report-hinted-sizes enabled,
because we were propagating and simplifying the metadata after inlining
in the LTO backend, which caused some stray messages as metadata was
re-converted to attributes.


Full diff: https://github.com/llvm/llvm-project/pull/113039.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp (+16-6)
  • (modified) llvm/test/ThinLTO/X86/memprof-icp.ll (+4-1)
diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
index 5ade0db343f278..17f29dc0d580d3 100644
--- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
@@ -4264,9 +4264,6 @@ bool MemProfContextDisambiguation::applyImport(Module &M) {
           AllocVersionsThinBackend++;
           if (!MaxAllocVersionsThinBackend)
             MaxAllocVersionsThinBackend = 1;
-          // Remove any remaining callsite metadata and we can skip the rest of
-          // the handling for this instruction, since no cloning needed.
-          I.setMetadata(LLVMContext::MD_callsite, nullptr);
           continue;
         }
 
@@ -4419,9 +4416,6 @@ bool MemProfContextDisambiguation::applyImport(Module &M) {
             CloneCallsite(Callsite->second, CB, CalledFunction);
           }
         }
-        // Memprof and callsite metadata on memory allocations no longer needed.
-        I.setMetadata(LLVMContext::MD_memprof, nullptr);
-        I.setMetadata(LLVMContext::MD_callsite, nullptr);
       }
     }
 
@@ -4429,6 +4423,22 @@ bool MemProfContextDisambiguation::applyImport(Module &M) {
     performICP(M, FS->callsites(), VMaps, ICallAnalysisInfo, ORE);
   }
 
+  // We skip some of the functions and instructions above, so remove all the
+  // metadata in a single sweep here.
+  for (auto &F : M) {
+    // We can skip memprof clones because createFunctionClones already strips
+    // the metadata from the newly created clones.
+    if (F.isDeclaration() || isMemProfClone(F))
+      continue;
+    for (auto &BB : F) {
+      for (auto &I : BB) {
+        // Memprof and callsite metadata on memory allocations no longer needed.
+        I.setMetadata(LLVMContext::MD_memprof, nullptr);
+        I.setMetadata(LLVMContext::MD_callsite, nullptr);
+      }
+    }
+  }
+
   return Changed;
 }
 
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index 2e976794425bbe..f17e19e1f77ef2 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -176,7 +176,10 @@
 ; RUN:  -pass-remarks=. -save-temps \
 ; RUN:  -o %t.noicp.out 2>&1 | FileCheck %s --implicit-check-not "created clone"
 
-; RUN: llvm-dis %t.noicp.out.2.4.opt.bc -o - | FileCheck %s --implicit-check-not "_Z3fooR2B0j.memprof"
+;; Verify that we did not do any cloning of the function with the indirect call
+;; when memprof ICP is off. However, we should still have removed the callsite
+;; metadata.
+; RUN: llvm-dis %t.noicp.out.2.4.opt.bc -o - | FileCheck %s --implicit-check-not "_Z3fooR2B0j.memprof" --implicit-check-not "!callsite"
 
 ; REMARKS-MAIN: call in clone main assigned to call function clone _Z3fooR2B0j.memprof.1
 ; REMARKS-MAIN: call in clone main assigned to call function clone _Z3fooR2B0j.memprof.1

if (F.isDeclaration() || isMemProfClone(F))
continue;
for (auto &BB : F) {
for (auto &I : BB) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also filter by CallInst to so that we don't call setMetadata unnecessarily?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@teresajohnson teresajohnson merged commit 120e42d into llvm:main Oct 21, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants