@melver melver commented Nov 24, 2025

Unconditionally add AllocTokenPass to the optimization pipelines, and
ensure that it runs last in LTO backend pipelines. The latter ensures
that AllocToken instrumentation can be moved later in the LTO pipeline
to avoid interference with other optimizations (e.g. PGHO) and enable
late heap-allocation optimizations.
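
The ordering requirement above can be pictured with a small toy model. This is only a sketch: `ToyPassBuilder`, `registerLastCallback`, and the pass labels are hypothetical stand-ins and not LLVM's `PassBuilder` API. It shows callbacks registered at a "runs last" extension point being appended after all optimization passes, so instrumentation cannot disturb earlier rewrites.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a pass pipeline: each "pass" is just a label.
using Pipeline = std::vector<std::string>;

// Toy stand-in for a pass builder with one "runs last" extension point.
// This is NOT llvm::PassBuilder; it only models the ordering guarantee.
class ToyPassBuilder {
public:
  using Callback = std::function<void(Pipeline &)>;

  // Remember a callback to invoke after all optimization passes are added.
  void registerLastCallback(Callback CB) {
    LastCallbacks.push_back(std::move(CB));
  }

  // Build the pipeline: optimizations first, then every "last" callback
  // in registration order.
  Pipeline build(const std::vector<std::string> &Optimizations) const {
    Pipeline P(Optimizations.begin(), Optimizations.end());
    for (const Callback &CB : LastCallbacks)
      CB(P);
    return P;
  }

private:
  std::vector<Callback> LastCallbacks;
};

// Build a toy backend pipeline in which instrumentation is appended last.
// The labels are placeholders, not real LLVM pass names.
inline Pipeline buildToyBackendPipeline() {
  ToyPassBuilder PB;
  PB.registerLastCallback([](Pipeline &P) { P.push_back("alloc-token"); });
  return PB.build({"inline", "pgho-rewrite", "instcombine"});
}
```

In this model the registration can happen before the optimization list is known, which is why an extension-point callback suits late instrumentation better than adding the pass at a fixed position.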

In preparation for removing Clang's explicit insertion of AllocTokenPass,
add support for AllocTokenPass to read configuration options from LLVM
module flags.

Because the pass now runs unconditionally, only retrieve
TargetLibraryInfo and OptimizationRemarkEmitter when necessary.
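
The "only retrieve when necessary" idea can be sketched as lazy initialization. This is a generic illustration, not the pass's actual code: `Lazy`, `ToyLibraryInfo`, and `countAnalysisRuns` are hypothetical names. The expensive result is computed on first use and never computed when nothing needs instrumenting.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Caches the result of an expensive computation and runs it at most once,
// on first access.
template <typename T> class Lazy {
public:
  explicit Lazy(std::function<T()> Compute) : Compute(std::move(Compute)) {}

  // Compute on first call, then reuse the cached value.
  T &get() {
    if (!Value)
      Value.emplace(Compute());
    return *Value;
  }

  bool computed() const { return Value.has_value(); }

private:
  std::function<T()> Compute;
  std::optional<T> Value;
};

// Hypothetical stand-in for an analysis result.
struct ToyLibraryInfo {
  int Id;
};

// Returns how many times the "analysis" was computed while visiting
// NumCandidateCalls calls that each need it.
inline int countAnalysisRuns(int NumCandidateCalls) {
  int Runs = 0;
  Lazy<ToyLibraryInfo> TLI([&Runs] {
    ++Runs;
    return ToyLibraryInfo{42};
  });
  for (int I = 0; I < NumCandidateCalls; ++I)
    (void)TLI.get(); // first access computes, later accesses reuse
  return Runs;
}
```

With zero candidate calls the computation never runs, which is the case that matters once a pass is scheduled for every module.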


This change is part of the following series:

  1. [InstCombine][MemProf] Preserve all metadata #169242
  2. [Clang][MemProf] Add end-to-end test for PGHO rewriting #169243
  3. [LTO][AllocToken] Support AllocToken instrumentation in backend #169358
  4. [Clang][CodeGen] Remove explicit insertion of AllocToken pass #169360

Created using spr 1.3.8-beta.1

llvmbot commented Nov 24, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-lto

Author: Marco Elver (melver)

Changes

Add support for running AllocTokenPass during the LTO backend phase,
controlled by a new internal option -lto-alloc-token-mode.

This is required to support running the pass after other heap allocation
optimizations (such as PGHO) in the LTO backend, avoiding interference
between them.
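
The option handling described here (empty string means disabled, an unknown string is an error) can be sketched without LLVM headers. Everything below is an assumption for illustration: `ToyTokenMode`, `parseToyTokenMode`, and `resolveToyTokenMode` are hypothetical, the mode names are an illustrative subset, and the sketch throws an exception where the patch calls `report_fatal_error`.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Illustrative subset of modes; the real set is defined by LLVM.
enum class ToyTokenMode { Increment, Random, TypeHash };

// Map a mode name to an enum value; std::nullopt signals an unknown name.
inline std::optional<ToyTokenMode> parseToyTokenMode(std::string_view Name) {
  if (Name == "increment")
    return ToyTokenMode::Increment;
  if (Name == "random")
    return ToyTokenMode::Random;
  if (Name == "typehash")
    return ToyTokenMode::TypeHash;
  return std::nullopt;
}

// Resolve the option string:
//   - empty string: instrumentation stays disabled (std::nullopt)
//   - known name:   the parsed mode
//   - anything else: an error (the patch aborts; this sketch throws)
inline std::optional<ToyTokenMode>
resolveToyTokenMode(const std::string &Option) {
  if (Option.empty())
    return std::nullopt;
  if (auto Mode = parseToyTokenMode(Option))
    return Mode;
  throw std::invalid_argument("invalid lto-alloc-token-mode: " + Option);
}
```

Failing loudly on an unknown name avoids silently linking without instrumentation when the option is misspelled.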


This change is part of the following series:

  1. [InstCombine][MemProf] Preserve all metadata #169242
  2. [Clang][MemProf] Add end-to-end test for PGHO rewriting #169243
  3. [LTO][AllocToken] Support AllocToken instrumentation in backend #169358
  4. [Clang] Make -falloc-token-mode a hidden frontend option #169359
  5. [Clang][CodeGen] Remove explicit insertion of AllocToken pass #169360

Full diff: https://github.com/llvm/llvm-project/pull/169358.diff

2 Files Affected:

  • (modified) llvm/lib/LTO/LTOBackend.cpp (+33)
  • (added) llvm/test/LTO/X86/alloc-token.ll (+34)
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index 93118becedbac..190cec4f5701c 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -31,6 +31,7 @@
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Passes/PassPlugin.h"
 #include "llvm/Passes/StandardInstrumentations.h"
+#include "llvm/Support/AllocToken.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/MemoryBuffer.h"
@@ -42,6 +43,7 @@
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/TargetParser/SubtargetFeature.h"
 #include "llvm/Transforms/IPO/WholeProgramDevirt.h"
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
 #include "llvm/Transforms/Utils/FunctionImportUtils.h"
 #include "llvm/Transforms/Utils/SplitModule.h"
 #include <optional>
@@ -68,6 +70,10 @@ static cl::opt<LTOBitcodeEmbedding> EmbedBitcode(
                           "Embed post merge, but before optimizations")),
     cl::desc("Embed LLVM bitcode in object files produced by LTO"));
 
+static cl::opt<std::string> LTOAllocTokenMode(
+    "lto-alloc-token-mode", cl::init(""),
+    cl::desc("Enable AllocToken instrumentation during LTO with chosen mode"));
+
 static cl::opt<bool> ThinLTOAssumeMerged(
     "thinlto-assume-merged", cl::init(false),
     cl::desc("Assume the input has already undergone ThinLTO function "
@@ -198,6 +204,31 @@ static void RegisterPassPlugins(ArrayRef<std::string> PassPlugins,
   }
 }
 
+// Register instrumentation passes that need to run late in the pipeline; these
+// are non-optimization passes and need to run after most optimizations to avoid
+// interfering with them (e.g. PGHO) or to capture the final state of the code.
+static void registerBackendInstrumentation(PassBuilder &PB) {
+  if (!LTOAllocTokenMode.empty()) {
+    AllocTokenOptions Opts;
+    if (auto Mode = getAllocTokenModeFromString(LTOAllocTokenMode))
+      Opts.Mode = *Mode;
+    else
+      report_fatal_error("invalid lto-alloc-token-mode: " +
+                         Twine(LTOAllocTokenMode));
+
+    // ThinLTO backend
+    PB.registerOptimizerLastEPCallback(
+        [Opts](ModulePassManager &MPM, OptimizationLevel, ThinOrFullLTOPhase) {
+          MPM.addPass(AllocTokenPass(Opts));
+        });
+    // Full LTO backend
+    PB.registerFullLinkTimeOptimizationLastEPCallback(
+        [Opts](ModulePassManager &MPM, OptimizationLevel) {
+          MPM.addPass(AllocTokenPass(Opts));
+        });
+  }
+}
+
 static std::unique_ptr<TargetMachine>
 createTargetMachine(const Config &Conf, const Target *TheTarget, Module &M) {
   const Triple &TheTriple = M.getTargetTriple();
@@ -277,6 +308,8 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM,
 
   RegisterPassPlugins(Conf.PassPlugins, PB);
 
+  registerBackendInstrumentation(PB);
+
   std::unique_ptr<TargetLibraryInfoImpl> TLII(
       new TargetLibraryInfoImpl(TM->getTargetTriple()));
   if (Conf.Freestanding)
diff --git a/llvm/test/LTO/X86/alloc-token.ll b/llvm/test/LTO/X86/alloc-token.ll
new file mode 100644
index 0000000000000..873b7c94620bc
--- /dev/null
+++ b/llvm/test/LTO/X86/alloc-token.ll
@@ -0,0 +1,34 @@
+; RUN: llvm-as %s -o %t.bc
+;
+; RUN: llvm-lto2 run -lto-alloc-token-mode=default %t.bc -o %t.out \
+; RUN:   -r=%t.bc,main,plx \
+; RUN:   -r=%t.bc,_Znwm, \
+; RUN:   -r=%t.bc,sink,pl
+; RUN: llvm-objdump -d -r %t.out.0 | FileCheck %s --check-prefixes=CHECK,DEFAULT
+;
+; RUN: llvm-lto2 run -lto-alloc-token-mode=default -alloc-token-fast-abi -alloc-token-max=1 %t.bc -o %t.out \
+; RUN:   -r=%t.bc,main,plx \
+; RUN:   -r=%t.bc,_Znwm, \
+; RUN:   -r=%t.bc,sink,pl
+; RUN: llvm-objdump -d -r %t.out.0 | FileCheck %s --check-prefixes=CHECK,FASTABI
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare ptr @_Znwm(i64) #0
+
+@sink = global ptr null
+
+; CHECK-LABEL: <main>:
+; CHECK: callq
+; DEFAULT-NEXT: R_X86_64_PLT32 __alloc_token__Znwm
+; FASTABI-NEXT: R_X86_64_PLT32 __alloc_token_0__Znwm
+define void @main() sanitize_alloc_token {
+  %call = call ptr @_Znwm(i64 8) #0, !alloc_token !0
+  store volatile ptr %call, ptr @sink
+  ret void
+}
+
+attributes #0 = { nobuiltin allocsize(0) }
+
+!0 = !{!"int", i1 0}
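
The two FileCheck prefixes in the test above differ only in the callee name of the rewritten call. A minimal sketch of that naming pattern, inferred solely from the `DEFAULT` and `FASTABI` check lines: `toyInstrumentedName` and `toyBoundToken` are hypothetical helpers, and the modulo reduction is an assumption about how a maximum of 1 yields token 0.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Compose the instrumented callee name seen in the check lines:
//   default:  __alloc_token_<callee>
//   fast ABI: __alloc_token_<id>_<callee>
inline std::string
toyInstrumentedName(const std::string &Callee,
                    std::optional<std::uint64_t> FastAbiToken) {
  std::string Name = "__alloc_token_";
  if (FastAbiToken)
    Name += std::to_string(*FastAbiToken) + "_";
  return Name + Callee;
}

// Assumption: a non-zero maximum bounds the token ID by modulo reduction,
// so a maximum of 1 leaves 0 as the only possible ID.
inline std::uint64_t toyBoundToken(std::uint64_t RawToken,
                                   std::uint64_t MaxTokens) {
  return MaxTokens == 0 ? RawToken : RawToken % MaxTokens;
}
```

Under these assumptions, `-alloc-token-max=1` makes the fast-ABI symbol deterministic, which is what lets the test match a fixed relocation name.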

@vitalybuka vitalybuka requested a review from pcc November 24, 2025 21:00
@melver melver changed the base branch from users/melver/spr/main.lto-support-enabling-alloctoken-instrumentation-in-backend to main November 25, 2025 16:22
@melver melver changed the title [LTO] Support enabling AllocToken instrumentation in backend [LTO][AllocToken] Support AllocToken instrumentation in backend Nov 26, 2025
@melver melver requested a review from teresajohnson November 26, 2025 15:57
@teresajohnson teresajohnson left a comment

Could you also add the alloc token metadata, and/or whatever else is needed to cause the allocator calls to kick in, to at least one of the operator new calls in llvm/test/Transforms/InstCombine/simplify-libcalls-new.ll (or create a new small test that has this), and invoke various pipelines on it via opt to ensure we still get the expected transformations (to hot cold operator new and subsequently the alloc token version)? E.g. opt -passes='thinlto<O2>' and opt -passes='lto<O2>' ?

@llvmbot llvmbot added backend:AArch64 clang:codegen IR generation bugs: mangling, exceptions, etc. labels Nov 27, 2025
@melver melver requested a review from teresajohnson November 27, 2025 15:04
@teresajohnson

> Could you also add the alloc token metadata, and/or whatever else is needed to cause the allocator calls to kick in, to at least one of the operator new calls in llvm/test/Transforms/InstCombine/simplify-libcalls-new.ll (or create a new small test that has this), and invoke various pipelines on it via opt to ensure we still get the expected transformations (to hot cold operator new and subsequently the alloc token version)? E.g. opt -passes='thinlto<O2>' and opt -passes='lto<O2>' ?

Largely lgtm to me, except I had a comment about simplifying the cl::opt checks. Also, please add a change to an existing memprof test like suggested above - this will ensure the phase ordering stays correct.

melver commented Nov 27, 2025

> Could you also add the alloc token metadata, and/or whatever else is needed to cause the allocator calls to kick in, to at least one of the operator new calls in llvm/test/Transforms/InstCombine/simplify-libcalls-new.ll (or create a new small test that has this), and invoke various pipelines on it via opt to ensure we still get the expected transformations (to hot cold operator new and subsequently the alloc token version)? E.g. opt -passes='thinlto<O2>' and opt -passes='lto<O2>' ?
>
> Largely lgtm to me, except I had a comment about simplifying the cl::opt checks.

Done.

> Also, please add a change to an existing memprof test like suggested above - this will ensure the phase ordering stays correct.

Sorry, missed that.
Added 2 new tests - it seems that we need -supports-hot-cold-new and the index so the LTO pipeline doesn't remove the hot/cold hints, correct? I added a dedicated test to the AllocToken suite and another in the LTO/X86 suite, the latter of which should ensure we catch any deeper regressions.

PTAL.

@melver melver requested a review from teresajohnson November 27, 2025 16:12
@teresajohnson

> Also, please add a change to an existing memprof test like suggested above - this will ensure the phase ordering stays correct.
>
> Sorry, missed that. Added 2 new tests - it seems that we need -supports-hot-cold-new and the index so the LTO pipeline doesn't remove the hot/cold hints, correct? I added a dedicated test to the AllocToken suite and another in the LTO/X86 suite, the latter of which should ensure we catch any deeper regressions.

Ah, that is a complication. I should put that pass under an internal option so it could be disabled for testing, but what you did here lgtm.

@teresajohnson teresajohnson left a comment

lgtm

Labels

backend:AArch64 clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO)
