Skip to content

Conversation

@maksfb
Copy link
Contributor

@maksfb maksfb commented Nov 8, 2025

In addition to tracking offsets inside a BinaryFunction that are referenced by data relocations, we need to track those relocations too. Plus, we will need to map symbols referenced by such relocations back to the containing function.

This change introduces BinaryFunction::InternalRefDataRelocations to track the aforementioned relocations and expands
BinaryContext::SymbolToFunctionMap to include local/temp symbols involved in relocation processing.

There is no functional change introduced that should affect the output. Future PRs will use the new tracking capabilities.

In addition to tracking offsets inside a `BinaryFunction` that are
referenced by data relocations, we need to track those relocations too.
Plus, we will need to map symbols referenced by such relocations back to
the containing function.

This change introduces `BinaryFunction::InternalRefDataRelocations` to
track the aforementioned relocations and expands
`BinaryContext::SymbolToFunctionMap` to include local/temp symbols
involved in relocation processing.

There is no functional change introduced that should affect the output.
Future PRs will use the new tracking capabilities.
@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2025

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

In addition to tracking offsets inside a BinaryFunction that are referenced by data relocations, we need to track those relocations too. Plus, we will need to map symbols referenced by such relocations back to the containing function.

This change introduces BinaryFunction::InternalRefDataRelocations to track the aforementioned relocations and expands
BinaryContext::SymbolToFunctionMap to include local/temp symbols involved in relocation processing.

There is no functional change introduced that should affect the output. Future PRs will use the new tracking capabilities.


Full diff: https://github.com/llvm/llvm-project/pull/167074.diff

4 Files Affected:

  • (modified) bolt/include/bolt/Core/BinaryContext.h (+1-1)
  • (modified) bolt/include/bolt/Core/BinaryFunction.h (+28)
  • (modified) bolt/lib/Core/BinaryContext.cpp (+11)
  • (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+4-2)
diff --git a/bolt/include/bolt/Core/BinaryContext.h b/bolt/include/bolt/Core/BinaryContext.h
index 085c0265de3ed..6d3e9d339e99b 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -500,7 +500,7 @@ class BinaryContext {
   ///
   /// As we fold identical functions, multiple symbols can point
   /// to the same BinaryFunction.
-  std::unordered_map<const MCSymbol *, BinaryFunction *> SymbolToFunctionMap;
+  DenseMap<const MCSymbol *, BinaryFunction *> SymbolToFunctionMap;
 
   /// A mutex that is used to control parallel accesses to SymbolToFunctionMap
   mutable llvm::sys::RWMutex SymbolToFunctionMapMutex;
diff --git a/bolt/include/bolt/Core/BinaryFunction.h b/bolt/include/bolt/Core/BinaryFunction.h
index b215a1558cbb4..a720c6af216d7 100644
--- a/bolt/include/bolt/Core/BinaryFunction.h
+++ b/bolt/include/bolt/Core/BinaryFunction.h
@@ -281,6 +281,14 @@ class BinaryFunction {
   /// goto labels.
   std::set<uint64_t> ExternallyReferencedOffsets;
 
+  /// Relocations from data sections targeting internals of this function, i.e.
+  /// some code not at an entry point. These include, but are not limited to,
+  /// jump table relocations and computed goto tables.
+  ///
+  /// Since relocations can be removed/deallocated, we store relocation offsets
+  /// instead of pointers.
+  DenseSet<uint64_t> InternalRefDataRelocations;
+
   /// Offsets of indirect branches with unknown destinations.
   std::set<uint64_t> UnknownIndirectBranchOffsets;
 
@@ -640,6 +648,20 @@ class BinaryFunction {
     Islands->CodeOffsets.emplace(Offset);
   }
 
+  /// Register a relocation from data section referencing code at a non-zero
+  /// offset in this function.
+  void registerInternalRefDataRelocation(uint64_t FuncOffset,
+                                         uint64_t RelOffset) {
+    assert(FuncOffset != 0 && "Relocation should reference function internals");
+    registerReferencedOffset(FuncOffset);
+    InternalRefDataRelocations.insert(RelOffset);
+    const MCSymbol *ReferencedSymbol =
+        getOrCreateLocalLabel(getAddress() + FuncOffset);
+
+    // Track the symbol mapping since it's used in relocation handling.
+    BC.setSymbolToFunctionMap(ReferencedSymbol, this);
+  }
+
   /// Register an internal offset in a function referenced from outside.
   void registerReferencedOffset(uint64_t Offset) {
     ExternallyReferencedOffsets.emplace(Offset);
@@ -1299,6 +1321,12 @@ class BinaryFunction {
   void addRelocation(uint64_t Address, MCSymbol *Symbol, uint32_t RelType,
                      uint64_t Addend, uint64_t Value);
 
+  /// Return locations (offsets) of data section relocations targeting internals
+  /// of this functions.
+  const DenseSet<uint64_t> &getInternalRefDataRelocations() const {
+    return InternalRefDataRelocations;
+  }
+
   /// Return the name of the section this function originated from.
   std::optional<StringRef> getOriginSectionName() const {
     if (!OriginSection)
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index b478925a4d7b7..6a285f5538dbd 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -1520,6 +1520,17 @@ void BinaryContext::foldFunction(BinaryFunction &ChildBF,
   }
   ChildBF.getSymbols().clear();
 
+  // Reset function mapping for local symbols.
+  for (uint64_t RelOffset : ChildBF.getInternalRefDataRelocations()) {
+    const Relocation *Rel = getRelocationAt(RelOffset);
+    if (!Rel || !Rel->Symbol)
+      continue;
+
+    WriteSymbolMapLock.lock();
+    SymbolToFunctionMap[Rel->Symbol] = nullptr;
+    WriteSymbolMapLock.unlock();
+  }
+
   // Move other names the child function is known under.
   llvm::move(ChildBF.Aliases, std::back_inserter(ParentBF.Aliases));
   ChildBF.Aliases.clear();
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index ab3431ef8bd5e..5769577aa3f74 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -2954,8 +2954,10 @@ void RewriteInstance::handleRelocation(const SectionRef &RelocatedSection,
           // if-condition above) so we're handling a relocation from a function
           // to itself. RISC-V uses such relocations for branches, for example.
           // These should not be registered as externally references offsets.
-          if (!ContainingBF)
-            ReferencedBF->registerReferencedOffset(RefFunctionOffset);
+          if (!ContainingBF && !ReferencedBF->isInConstantIsland(Address)) {
+            ReferencedBF->registerInternalRefDataRelocation(RefFunctionOffset,
+                                                            Rel.getOffset());
+          }
         }
         if (opts::Verbosity > 1 &&
             BinarySection(*BC, RelocatedSection).isWritable())

@yozhu
Copy link
Contributor

yozhu commented Nov 8, 2025

LGTM. Thanks!

@maksfb maksfb merged commit af456df into llvm:main Nov 8, 2025
12 checks passed
vinay-deshmukh pushed a commit to vinay-deshmukh/llvm-project that referenced this pull request Nov 8, 2025
In addition to tracking offsets inside a `BinaryFunction` that are
referenced by data relocations, we need to track those relocations too.
Plus, we will need to map symbols referenced by such relocations back to
the containing function.

This change introduces `BinaryFunction::InternalRefDataRelocations` to
track the aforementioned relocations and expands
`BinaryContext::SymbolToFunctionMap` to include local/temp symbols
involved in relocation processing.

There is no functional change introduced that should affect the output.
Future PRs will use the new tracking capabilities.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants