@pranavk pranavk commented Sep 23, 2025

On AArch64, when the machine outliner separates an ADRP from the instructions that use its result (LDR, ADD, etc.) and these instructions reference a GOT symbol, the split exposes a correctness issue in linker ICF. In such cases, the user instructions can end up pointing to a folded section (with its canonical folded symbol), while the ADRP instruction points to a GOT entry corresponding to the original symbol. This leads to a load from an incorrect memory address after ICF. #129122 explains in detail how this can happen.

This addresses #131660, which should achieve two things:

  1. Hide the correctness issue described above in the LLVM linker.
  2. Allow optimizations that could relax GOT addressing to PC-relative addressing.

Fixes llvm#131660

Earlier attempts to fix this in the linker were not accepted. The current
attempt is pending at llvm#139493
@llvm llvm deleted a comment from github-actions bot Sep 23, 2025
@pranavk pranavk marked this pull request as ready for review September 23, 2025 22:07

llvmbot commented Sep 23, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Pranav Kant (pranavk)

Changes

Fixes #131660

Earlier attempts to fix this in the linker were not accepted. The current linker attempt is pending at #139493.


Full diff: https://github.com/llvm/llvm-project/pull/160232.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+26-4)
  • (added) llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir (+130)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 5a51c812732e6..8880ca455c1f6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -10179,11 +10179,33 @@ AArch64InstrInfo::getOutliningTypeImpl(const MachineModuleInfo &MMI,
       return outliner::InstrType::Illegal;
   }
 
-  // Special cases for instructions that can always be outlined, but will fail
-  // the later tests. e.g, ADRPs, which are PC-relative use LR, but can always
-  // be outlined because they don't require a *specific* value to be in LR.
-  if (MI.getOpcode() == AArch64::ADRP)
+  // An ADRP instruction referencing a GOT should not be outlined.
+  // This is to avoid splitting ADRP/(LDR/ADD/etc.) pair into different
+  // functions which can lead to linker ICF merging sections incorrectly.
+  if (MI.getOpcode() == AArch64::ADRP) {
+    bool IsPage = (MI.getOperand(1).getTargetFlags() & AArch64II::MO_PAGE) != 0;
+    bool IsGot = (MI.getOperand(1).getTargetFlags() & AArch64II::MO_GOT) != 0;
+    if (IsPage && IsGot)
+      return outliner::InstrType::Illegal;
+
+    // Special cases for instructions that can always be outlined, but will fail
+    // the later tests. e.g, ADRPs, which are PC-relative use LR, but can always
+    // be outlined because they don't require a *specific* value to be in LR.
     return outliner::InstrType::Legal;
+  }
+
+  // Similarly, any user of ADRP instruction referencing a GOT should not be
+  // outlined. It's hard/costly to check exact users of ADRP. So we use check
+  // all operands and reject any that's a page offset and references a GOT.
+  const auto &F = MI.getMF()->getFunction();
+  for (const auto &MO : MI.operands()) {
+    bool IsPageOff = (MO.getTargetFlags() & AArch64II::MO_PAGEOFF) != 0;
+    bool IsGot = (MO.getTargetFlags() & AArch64II::MO_GOT) != 0;
+    if (IsPageOff && IsGot &&
+        (MI.getMF()->getTarget().getFunctionSections() || F.hasComdat() ||
+         F.hasSection() || F.getSectionPrefix()))
+      return outliner::InstrType::Illegal;
+  }
 
   // If MI is a call we might be able to outline it. We don't want to outline
   // any calls that rely on the position of items on the stack. When we outline
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir b/llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir
new file mode 100644
index 0000000000000..169835809d6ba
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir
@@ -0,0 +1,130 @@
+# RUN: llc -mtriple=aarch64---  -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
+--- |
+
+  @x = common global i32 0, align 4
+
+  define i32 @adrp_add() #0 {
+    ret i32 0
+  }
+
+  define i32 @adrp_ldr() #0 {
+    ret i32 0
+  }
+
+  define void @bar(i32 %a) #0 {
+    ret void
+  }
+
+  attributes #0 = { noinline noredzone }
+...
+---
+# This test ensures that we do not outline ADRP / ADD pair when it's referencing 
+# a GOT entry.
+#
+# CHECK-LABEL: name: adrp_add
+# CHECK-DAG: bb.0:
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+
+# CHECK-DAG: bb.1
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+
+# CHECK-DAG: bb.2
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+name:            adrp_add
+tracksRegLiveness: true
+body:             |
+  bb.0:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+    $lr = ORRXri $xzr, 1
+  bb.1:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+    $lr = ORRXri $xzr, 1
+  bb.2:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+    $lr = ORRXri $xzr, 1
+  bb.3:
+  liveins: $lr
+    RET undef $lr
+...
+---
+# This test ensures that we do not outline ADRP / LDR pair when it's referencing 
+# a GOT entry.
+#
+# CHECK-LABEL: name: adrp_ldr
+# CHECK-DAG: bb.0:
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+
+# CHECK-DAG: bb.1
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+
+# CHECK-DAG: bb.2
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+name:            adrp_ldr
+tracksRegLiveness: true
+body:             |
+  bb.0:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+    $lr = ORRXri $xzr, 1
+  bb.1:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+    $lr = ORRXri $xzr, 1
+  bb.2:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+    $lr = ORRXri $xzr, 1
+  bb.3:
+  liveins: $lr
+    RET undef $lr
\ No newline at end of file
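
The per-instruction decision made by the diff above can be modelled outside LLVM as a rough sketch. Everything here is a hypothetical stand-in: the `Operand`, `Inst` and `FunctionInfo` structs, the `classify` name, and the flag bit values are made up for illustration and are not the real `AArch64II` encodings or `MachineInstr` API. Only the shape of the two checks follows the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flag bits, for illustration only. The real values are defined
// by AArch64II in LLVM and are NOT reproduced here.
namespace flags {
constexpr unsigned Page = 1u << 0;
constexpr unsigned PageOff = 1u << 1;
constexpr unsigned Got = 1u << 4;
} // namespace flags

enum class InstrType { Legal, Illegal, Unknown };

// Simplified stand-ins for MachineOperand, MachineInstr and the function
// properties that the patch queries.
struct Operand {
  unsigned targetFlags;
};
struct Inst {
  bool isAdrp;
  std::vector<Operand> operands;
};
struct FunctionInfo {
  bool functionSections; // built with function sections enabled
  bool hasComdat;
  bool hasSection;
  bool hasSectionPrefix;
};

InstrType classify(const Inst &MI, const FunctionInfo &F) {
  if (MI.isAdrp) {
    assert(MI.operands.size() >= 2 && "ADRP needs a destination and an address");
    unsigned TF = MI.operands[1].targetFlags;
    // An ADRP that addresses a GOT page is never outlined.
    if ((TF & flags::Page) && (TF & flags::Got))
      return InstrType::Illegal;
    // Any other ADRP can always be outlined.
    return InstrType::Legal;
  }

  // Users of a GOT ADRP are rejected only when the function may be placed in
  // its own section, mirroring the condition in the patch.
  bool mayGetOwnSection = F.functionSections || F.hasComdat || F.hasSection ||
                          F.hasSectionPrefix;
  for (const Operand &MO : MI.operands) {
    bool isPageOff = (MO.targetFlags & flags::PageOff) != 0;
    bool isGot = (MO.targetFlags & flags::Got) != 0;
    if (isPageOff && isGot && mayGetOwnSection)
      return InstrType::Illegal;
  }
  // In the real function, control falls through to the remaining checks.
  return InstrType::Unknown;
}
```

Note the asymmetry carried over from the patch: the ADRP side is rejected unconditionally, while the user side (LDR, ADD, etc.) is rejected only when the section-related conditions hold.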

@pranavk pranavk requested review from MaskRay, rnk and smithp35 September 23, 2025 22:08
@fhahn fhahn requested review from aemerson and ornata September 24, 2025 09:25

@smithp35 smithp35 left a comment

Thanks for having a go at this. I'm not an expert on the outliner; ideally we can get someone who is to review.

I've made some suggestions based on what I know of the AArch64 instruction set.

Please could you add a full description (for the commit message)? It will be really useful to see that in place from git log on a terminal.

@davemgreen davemgreen left a comment

Hi. The idea sounds OK to me. Would it be valid to outline so long as we always outlined both instructions (adrp+add, adrp+ldr) together? We fuse the adrp+add so they should generally be scheduled next to one another.

@smithp35

> Hi. The idea sounds OK to me. Would it be valid to outline so long as we always outlined both instructions (adrp+add, adrp+ldr) together? We fuse the adrp+add so they should generally be scheduled next to one another.

Yes, as long as the adrp+add, adrp+ldr are in the same section, we don't get the problem, as ICF will remove both or none of them. The comment in #129122 (comment) has a good summary of the chain of events that lead to the problem.

lenary commented Oct 1, 2025

I have been wondering on the RISC-V side if we can add back our ADRP-like instruction to the outlined segment if we find we have only outlined the instruction with the lo operand. This would potentially add overhead to sequences where only the lo instruction has been outlined, and might cause redundant ADRP-likes, but I think it might allow more outlining? I think this should be doable in buildOutlinedFrame, but additional work would also have to be done in getOutliningCandidateInfo to correctly set the FrameOverhead.

I don't know how this approach would cope if the outliner tries to only outline the ADRP-like, rather than the lo instruction. You'd presumably want to replicate the ADRPs from inside the outlined part to immediately after the call to the outlined sequence? That sounds like it might be too much overhead, but the cost model might catch that.
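
To make the overhead concern concrete, here is a back-of-the-envelope sketch of how re-adding an ADRP-like instruction to the outlined function would shift the size benefit. The `Candidate` struct, `benefitBytes`, and the byte counts are hypothetical and only loosely follow the outliner's cost model (bytes saved at the call sites versus bytes spent on calls, the outlined body, and its frame). It is not the actual `getOutliningCandidateInfo` logic.

```cpp
#include <cassert>

// Hypothetical, simplified description of one outlining opportunity.
struct Candidate {
  unsigned occurrences;        // how many times the sequence appears
  unsigned sequenceBytes;      // size of the repeated sequence
  unsigned callOverheadBytes;  // bytes needed at each call site
  unsigned frameOverheadBytes; // bytes added once in the outlined function
};

// Net bytes saved by outlining. extraHiBytes models one re-materialized
// ADRP-like instruction placed in the outlined function (0 if none).
long benefitBytes(const Candidate &C, unsigned extraHiBytes) {
  assert(C.occurrences >= 2 && "outlining needs at least two occurrences");
  long notOutlined = static_cast<long>(C.occurrences) * C.sequenceBytes;
  long outlined = static_cast<long>(C.occurrences) * C.callOverheadBytes +
                  C.sequenceBytes + C.frameOverheadBytes + extraHiBytes;
  return notOutlined - outlined;
}
```

With these made-up numbers, a 16-byte sequence seen three times saves 16 bytes normally and 12 bytes with one extra 4-byte instruction, while a borderline candidate (12 bytes seen twice) goes from break-even to a net loss, which is the case a cost model would need to reject.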

dtellenbach commented Oct 4, 2025

I'm wondering if it would make sense to do this in getOutliningCandidateInfo and just remove the candidates that split up an adrp, add, ldr sequence. getOutliningTypeImpl is better suited for preventing individual instructions from being considered as part of candidates but doesn't seem a perfect fit if you want to allow the whole sequence to be outlined but not parts of it.

For detecting that a sequence has been split, you can do what @smithp35 mentioned earlier.

If an outline candidate contains
`add x0, x0, :got_lo12:sym` or `ldr x0, [x0, :got_lo12:sym]` but no preceding `adrp x0, :got:sym`, then we know that an adrp, add or adrp, ldr sequence has been split up, as `add x0, x0, :got_lo12:sym` and `ldr x0, [x0, :got_lo12:sym]` don't make any sense on their own.
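
A minimal sketch of that check, written against a toy instruction model rather than LLVM's `MachineInstr`. The `Kind` enum, the `Inst` struct and `splitsGotPair` are hypothetical names. The last line also flags the mirrored case (an ADRP in the candidate with no user inside it), which is an assumption added here and not something stated above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified instruction kinds; not LLVM types.
enum class Kind { AdrpGot, GotLo12User, Other };

struct Inst {
  Kind kind;
  std::string sym; // symbol referenced through the GOT; empty for Other
};

// Returns true if the candidate holds only one half of an
// ADRP + :got_lo12: pair, i.e. outlining it would split the pair.
bool splitsGotPair(const std::vector<Inst> &candidate) {
  std::set<std::string> seenAdrp; // symbols with an ADRP inside the candidate
  std::set<std::string> used;     // symbols whose ADRP has a user inside it
  for (const Inst &I : candidate) {
    assert((I.kind == Kind::Other || !I.sym.empty()) &&
           "GOT references need a symbol");
    if (I.kind == Kind::AdrpGot) {
      seenAdrp.insert(I.sym);
    } else if (I.kind == Kind::GotLo12User) {
      // A :got_lo12: user with no preceding ADRP for the same symbol.
      if (seenAdrp.count(I.sym) == 0)
        return true;
      used.insert(I.sym);
    }
  }
  // Assumption: an ADRP whose user lies outside the candidate is also a split.
  // used is always a subset of seenAdrp, so comparing sizes is sufficient.
  return used.size() != seenAdrp.size();
}
```

A candidate containing both the ADRP and its user for the same symbol passes. A candidate that starts at the `ldr`, ends at the `adrp`, or pairs an ADRP with a user of a different symbol is reported as a split.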

@dtellenbach dtellenbach self-requested a review October 4, 2025 21:18

pranavk commented Oct 9, 2025

> I'm wondering if it would make sense to do this in getOutliningCandidateInfo and just remove the candidates that split up an adrp, add, ldr sequence. getOutliningTypeImpl is better suited for preventing individual instructions from being considered as part of candidates but doesn't seem a perfect fit if you want to allow the whole sequence to be outlined but not parts of it.

getOutliningCandidateInfo certainly sounds like a better place to do what @smithp35 suggested above. This is under the assumption that LDR/ADD always follows ADRP, which should generally happen, as suggested above. @davemgreen I am curious under what conditions they won't be scheduled together. I'd like to avoid the correctness issue exposed by linker ICF, so it may be worth avoiding this case as well if possible.

github-actions bot commented Oct 9, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@dtellenbach dtellenbach left a comment

LGTM, thanks!

@pranavk pranavk merged commit 2f54efd into llvm:main Nov 10, 2025
10 checks passed