Skip to content

Conversation

pranavk
Copy link
Contributor

@pranavk pranavk commented Sep 23, 2025

On AArch64, ADRP and its user instructions (LDR, ADD, etc.), that are referencing a GOT symbol, when separated into different functions by machine outliner exposes a correctness issue in the linker ICF. In such cases, user instructions can end up pointing to a folded section (with its canonical folded symbol), while ADRP instruction point to a GOT entry corresponding to the original symbol. This leads to loading from incorrect memory address after ICF. #129122 explains how this can happen in detail.

This addresses #131660 which should fix two things:

  1. Hide the correctness issue described above in the LLVM linker.
  2. Allows optimizations that could relax GOT addressing to PC-relative addressing.

Fixes llvm#131660

Earlier attempts to fix this in the linker were not accepted. Current
attempts is pending at llvm#139493
@llvm llvm deleted a comment from github-actions bot Sep 23, 2025
@pranavk pranavk marked this pull request as ready for review September 23, 2025 22:07
@llvmbot
Copy link
Member

llvmbot commented Sep 23, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Pranav Kant (pranavk)

Changes

Fixes #131660

Earlier attempts to fix this in the linker were not accepted. Current linker attempts is pending at #139493.


Full diff: https://github.com/llvm/llvm-project/pull/160232.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+26-4)
  • (added) llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir (+130)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 5a51c812732e6..8880ca455c1f6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -10179,11 +10179,33 @@ AArch64InstrInfo::getOutliningTypeImpl(const MachineModuleInfo &MMI,
       return outliner::InstrType::Illegal;
   }
 
-  // Special cases for instructions that can always be outlined, but will fail
-  // the later tests. e.g, ADRPs, which are PC-relative use LR, but can always
-  // be outlined because they don't require a *specific* value to be in LR.
-  if (MI.getOpcode() == AArch64::ADRP)
+  // An ADRP instruction referencing a GOT should not be outlined.
+  // This is to avoid splitting ADRP/(LDR/ADD/etc.) pair into different
+  // functions which can lead to linker ICF merging sections incorrectly.
+  if (MI.getOpcode() == AArch64::ADRP) {
+    bool IsPage = (MI.getOperand(1).getTargetFlags() & AArch64II::MO_PAGE) != 0;
+    bool IsGot = (MI.getOperand(1).getTargetFlags() & AArch64II::MO_GOT) != 0;
+    if (IsPage && IsGot)
+      return outliner::InstrType::Illegal;
+
+    // Special cases for instructions that can always be outlined, but will fail
+    // the later tests. e.g, ADRPs, which are PC-relative use LR, but can always
+    // be outlined because they don't require a *specific* value to be in LR.
     return outliner::InstrType::Legal;
+  }
+
+  // Similarly, any user of ADRP instruction referencing a GOT should not be
+  // outlined. It's hard/costly to check exact users of ADRP. So we use check
+  // all operands and reject any that's a page offset and references a GOT.
+  const auto &F = MI.getMF()->getFunction();
+  for (const auto &MO : MI.operands()) {
+    bool IsPageOff = (MO.getTargetFlags() & AArch64II::MO_PAGEOFF) != 0;
+    bool IsGot = (MO.getTargetFlags() & AArch64II::MO_GOT) != 0;
+    if (IsPageOff && IsGot &&
+        (MI.getMF()->getTarget().getFunctionSections() || F.hasComdat() ||
+         F.hasSection() || F.getSectionPrefix()))
+      return outliner::InstrType::Illegal;
+  }
 
   // If MI is a call we might be able to outline it. We don't want to outline
   // any calls that rely on the position of items on the stack. When we outline
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir b/llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir
new file mode 100644
index 0000000000000..169835809d6ba
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-adrp-got-split.mir
@@ -0,0 +1,130 @@
+# RUN: llc -mtriple=aarch64---  -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
+--- |
+
+  @x = common global i32 0, align 4
+
+  define i32 @adrp_add() #0 {
+    ret i32 0
+  }
+
+  define i32 @adrp_ldr() #0 {
+    ret i32 0
+  }
+
+  define void @bar(i32 %a) #0 {
+    ret void
+  }
+
+  attributes #0 = { noinline noredzone }
+...
+---
+# This test ensures that we do not outline ADRP / ADD pair when it's referencing 
+# a GOT entry.
+#
+# CHECK-LABEL: name: adrp_add
+# CHECK-DAG: bb.0:
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+
+# CHECK-DAG: bb.1
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+
+# CHECK-DAG: bb.2
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+name:            adrp_add
+tracksRegLiveness: true
+body:             |
+  bb.0:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+    $lr = ORRXri $xzr, 1
+  bb.1:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+    $lr = ORRXri $xzr, 1
+  bb.2:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-got) @x, 0
+    $lr = ORRXri $xzr, 1
+  bb.3:
+  liveins: $lr
+    RET undef $lr
+...
+---
+# This test ensures that we do not outline ADRP / LDR pair when it's referencing 
+# a GOT entry.
+#
+# CHECK-LABEL: name: adrp_ldr
+# CHECK-DAG: bb.0:
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+
+# CHECK-DAG: bb.1
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+
+# CHECK-DAG: bb.2
+# CHECK: $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+# CHECK: $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+name:            adrp_ldr
+tracksRegLiveness: true
+body:             |
+  bb.0:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+    $lr = ORRXri $xzr, 1
+  bb.1:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+    $lr = ORRXri $xzr, 1
+  bb.2:
+  liveins: $lr
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $w12 = ORRWri $wzr, 1
+    $x9 = ADRP target-flags(aarch64-page, aarch64-got) @x
+    $x12 = LDRXui $x9, target-flags(aarch64-pageoff, aarch64-got) @x
+    $lr = ORRXri $xzr, 1
+  bb.3:
+  liveins: $lr
+    RET undef $lr
\ No newline at end of file

@pranavk pranavk requested review from MaskRay, rnk and smithp35 September 23, 2025 22:08
@fhahn fhahn requested review from aemerson and ornata September 24, 2025 09:25
Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for having a go at this. I'm not an expert on the outliner, ideally we can get someone who is as a reviewer.

I've made some suggestions based on what I know of the AArch64 instruction set.

Please could you add a full description (for the commit message) it will be really useful to see that in place from git log on a terminal.

bool IsPageOff = (MO.getTargetFlags() & AArch64II::MO_PAGEOFF) != 0;
bool IsGot = (MO.getTargetFlags() & AArch64II::MO_GOT) != 0;
if (IsPageOff && IsGot &&
(MI.getMF()->getTarget().getFunctionSections() || F.hasComdat() ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above says:

So we use check all operands and reject any that's a page offset and references a GOT.

This part seems to further constrain to only functions that are, roughly speaking, compiled in their own section.

However the ADRP is not constrained. Can you explain why this is needed? If there is a good reason, I think we should have a test case.

Could we end up with a case where there is an attempt to outline an ADRP, ADD sequence from .text where the ADRP is illegal, but the ADD is not, and this causes problems?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. My code was partially motivated by @lenary's code pointer to RISCV backend here: #131660 (comment)

Maybe I am missing something there but now that I look at it again, I agree that this should just be

if (isPageOff && IsGot) { ... }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional conditions are to detect the cases where the outlined function will be in a separate section to the original function. If you're outlining from a function to create a function in the same section, then you shouldn't need to worry about ICF (which only works with -ffunction-sections, iirc). This is the approach we made on RISC-V, but I note that ADRP does work slightly differently to AUIPC.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional context. I had made the assumption that the outliner could always create a new section when outlining, but if it is in the same section then I agree we won't see the ICF problem.

I think it would be worth a quick comment to explain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The outliner creates new functions (there will be a little LLVM IR function stub for them, because there has to be, but the MachineFunction contents is the important bit), which might inherit section-related metadata, but the section creation still happens as usual in the AsmPrinter, which is at the end of the codegen pipeline.

return outliner::InstrType::Legal;
}

// Similarly, any user of ADRP instruction referencing a GOT should not be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can constrain this to just the opcodes that can follow an ADRP in a GOT sequence. There should be only one ADD and one LDR opcode.

The tiny code-model can use a single pc-relative load to the GOT which could be outlined. This is a different encoding to the base relative ldr used after an ADRP in the small code model.

// tiny
ldr     x0, :got:var
// small
adrp x0, :got:var
ldr x0, [x0 :got_lo12: var]

It is possible that there is a different flag to MO_GOT used though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address this in next comment.

// the later tests. e.g, ADRPs, which are PC-relative use LR, but can always
// be outlined because they don't require a *specific* value to be in LR.
if (MI.getOpcode() == AArch64::ADRP)
// An ADRP instruction referencing a GOT should not be outlined.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, this will prevent a sequence of ADRP, ADD or ADRP, LDR from being outlined as a whole sequence. Do we have any data on whether that significantly affects the effectiveness of the outliner?

I'm not familiar with the outliner code so I'm not sure I can offer a direct alternative. I can see some code that evaluates the RepeatedSequences. Is there an alternative that scans the sequences looking for LDR or ADD instructions with MO_GOT that appear before an ADRP with a MO_PAGE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an alternative that scans the sequences looking for LDR or ADD instructions with MO_GOT that appear before an ADRP with a MO_PAGE.

You mean that appears after an ADRP with MO_PAGE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @davemgreen as this relates to his comment as well.

I wasn't sure ADRP and ADD would always appear together. It seems that's a reasonable assumption as David mentioned we fuse them together. I could std::prev(MIT) and check if ADD/LDR is being followed by an ADRP but not yet sure how to allow outlining of both of them as a whole sequence while preventing them from outlining into separate function. Perhaps this function is not the right place to do this sort of thing and it needs to be done somewhere else. But then how does RISCV code that @lenary pointed out aims to fix this. That RISCV code also aims to prevent separation of LO/HI instructions. It appears that code is also going to prevent outlining of both HI/LO as a whole sequence and is buggy?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they will often be laid out next to one another, so if we need to check that the next instruction is a adrp/add/etc then it should be quick to check. They are not always guaranteed to be next to one another though, in case the check fails. In which case it is probably fine to bail out.

If there are no candidates that ever mark one of the adrp/add individually as outlinable, would it ever be possible to outline a single instruction from them? That might not be the most reliable solution though.

It appears that code is also going to prevent outlining of both HI/LO as a whole sequence and is buggy?

Maybe just a little inefficient? I'm not sure how much is lost from not outlining adrp+adds. If it is very little then marking them always invalid could be fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an alternative that scans the sequences looking for LDR or ADD instructions with MO_GOT that appear before an ADRP with a MO_PAGE.

You mean that appears after an ADRP with MO_PAGE?

I'm thinking of the case where there are candidates for outlining, like in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp#L9550 that are scanned for validity.

If we have in an outline candidate:
add x0, x0, :got_lo12: sym or ldr x0, [x0, :got_lo12:sym] but no preceeding adrp x0 :got: sym in the outline candidate, then we know that an adrp, add or adrp, ldr sequence has been split up as the add x0, x0, :got_lo12: sym and ldr x0, [x0, :got_lo12:sym] don't make any sense on their own.

I agree with @davemgreen that if not much is lost from outlining adrp (add/ldr) then it is fine to keep this simple. I don't have a representative set of programs to know that.

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. The idea sounds OK to me. Would it be valid to outline so long as we always outlined both instructions (adrp+add, adrp+ldr) together? We fuse the adrp+add so they should generally be scheduled next to one another.

@smithp35
Copy link
Collaborator

Hi. The idea sounds OK to me. Would it be valid to outline so long as we always outlined both instructions (adrp+add, adrp+ldr) together? We fuse the adrp+add so they should generally be scheduled next to one another.

Yes, as long as the adrp+add, adrp+ldr are in the same section, we don't get the problem, as ICF will remove both or none of them. The comment in #129122 (comment) has a good summary of the chain of events that lead to the problem.

@lenary
Copy link
Member

lenary commented Oct 1, 2025

I have been wondering on the RISC-V side if we can add back our ADRP-like instruction to the outlined segment if we find we have only outlined the instruction with the lo operand. This would potentially add overhead to sequences where only the lo instruction has been outlined, and might cause redundant ADRP-likes, but I think it might allow more outlining? I think this should be doable in buildOutlinedFrame, but additional work would also have to be done in getOutliningCandidateInfo to correctly set the FrameOverhead.

I don't know how this approach would cope if the outliner tries to only outline the ADRP-like, rather than the lo instruction. You'd presumably want to replicate the ADRPs from inside the outlined part to immediately after the call to the outlined sequence? That sounds like it might be too much overhead, but the cost model might catch that.

@dtellenbach
Copy link
Member

dtellenbach commented Oct 4, 2025

I'm wondering if it would make sense to do this in getOutliningCandidateInfo and just remove the candidates that split up a adrp, add, ldr sequence. getOutliningTypeImpl is better suited for preventing individual instructions to be considered as part of candidates but doesn't seem a perfect fit if you want to allow the whole sequence to be outlined but not parts of it.

For detecting that a sequence has been split, you can do what @smithp35 mentioned earlier.

If we have in an outline candidate:
add x0, x0, :got_lo12: sym or ldr x0, [x0, :got_lo12:sym] but no preceeding adrp x0 :got: sym in the outline candidate, then we know that an adrp, add or adrp, ldr sequence has been split up as the add x0, x0, :got_lo12: sym and ldr x0, [x0, :got_lo12:sym] don't make any sense on their own.

@dtellenbach dtellenbach self-requested a review October 4, 2025 21:18
@pranavk
Copy link
Contributor Author

pranavk commented Oct 9, 2025

I'm wondering if it would make sense to do this in getOutliningCandidateInfo and just remove the candidates that split up a adrp, add, ldr sequence. getOutliningTypeImpl is better suited for preventing individual instructions to be considered as part of candidates but doesn't seem a perfect fit if you want to allow the whole sequence to be outlined but not parts of it.

getOutliningCandidateInfo certainly sounds like a better place to do what @smithp35 suggested above. This is under the assumption that LDR/ADD always follow ADRP which should generally happen as suggested above. @davemgreen I am curious under what conditions they won't be scheduled together. I'd like to avoid the correctness issue exposed by linker ICF. So it may be worth it to avoid this as well if possible.

Copy link

github-actions bot commented Oct 9, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -0,0 +1,133 @@
# RUN: llc -mtriple=aarch64--- -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
Copy link
Member

@dtellenbach dtellenbach Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for the whole adrp/add/ldr sequence?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants