[ELF] -r: Synthesize R_RISCV_ALIGN at input section start #151639

@MaskRay MaskRay commented Aug 1, 2025

Without linker relaxation enabled for a particular relocatable file or
section (e.g., using .option norelax), the assembler will not generate
R_RISCV_ALIGN relocations for alignment directives. This becomes
problematic in a two-stage linking process:

ld -r a.o b.o -o ab.o
// b.o is norelax. Its alignment information is lost in ab.o.
ld ab.o -o ab

When ab.o is linked into an executable, the preceding relaxed section
(a.o's content) might shrink. Since there's no R_RISCV_ALIGN relocation
in b.o for the linker to act upon, the .word 0x3a393837 data in b.o
may end up unaligned in the final executable.
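
As a rough illustration of the failure mode (a sketch with made-up sizes, not lld code): suppose a.o contributes an 8-byte call sequence that the final link relaxes to a 4-byte jal, and `ld -r` placed b.o's 8-byte-aligned content right after it. The helper names `offsetAfterRelax` and `isAligned` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of lld: where content lands in the final
// link after `shrink` bytes are deleted from the relaxed code preceding it.
// With no R_RISCV_ALIGN relocation there is no padding to adjust, so the
// content simply moves down by `shrink`.
uint64_t offsetAfterRelax(uint64_t offsetInRelocatable, uint64_t shrink) {
  assert(shrink <= offsetInRelocatable);
  return offsetInRelocatable - shrink;
}

// True if `offset` is a multiple of the power-of-two `align`.
bool isAligned(uint64_t offset, uint64_t align) {
  return (offset & (align - 1)) == 0;
}
```

With b.o's content at offset 8 in ab.o, `offsetAfterRelax(8, 4)` yields 4, and `isAligned(4, 8)` is false: the alignment requested by `.balign 8` is silently lost.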

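To address the issue, this patch inserts NOP bytes and synthesizes an
R_RISCV_ALIGN relocation at the beginning of a text section when the
section alignment is >= 4, an earlier input section in the same output
section has R_RISCV_RELAX relocations, and the section does not already
start with an R_RISCV_ALIGN relocation. The padding size and the
relocation addend are both alignment - 2 bytes.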
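
The layout rule can be sketched as a small standalone simulation. This is a simplification under stated assumptions, not the patch itself: `Sec`, `Layout`, and `assignOffsets` are hypothetical names, and offsets are recorded relative to the output section, whereas the patch records them relative to the first section that has RELAX relocations.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for an input section; not lld's InputSection.
struct Sec {
  uint64_t align;       // section alignment (power of two)
  uint64_t size;        // section size in bytes
  bool hasRelax;        // contains an R_RISCV_RELAX relocation
  bool startsWithAlign; // already has R_RISCV_ALIGN at offset 0
};

struct Layout {
  std::vector<uint64_t> offsets;                     // start of each section
  std::vector<std::pair<uint64_t, uint64_t>> aligns; // (offset, addend)
};

Layout assignOffsets(const std::vector<Sec> &secs) {
  Layout out;
  uint64_t dot = 0;
  bool seenRelax = false; // plays the role of `baseSec != nullptr`
  for (const Sec &s : secs) {
    assert(s.align != 0 && (s.align & (s.align - 1)) == 0);
    if (seenRelax && s.align >= 4 && !s.startsWithAlign) {
      // Reserve align - 2 bytes of NOPs and remember an ALIGN relocation.
      out.aligns.emplace_back(dot, s.align - 2);
      dot += s.align - 2;
    } else {
      // Default handling: round `dot` up to the section alignment.
      if (!seenRelax)
        seenRelax = s.hasRelax;
      dot = (dot + s.align - 1) & ~(s.align - 1);
    }
    out.offsets.push_back(dot);
    dot += s.size;
  }
  return out;
}
```

Feeding it sizes and alignments read off the expected disassembly of out.ro in the test (bc.o bc.o ac.o bc.o b1c.o cc.o dc.o) places the sections at 0x0, 0x8, 0xc, 0x1a, 0x20, 0x2c, 0x36 and records synthesized ALIGN relocations at 0x14 (addend 6) and 0x34 (addend 2).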

For simplicity, even when RVC is disabled, we still synthesize an ALIGN
relocation (addend: 2) for a 4-byte aligned section; the final link then
trims the excess 2 bytes.
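
For readers unfamiliar with how the final link consumes an ALIGN relocation, here is a minimal sketch of the trimming arithmetic as I understand it: the alignment is the smallest power of two that is at least addend + 2, and every NOP byte past the first aligned address is removed. `powerOf2Ceil` and `alignBytesToRemove` are hypothetical names, not lld functions.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two >= v, for v > 0.
uint64_t powerOf2Ceil(uint64_t v) {
  uint64_t p = 1;
  while (p < v)
    p <<= 1;
  return p;
}

// Number of NOP bytes removed at an R_RISCV_ALIGN relocation located at
// address `loc` with the given addend (the size of the NOP run).
uint64_t alignBytesToRemove(uint64_t loc, uint64_t addend) {
  const uint64_t align = powerOf2Ceil(addend + 2);
  const uint64_t nextLoc = loc + addend; // end of the reserved NOP run
  const uint64_t aligned = (loc + align - 1) & ~(align - 1);
  assert(aligned <= nextLoc && "not enough NOP bytes reserved");
  return nextLoc - aligned;
}
```

This agrees with the CHECK1 expectations in the test: at 0x10004 with addend 6, 2 bytes are removed and one 4-byte NOP remains, so b0 starts at 0x10008; at 0x1000c with addend 2, both bytes are removed and d0 starts at 0x1000c.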

See also https://sourceware.org/bugzilla/show_bug.cgi?id=33236

Created using spr 1.3.5-bogner
llvmbot commented Aug 1, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Fangrui Song (MaskRay)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/151639.diff

6 Files Affected:

  • (modified) lld/ELF/Arch/RISCV.cpp (+122)
  • (modified) lld/ELF/LinkerScript.cpp (+11-1)
  • (modified) lld/ELF/OutputSections.cpp (+8-3)
  • (modified) lld/ELF/Target.h (+3)
  • (modified) lld/ELF/Writer.cpp (+2)
  • (added) lld/test/ELF/riscv-relocatable-align.s (+130)
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index 72d83159ad8ac..cea6e31ee15dd 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -45,7 +45,18 @@ class RISCV final : public TargetInfo {
                 uint64_t val) const override;
   void relocateAlloc(InputSectionBase &sec, uint8_t *buf) const override;
   bool relaxOnce(int pass) const override;
+  template <class ELFT, class RelTy>
+  bool synthesizeAlignOne(uint64_t &dot, InputSection *sec, Relocs<RelTy> rels);
+  template <class ELFT, class RelTy>
+  void synthesizeAlignEnd(uint64_t &dot, InputSection *sec, Relocs<RelTy> rels);
+  template <class ELFT>
+  bool synthesizeAlignAux(uint64_t &dot, InputSection *sec);
+  bool maybeSynthesizeAlign(uint64_t &dot, InputSection *sec) override;
   void finalizeRelax(int passes) const override;
+
+  // Used by synthesized ALIGN relocations.
+  InputSection *baseSec = nullptr;
+  SmallVector<std::pair<uint64_t, uint64_t>, 0> synthesizedAligns;
 };
 
 } // end anonymous namespace
@@ -959,10 +970,121 @@ bool RISCV::relaxOnce(int pass) const {
   return changed;
 }
 
+// If the section alignment is >= 4, advance `dot` to insert NOPs and synthesize
+// an ALIGN relocation. Otherwise, return false to use default handling.
+template <class ELFT, class RelTy>
+bool RISCV::synthesizeAlignOne(uint64_t &dot, InputSection *sec,
+                               Relocs<RelTy> rels) {
+  if (!baseSec) {
+    // Record the first section with RELAX relocations.
+    for (auto rel : rels) {
+      if (rel.getType(false) == R_RISCV_RELAX) {
+        baseSec = sec;
+        break;
+      }
+    }
+  } else if (sec->addralign >= 4) {
+    // If the alignment is >= 4 and the section does not start with an ALIGN
+    // relocation, synthesize one.
+    bool alignRel = false;
+    for (auto rel : rels)
+      if (rel.r_offset == 0 && rel.getType(false) == R_RISCV_ALIGN)
+        alignRel = true;
+    if (!alignRel) {
+      synthesizedAligns.emplace_back(dot - baseSec->getVA(),
+                                     sec->addralign - 2);
+      dot += sec->addralign - 2;
+      return true;
+    }
+  }
+  return false;
+}
+
+// Finalize the relocation section by appending synthesized ALIGN relocations
+// after processing all input sections.
+template <class ELFT, class RelTy>
+void RISCV::synthesizeAlignEnd(uint64_t &dot, InputSection *sec,
+                               Relocs<RelTy> rels) {
+  auto *f = cast<ObjFile<ELFT>>(baseSec->file);
+  auto shdr = f->template getELFShdrs<ELFT>()[baseSec->relSecIdx];
+  // Create a copy of InputSection.
+  sec = make<InputSection>(*f, shdr, baseSec->name);
+  auto *baseRelSec = cast<InputSection>(f->getSections()[baseSec->relSecIdx]);
+  *sec = *baseRelSec;
+  baseSec = nullptr;
+
+  // Allocate buffer for original and synthesized relocations in RELA format.
+  // If CREL is used, OutputSection::finalizeNonAllocCrel will convert RELA to
+  // CREL.
+  auto newSize = rels.size() + synthesizedAligns.size();
+  auto *relas = makeThreadLocalN<typename ELFT::Rela>(newSize);
+  sec->size = newSize * sizeof(typename ELFT::Rela);
+  sec->content_ = reinterpret_cast<uint8_t *>(relas);
+  sec->type = SHT_RELA;
+  // Copy original relocations to the new buffer, potentially converting CREL to
+  // RELA.
+  for (auto [i, r] : llvm::enumerate(rels)) {
+    relas[i].r_offset = r.r_offset;
+    relas[i].setSymbolAndType(r.getSymbol(0), r.getType(0), false);
+    if constexpr (RelTy::HasAddend)
+      relas[i].r_addend = r.r_addend;
+  }
+  // Append synthesized ALIGN relocations to the buffer.
+  for (auto [i, r] : llvm::enumerate(synthesizedAligns)) {
+    auto &rela = relas[rels.size() + i];
+    rela.r_offset = r.first;
+    rela.setSymbolAndType(0, R_RISCV_ALIGN, false);
+    rela.r_addend = r.second;
+  }
+  // Replace the old relocation section with the new one in the output section.
+  // addOrphanSections ensures that the output relocation section is processed
+  // after osec.
+  for (SectionCommand *cmd : sec->getParent()->commands) {
+    auto *isd = dyn_cast<InputSectionDescription>(cmd);
+    if (!isd)
+      continue;
+    for (auto *&isec : isd->sections)
+      if (isec == baseRelSec)
+        isec = sec;
+  }
+}
+
+template <class ELFT>
+bool RISCV::synthesizeAlignAux(uint64_t &dot, InputSection *sec) {
+  bool ret = false;
+  if (sec) {
+    invokeOnRelocs(*sec, ret = synthesizeAlignOne<ELFT>, dot, sec);
+  } else if (baseSec) {
+    invokeOnRelocs(*baseSec, synthesizeAlignEnd<ELFT>, dot, sec);
+  }
+  return ret;
+}
+
+// Without linker relaxation enabled for a particular relocatable file or
+// section, the assembler will not generate R_RISCV_ALIGN relocations for
+// alignment directives. This becomes problematic in a two-stage linking
+// process: ld -r a.o b.o -o ab.o; ld ab.o -o ab. This function synthesizes an
+// R_RISCV_ALIGN relocation at section start when needed.
+//
+// When called with an input section (`sec` is not null): If the section
+// alignment is >= 4, advance `dot` to insert NOPs and synthesize an ALIGN
+// relocation.
+//
+// When called after all input sections are processed (`sec` is null): The
+// output relocation section is updated with all the newly synthesized ALIGN
+// relocations.
+bool RISCV::maybeSynthesizeAlign(uint64_t &dot, InputSection *sec) {
+  assert(ctx.arg.relocatable);
+  if (ctx.arg.is64)
+    return synthesizeAlignAux<ELF64LE>(dot, sec);
+  return synthesizeAlignAux<ELF32LE>(dot, sec);
+}
+
 void RISCV::finalizeRelax(int passes) const {
   llvm::TimeTraceScope timeScope("Finalize RISC-V relaxation");
   Log(ctx) << "relaxation passes: " << passes;
   SmallVector<InputSection *, 0> storage;
+
   for (OutputSection *osec : ctx.outputSections) {
     if (!(osec->flags & SHF_EXECINSTR))
       continue;
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index a5d08f4979dab..95830aacf45ce 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -1230,6 +1230,9 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
   if (sec->firstInOverlay)
     state->overlaySize = 0;
 
+  bool synthesizeAlign =
+      (sec->flags & SHF_EXECINSTR) && ctx.arg.relocatable && ctx.arg.relax &&
+      is_contained({EM_RISCV, EM_LOONGARCH}, ctx.arg.emachine);
   // We visited SectionsCommands from processSectionCommands to
   // layout sections. Now, we visit SectionsCommands again to fix
   // section offsets.
@@ -1260,7 +1263,8 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
       if (isa<PotentialSpillSection>(isec))
         continue;
       const uint64_t pos = dot;
-      dot = alignToPowerOf2(dot, isec->addralign);
+      if (!(synthesizeAlign && ctx.target->maybeSynthesizeAlign(dot, isec)))
+        dot = alignToPowerOf2(dot, isec->addralign);
       isec->outSecOff = dot - sec->addr;
       dot += isec->getSize();
 
@@ -1276,6 +1280,12 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
   if (ctx.in.relroPadding && sec == ctx.in.relroPadding->getParent())
     expandOutputSection(alignToPowerOf2(dot, ctx.arg.commonPageSize) - dot);
 
+  if (synthesizeAlign) {
+    const uint64_t pos = dot;
+    ctx.target->maybeSynthesizeAlign(dot, nullptr);
+    expandOutputSection(dot - pos);
+  }
+
   // Non-SHF_ALLOC sections do not affect the addresses of other OutputSections
   // as they are not part of the process image.
   if (!(sec->flags & SHF_ALLOC)) {
diff --git a/lld/ELF/OutputSections.cpp b/lld/ELF/OutputSections.cpp
index 1020dd9f2569e..1ce4f2fd3b3f6 100644
--- a/lld/ELF/OutputSections.cpp
+++ b/lld/ELF/OutputSections.cpp
@@ -889,9 +889,14 @@ void OutputSection::sortInitFini() {
 std::array<uint8_t, 4> OutputSection::getFiller(Ctx &ctx) {
   if (filler)
     return *filler;
-  if (flags & SHF_EXECINSTR)
-    return ctx.target->trapInstr;
-  return {0, 0, 0, 0};
+  if (!(flags & SHF_EXECINSTR))
+    return {0, 0, 0, 0};
+  if (ctx.arg.relocatable && ctx.arg.emachine == EM_RISCV) {
+    if (ctx.arg.eflags & EF_RISCV_RVC)
+      return {1, 0, 1, 0};
+    return {0x13, 0, 0, 0};
+  }
+  return ctx.target->trapInstr;
 }
 
 void OutputSection::checkDynRelAddends(Ctx &ctx) {
diff --git a/lld/ELF/Target.h b/lld/ELF/Target.h
index fdc0c20f9cd02..1be2f04b5a726 100644
--- a/lld/ELF/Target.h
+++ b/lld/ELF/Target.h
@@ -96,6 +96,9 @@ class TargetInfo {
 
   // Do a linker relaxation pass and return true if we changed something.
   virtual bool relaxOnce(int pass) const { return false; }
+  virtual bool maybeSynthesizeAlign(uint64_t &dot, InputSection *sec) {
+    return false;
+  }
   // Do finalize relaxation after collecting relaxation infos.
   virtual void finalizeRelax(int passes) const {}
 
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 2b0e097766d2c..fdacc54282c2c 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -1543,6 +1543,8 @@ template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
 
   uint32_t pass = 0, assignPasses = 0;
   for (;;) {
+    if (ctx.arg.relocatable)
+      break;
     bool changed = ctx.target->needsThunks
                        ? tc.createThunks(pass, ctx.outputSections)
                        : ctx.target->relaxOnce(pass);
diff --git a/lld/test/ELF/riscv-relocatable-align.s b/lld/test/ELF/riscv-relocatable-align.s
new file mode 100644
index 0000000000000..9a782ed47850a
--- /dev/null
+++ b/lld/test/ELF/riscv-relocatable-align.s
@@ -0,0 +1,130 @@
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax a.s -o ac.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax b.s -o bc.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax b1.s -o b1c.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax c.s -o cc.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c d.s -o dc.o
+
+## No RELAX. Don't synthesize ALIGN.
+# RUN: ld.lld -r bc.o dc.o -o bd.ro
+# RUN: llvm-readelf -r bd.ro | FileCheck %s --check-prefix=NOREL
+
+# NOREL: no relocations
+
+# RUN: ld.lld -r bc.o bc.o ac.o bc.o b1c.o cc.o dc.o -o out.ro
+# RUN: llvm-objdump -dr -M no-aliases out.ro | FileCheck %s
+
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax b.s -o b.o
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax d.s -o d.o
+# RUN: ld.lld -r a.o b.o d.o -o out0.ro
+# RUN: ld.lld -Ttext=0x10000 out0.ro -o out0
+# RUN: llvm-objdump -dr -M no-aliases out0 | FileCheck %s --check-prefix=CHECK1
+
+# CHECK:      <b0>:
+# CHECK-NEXT:   0: 00158513             addi    a0, a1, 0x1
+# CHECK-NEXT:   4: 0001                 c.nop
+# CHECK-NEXT:   6: 0001                 c.nop
+# CHECK-EMPTY:
+# CHECK-NEXT: <b0>:
+# CHECK-NEXT:   8: 00158513             addi    a0, a1, 0x1
+# CHECK-EMPTY:
+# CHECK-NEXT: <_start>:
+# CHECK-NEXT:   c: 00000097             auipc   ra, 0x0
+# CHECK-NEXT:           000000000000000c:  R_RISCV_CALL_PLT     foo
+# CHECK-NEXT:           000000000000000c:  R_RISCV_RELAX        *ABS*
+# CHECK-NEXT:  10: 000080e7             jalr    ra, 0x0(ra) <_start>
+# CHECK-NEXT:  14: 0001                 c.nop
+# CHECK-NEXT:           0000000000000014:  R_RISCV_ALIGN        *ABS*+0x6
+# CHECK-NEXT:  16: 0001                 c.nop
+# CHECK-NEXT:  18: 0001                 c.nop
+# CHECK-EMPTY:
+# CHECK-NEXT: <b0>:
+# CHECK-NEXT:  1a: 00158513             addi    a0, a1, 0x1
+# CHECK-NEXT:  1e: 0001                 c.nop
+# CHECK-NEXT:  20: 0001                 c.nop
+# CHECK-NEXT:           0000000000000020:  R_RISCV_ALIGN        *ABS*+0x6
+# CHECK-NEXT:  22: 0001                 c.nop
+# CHECK-NEXT:  24: 00000013             addi    zero, zero, 0x0
+# CHECK-EMPTY:
+# CHECK-NEXT: <b0>:
+# CHECK-NEXT:  28: 00158513             addi    a0, a1, 0x1
+# CHECK-EMPTY:
+# CHECK-NEXT: <c0>:
+# CHECK-NEXT:  2c: 00000097             auipc   ra, 0x0
+# CHECK-NEXT:           000000000000002c:  R_RISCV_CALL_PLT     foo
+# CHECK-NEXT:           000000000000002c:  R_RISCV_RELAX        *ABS*
+# CHECK-NEXT:  30: 000080e7             jalr    ra, 0x0(ra) <c0>
+# CHECK-NEXT:  34: 0001                 c.nop
+# CHECK-NEXT:           0000000000000034:  R_RISCV_ALIGN        *ABS*+0x2
+# CHECK-EMPTY:
+# CHECK-NEXT: <d0>:
+# CHECK-NEXT:  36: 00258513             addi    a0, a1, 0x2
+
+# CHECK1:      <_start>:
+# CHECK1-NEXT:    010000ef      jal     ra, 0x10010 <foo>
+# CHECK1-NEXT:    00000013      addi zero, zero, 0x0
+# CHECK1-EMPTY:
+# CHECK1-NEXT: <b0>:
+# CHECK1-NEXT:    00158513      addi    a0, a1, 0x1
+# CHECK1-EMPTY:
+# CHECK1-NEXT: <d0>:
+# CHECK1-NEXT:    00258513      addi    a0, a1, 0x2
+
+## Test CREL.
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax --crel a.s -o acrel.o
+# RUN: ld.lld -r acrel.o bc.o -o out1.ro
+# RUN: llvm-objdump -dr -M no-aliases out1.ro | FileCheck %s --check-prefix=CHECK2
+
+# CHECK2:      <_start>:
+# CHECK2-NEXT:   0: 00000097             auipc   ra, 0x0
+# CHECK2-NEXT:           0000000000000000:  R_RISCV_CALL_PLT     foo
+# CHECK2-NEXT:           0000000000000000:  R_RISCV_RELAX        *ABS*
+# CHECK2-NEXT:   4: 000080e7             jalr    ra, 0x0(ra) <_start>
+# CHECK2-NEXT:   8: 0001                 c.nop
+# CHECK2-NEXT:           0000000000000008:  R_RISCV_ALIGN        *ABS*+0x6
+# CHECK2-NEXT:   a: 0001                 c.nop
+# CHECK2-NEXT:   c: 0001                 c.nop
+# CHECK2-EMPTY:
+# CHECK2-NEXT: <b0>:
+# CHECK2-NEXT:   e: 00158513             addi    a0, a1, 0x1
+
+#--- a.s
+.globl _start
+_start:
+  call foo
+
+.section .text1,"ax"
+.globl foo
+foo:
+
+#--- b.s
+## Needs synthesized ALIGN
+.option push
+.option norelax
+.balign 8
+b0:
+  addi a0, a1, 1
+.option pop
+
+#--- b1.s
+.option push
+.option norelax
+  .reloc ., R_RISCV_ALIGN, 6
+  addi x0, x0, 0
+  c.nop
+.balign 8
+b0:
+  addi a0, a1, 1
+.option pop
+
+#--- c.s
+.balign 2
+c0:
+  call foo
+
+#--- d.s
+## Needs synthesized ALIGN
+.balign 4
+d0:
+  addi a0, a1, 2
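
A note on the filler bytes chosen in `OutputSection::getFiller` above: read as little-endian, `{0x13, 0, 0, 0}` is the 4-byte NOP (`addi zero, zero, 0x0`, shown as `00000013` in the test), and `{1, 0, 1, 0}` is two 2-byte `c.nop` instructions (`0001`). A small sketch to check this, with hypothetical helper names:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Read a 32-bit little-endian value from a 4-byte filler pattern.
uint32_t readLE32(const std::array<uint8_t, 4> &b) {
  return uint32_t(b[0]) | uint32_t(b[1]) << 8 | uint32_t(b[2]) << 16 |
         uint32_t(b[3]) << 24;
}

// Read the 16-bit little-endian value starting at byte `i` (0 or 2).
uint16_t readLE16(const std::array<uint8_t, 4> &b, unsigned i) {
  assert(i + 1 < b.size());
  return uint16_t(b[i] | b[i + 1] << 8);
}
```

`readLE32` on the non-RVC filler gives 0x00000013, and `readLE16` on either half of the RVC filler gives 0x0001.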

llvmbot commented Aug 1, 2025

@llvm/pr-subscribers-lld

@llvmbot
Member

llvmbot commented Aug 1, 2025

@llvm/pr-subscribers-lld-elf

Author: Fangrui Song (MaskRay)

Changes

Without linker relaxation enabled for a particular relocatable file or
section (e.g., using .option norelax), the assembler will not generate
R_RISCV_ALIGN relocations for alignment directives. This becomes
problematic in a two-stage linking process:

ld -r a.o b.o -o ab.o
// b.o is norelax. Its alignment information is lost in ab.o.
ld ab.o -o ab

When ab.o is linked into an executable, the preceding relaxed section
(a.o's content) might shrink. Since there's no R_RISCV_ALIGN relocation
in b.o for the linker to act upon, the .word 0x3a393837 data in b.o
may end up unaligned in the final executable.

To address the issue, this patch inserts NOP bytes and synthesizes an
R_RISCV_ALIGN relocation at the beginning of a text section when the
alignment >= 4.

For simplicity, when RVC is disabled, we synthesize an ALIGN relocation
(addend: 2) for a 4-byte aligned section, allowing the linker to trim
the excess 2 bytes.

See also https://sourceware.org/bugzilla/show_bug.cgi?id=33236
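
The arithmetic described above can be sketched outside of lld. This is a minimal illustration rather than the patch itself: `SynthesizedAlign`, `synthesize`, `advanceDot`, and `nopBytesKept` are hypothetical names, and `nopBytesKept` assumes the final link keeps only the NOP bytes needed to reach the section alignment. The numbers in the `static_assert`s come from the test expectations in this PR (the final-link addresses assume `_start` lands at 0x10000, as implied by `-Ttext=0x10000`).

```cpp
#include <cstdint>

// Hypothetical record for one synthesized relocation (not an lld type).
struct SynthesizedAlign {
  uint64_t offset; // relative to the start of the base section
  uint64_t addend; // number of NOP bytes reserved at that offset
};

// During `ld -r`: reserve (addralign - 2) bytes of NOPs before the section.
constexpr SynthesizedAlign synthesize(uint64_t dot, uint64_t baseVA,
                                      uint64_t addralign) {
  return {dot - baseVA, addralign - 2};
}

// The section itself starts right after the reserved NOP bytes.
constexpr uint64_t advanceDot(uint64_t dot, uint64_t addralign) {
  return dot + (addralign - 2);
}

constexpr uint64_t alignUp(uint64_t value, uint64_t align) {
  return (value + align - 1) & ~(align - 1);
}

// At the final link (assumed behavior): keep just enough NOP bytes so that
// the code following the padding is aligned to `addralign`.
constexpr uint64_t nopBytesKept(uint64_t loc, uint64_t addralign) {
  return alignUp(loc, addralign) - loc;
}

// 8-byte aligned section placed at 0x14, base section at 0xc (out.ro).
static_assert(synthesize(0x14, 0xc, 8).offset == 8, "offset is base-relative");
static_assert(synthesize(0x14, 0xc, 8).addend == 6, "addend is align - 2");
static_assert(advanceDot(0x14, 8) == 0x1a, "section starts after the NOPs");
// 4-byte aligned section placed at 0x34 (out.ro).
static_assert(advanceDot(0x34, 4) == 0x36, "2 bytes of padding");
// Final link: the call relaxed to a 4-byte jal, padding begins at 0x10004.
static_assert(nopBytesKept(0x10004, 8) == 4, "one 4-byte NOP survives");
static_assert(nopBytesKept(0x1000c, 4) == 0, "already aligned, no NOP");
```

The relocatable output deliberately over-reserves padding; the real alignment decision is deferred until the final link, when the sizes of the preceding relaxed sections are known.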


Full diff: https://github.com/llvm/llvm-project/pull/151639.diff

6 Files Affected:

  • (modified) lld/ELF/Arch/RISCV.cpp (+122)
  • (modified) lld/ELF/LinkerScript.cpp (+11-1)
  • (modified) lld/ELF/OutputSections.cpp (+8-3)
  • (modified) lld/ELF/Target.h (+3)
  • (modified) lld/ELF/Writer.cpp (+2)
  • (added) lld/test/ELF/riscv-relocatable-align.s (+130)
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index 72d83159ad8ac..cea6e31ee15dd 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -45,7 +45,18 @@ class RISCV final : public TargetInfo {
                 uint64_t val) const override;
   void relocateAlloc(InputSectionBase &sec, uint8_t *buf) const override;
   bool relaxOnce(int pass) const override;
+  template <class ELFT, class RelTy>
+  bool synthesizeAlignOne(uint64_t &dot, InputSection *sec, Relocs<RelTy> rels);
+  template <class ELFT, class RelTy>
+  void synthesizeAlignEnd(uint64_t &dot, InputSection *sec, Relocs<RelTy> rels);
+  template <class ELFT>
+  bool synthesizeAlignAux(uint64_t &dot, InputSection *sec);
+  bool maybeSynthesizeAlign(uint64_t &dot, InputSection *sec) override;
   void finalizeRelax(int passes) const override;
+
+  // Used by synthesized ALIGN relocations.
+  InputSection *baseSec = nullptr;
+  SmallVector<std::pair<uint64_t, uint64_t>, 0> synthesizedAligns;
 };
 
 } // end anonymous namespace
@@ -959,10 +970,121 @@ bool RISCV::relaxOnce(int pass) const {
   return changed;
 }
 
+// If the section alignment is >= 4, advance `dot` to insert NOPs and synthesize
+// an ALIGN relocation. Otherwise, return false to use default handling.
+template <class ELFT, class RelTy>
+bool RISCV::synthesizeAlignOne(uint64_t &dot, InputSection *sec,
+                               Relocs<RelTy> rels) {
+  if (!baseSec) {
+    // Record the first section with RELAX relocations.
+    for (auto rel : rels) {
+      if (rel.getType(false) == R_RISCV_RELAX) {
+        baseSec = sec;
+        break;
+      }
+    }
+  } else if (sec->addralign >= 4) {
+    // If the alignment is >= 4 and the section does not start with an ALIGN
+    // relocation, synthesize one.
+    bool alignRel = false;
+    for (auto rel : rels)
+      if (rel.r_offset == 0 && rel.getType(false) == R_RISCV_ALIGN)
+        alignRel = true;
+    if (!alignRel) {
+      synthesizedAligns.emplace_back(dot - baseSec->getVA(),
+                                     sec->addralign - 2);
+      dot += sec->addralign - 2;
+      return true;
+    }
+  }
+  return false;
+}
+
+// Finalize the relocation section by appending synthesized ALIGN relocations
+// after processing all input sections.
+template <class ELFT, class RelTy>
+void RISCV::synthesizeAlignEnd(uint64_t &dot, InputSection *sec,
+                               Relocs<RelTy> rels) {
+  auto *f = cast<ObjFile<ELFT>>(baseSec->file);
+  auto shdr = f->template getELFShdrs<ELFT>()[baseSec->relSecIdx];
+  // Create a copy of InputSection.
+  sec = make<InputSection>(*f, shdr, baseSec->name);
+  auto *baseRelSec = cast<InputSection>(f->getSections()[baseSec->relSecIdx]);
+  *sec = *baseRelSec;
+  baseSec = nullptr;
+
+  // Allocate buffer for original and synthesized relocations in RELA format.
+  // If CREL is used, OutputSection::finalizeNonAllocCrel will convert RELA to
+  // CREL.
+  auto newSize = rels.size() + synthesizedAligns.size();
+  auto *relas = makeThreadLocalN<typename ELFT::Rela>(newSize);
+  sec->size = newSize * sizeof(typename ELFT::Rela);
+  sec->content_ = reinterpret_cast<uint8_t *>(relas);
+  sec->type = SHT_RELA;
+  // Copy original relocations to the new buffer, potentially converting CREL to
+  // RELA.
+  for (auto [i, r] : llvm::enumerate(rels)) {
+    relas[i].r_offset = r.r_offset;
+    relas[i].setSymbolAndType(r.getSymbol(0), r.getType(0), false);
+    if constexpr (RelTy::HasAddend)
+      relas[i].r_addend = r.r_addend;
+  }
+  // Append synthesized ALIGN relocations to the buffer.
+  for (auto [i, r] : llvm::enumerate(synthesizedAligns)) {
+    auto &rela = relas[rels.size() + i];
+    rela.r_offset = r.first;
+    rela.setSymbolAndType(0, R_RISCV_ALIGN, false);
+    rela.r_addend = r.second;
+  }
+  // Replace the old relocation section with the new one in the output section.
+  // addOrphanSections ensures that the output relocation section is processed
+  // after osec.
+  for (SectionCommand *cmd : sec->getParent()->commands) {
+    auto *isd = dyn_cast<InputSectionDescription>(cmd);
+    if (!isd)
+      continue;
+    for (auto *&isec : isd->sections)
+      if (isec == baseRelSec)
+        isec = sec;
+  }
+}
+
+template <class ELFT>
+bool RISCV::synthesizeAlignAux(uint64_t &dot, InputSection *sec) {
+  bool ret = false;
+  if (sec) {
+    invokeOnRelocs(*sec, ret = synthesizeAlignOne<ELFT>, dot, sec);
+  } else if (baseSec) {
+    invokeOnRelocs(*baseSec, synthesizeAlignEnd<ELFT>, dot, sec);
+  }
+  return ret;
+}
+
+// Without linker relaxation enabled for a particular relocatable file or
+// section, the assembler will not generate R_RISCV_ALIGN relocations for
+// alignment directives. This becomes problematic in a two-stage linking
+// process: ld -r a.o b.o -o ab.o; ld ab.o -o ab. This function synthesizes an
+// R_RISCV_ALIGN relocation at section start when needed.
+//
+// When called with an input section (`sec` is not null): If the section
+// alignment is >= 4, advance `dot` to insert NOPs and synthesize an ALIGN
+// relocation.
+//
+// When called after all input sections are processed (`sec` is null): The
+// output relocation section is updated with all the newly synthesized ALIGN
+// relocations.
+bool RISCV::maybeSynthesizeAlign(uint64_t &dot, InputSection *sec) {
+  assert(ctx.arg.relocatable);
+  if (ctx.arg.is64)
+    return synthesizeAlignAux<ELF64LE>(dot, sec);
+  return synthesizeAlignAux<ELF32LE>(dot, sec);
+}
+
 void RISCV::finalizeRelax(int passes) const {
   llvm::TimeTraceScope timeScope("Finalize RISC-V relaxation");
   Log(ctx) << "relaxation passes: " << passes;
   SmallVector<InputSection *, 0> storage;
+
   for (OutputSection *osec : ctx.outputSections) {
     if (!(osec->flags & SHF_EXECINSTR))
       continue;
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index a5d08f4979dab..95830aacf45ce 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -1230,6 +1230,9 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
   if (sec->firstInOverlay)
     state->overlaySize = 0;
 
+  bool synthesizeAlign =
+      (sec->flags & SHF_EXECINSTR) && ctx.arg.relocatable && ctx.arg.relax &&
+      is_contained({EM_RISCV, EM_LOONGARCH}, ctx.arg.emachine);
   // We visited SectionsCommands from processSectionCommands to
   // layout sections. Now, we visit SectionsCommands again to fix
   // section offsets.
@@ -1260,7 +1263,8 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
       if (isa<PotentialSpillSection>(isec))
         continue;
       const uint64_t pos = dot;
-      dot = alignToPowerOf2(dot, isec->addralign);
+      if (!(synthesizeAlign && ctx.target->maybeSynthesizeAlign(dot, isec)))
+        dot = alignToPowerOf2(dot, isec->addralign);
       isec->outSecOff = dot - sec->addr;
       dot += isec->getSize();
 
@@ -1276,6 +1280,12 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
   if (ctx.in.relroPadding && sec == ctx.in.relroPadding->getParent())
     expandOutputSection(alignToPowerOf2(dot, ctx.arg.commonPageSize) - dot);
 
+  if (synthesizeAlign) {
+    const uint64_t pos = dot;
+    ctx.target->maybeSynthesizeAlign(dot, nullptr);
+    expandOutputSection(dot - pos);
+  }
+
   // Non-SHF_ALLOC sections do not affect the addresses of other OutputSections
   // as they are not part of the process image.
   if (!(sec->flags & SHF_ALLOC)) {
diff --git a/lld/ELF/OutputSections.cpp b/lld/ELF/OutputSections.cpp
index 1020dd9f2569e..1ce4f2fd3b3f6 100644
--- a/lld/ELF/OutputSections.cpp
+++ b/lld/ELF/OutputSections.cpp
@@ -889,9 +889,14 @@ void OutputSection::sortInitFini() {
 std::array<uint8_t, 4> OutputSection::getFiller(Ctx &ctx) {
   if (filler)
     return *filler;
-  if (flags & SHF_EXECINSTR)
-    return ctx.target->trapInstr;
-  return {0, 0, 0, 0};
+  if (!(flags & SHF_EXECINSTR))
+    return {0, 0, 0, 0};
+  if (ctx.arg.relocatable && ctx.arg.emachine == EM_RISCV) {
+    if (ctx.arg.eflags & EF_RISCV_RVC)
+      return {1, 0, 1, 0};
+    return {0x13, 0, 0, 0};
+  }
+  return ctx.target->trapInstr;
 }
 
 void OutputSection::checkDynRelAddends(Ctx &ctx) {
diff --git a/lld/ELF/Target.h b/lld/ELF/Target.h
index fdc0c20f9cd02..1be2f04b5a726 100644
--- a/lld/ELF/Target.h
+++ b/lld/ELF/Target.h
@@ -96,6 +96,9 @@ class TargetInfo {
 
   // Do a linker relaxation pass and return true if we changed something.
   virtual bool relaxOnce(int pass) const { return false; }
+  virtual bool maybeSynthesizeAlign(uint64_t &dot, InputSection *sec) {
+    return false;
+  }
   // Do finalize relaxation after collecting relaxation infos.
   virtual void finalizeRelax(int passes) const {}
 
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 2b0e097766d2c..fdacc54282c2c 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -1543,6 +1543,8 @@ template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
 
   uint32_t pass = 0, assignPasses = 0;
   for (;;) {
+    if (ctx.arg.relocatable)
+      break;
     bool changed = ctx.target->needsThunks
                        ? tc.createThunks(pass, ctx.outputSections)
                        : ctx.target->relaxOnce(pass);
diff --git a/lld/test/ELF/riscv-relocatable-align.s b/lld/test/ELF/riscv-relocatable-align.s
new file mode 100644
index 0000000000000..9a782ed47850a
--- /dev/null
+++ b/lld/test/ELF/riscv-relocatable-align.s
@@ -0,0 +1,130 @@
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax a.s -o ac.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax b.s -o bc.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax b1.s -o b1c.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax c.s -o cc.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c d.s -o dc.o
+
+## No RELAX. Don't synthesize ALIGN.
+# RUN: ld.lld -r bc.o dc.o -o bd.ro
+# RUN: llvm-readelf -r bd.ro | FileCheck %s --check-prefix=NOREL
+
+# NOREL: no relocations
+
+# RUN: ld.lld -r bc.o bc.o ac.o bc.o b1c.o cc.o dc.o -o out.ro
+# RUN: llvm-objdump -dr -M no-aliases out.ro | FileCheck %s
+
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax b.s -o b.o
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax d.s -o d.o
+# RUN: ld.lld -r a.o b.o d.o -o out0.ro
+# RUN: ld.lld -Ttext=0x10000 out0.ro -o out0
+# RUN: llvm-objdump -dr -M no-aliases out0 | FileCheck %s --check-prefix=CHECK1
+
+# CHECK:      <b0>:
+# CHECK-NEXT:   0: 00158513             addi    a0, a1, 0x1
+# CHECK-NEXT:   4: 0001                 c.nop
+# CHECK-NEXT:   6: 0001                 c.nop
+# CHECK-EMPTY:
+# CHECK-NEXT: <b0>:
+# CHECK-NEXT:   8: 00158513             addi    a0, a1, 0x1
+# CHECK-EMPTY:
+# CHECK-NEXT: <_start>:
+# CHECK-NEXT:   c: 00000097             auipc   ra, 0x0
+# CHECK-NEXT:           000000000000000c:  R_RISCV_CALL_PLT     foo
+# CHECK-NEXT:           000000000000000c:  R_RISCV_RELAX        *ABS*
+# CHECK-NEXT:  10: 000080e7             jalr    ra, 0x0(ra) <_start>
+# CHECK-NEXT:  14: 0001                 c.nop
+# CHECK-NEXT:           0000000000000014:  R_RISCV_ALIGN        *ABS*+0x6
+# CHECK-NEXT:  16: 0001                 c.nop
+# CHECK-NEXT:  18: 0001                 c.nop
+# CHECK-EMPTY:
+# CHECK-NEXT: <b0>:
+# CHECK-NEXT:  1a: 00158513             addi    a0, a1, 0x1
+# CHECK-NEXT:  1e: 0001                 c.nop
+# CHECK-NEXT:  20: 0001                 c.nop
+# CHECK-NEXT:           0000000000000020:  R_RISCV_ALIGN        *ABS*+0x6
+# CHECK-NEXT:  22: 0001                 c.nop
+# CHECK-NEXT:  24: 00000013             addi    zero, zero, 0x0
+# CHECK-EMPTY:
+# CHECK-NEXT: <b0>:
+# CHECK-NEXT:  28: 00158513             addi    a0, a1, 0x1
+# CHECK-EMPTY:
+# CHECK-NEXT: <c0>:
+# CHECK-NEXT:  2c: 00000097             auipc   ra, 0x0
+# CHECK-NEXT:           000000000000002c:  R_RISCV_CALL_PLT     foo
+# CHECK-NEXT:           000000000000002c:  R_RISCV_RELAX        *ABS*
+# CHECK-NEXT:  30: 000080e7             jalr    ra, 0x0(ra) <c0>
+# CHECK-NEXT:  34: 0001                 c.nop
+# CHECK-NEXT:           0000000000000034:  R_RISCV_ALIGN        *ABS*+0x2
+# CHECK-EMPTY:
+# CHECK-NEXT: <d0>:
+# CHECK-NEXT:  36: 00258513             addi    a0, a1, 0x2
+
+# CHECK1:      <_start>:
+# CHECK1-NEXT:    010000ef      jal     ra, 0x10010 <foo>
+# CHECK1-NEXT:    00000013      addi zero, zero, 0x0
+# CHECK1-EMPTY:
+# CHECK1-NEXT: <b0>:
+# CHECK1-NEXT:    00158513      addi    a0, a1, 0x1
+# CHECK1-EMPTY:
+# CHECK1-NEXT: <d0>:
+# CHECK1-NEXT:    00258513      addi    a0, a1, 0x2
+
+## Test CREL.
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+c,+relax --crel a.s -o acrel.o
+# RUN: ld.lld -r acrel.o bc.o -o out1.ro
+# RUN: llvm-objdump -dr -M no-aliases out1.ro | FileCheck %s --check-prefix=CHECK2
+
+# CHECK2:      <_start>:
+# CHECK2-NEXT:   0: 00000097             auipc   ra, 0x0
+# CHECK2-NEXT:           0000000000000000:  R_RISCV_CALL_PLT     foo
+# CHECK2-NEXT:           0000000000000000:  R_RISCV_RELAX        *ABS*
+# CHECK2-NEXT:   4: 000080e7             jalr    ra, 0x0(ra) <_start>
+# CHECK2-NEXT:   8: 0001                 c.nop
+# CHECK2-NEXT:           0000000000000008:  R_RISCV_ALIGN        *ABS*+0x6
+# CHECK2-NEXT:   a: 0001                 c.nop
+# CHECK2-NEXT:   c: 0001                 c.nop
+# CHECK2-EMPTY:
+# CHECK2-NEXT: <b0>:
+# CHECK2-NEXT:   e: 00158513             addi    a0, a1, 0x1
+
+#--- a.s
+.globl _start
+_start:
+  call foo
+
+.section .text1,"ax"
+.globl foo
+foo:
+
+#--- b.s
+## Needs synthesized ALIGN
+.option push
+.option norelax
+.balign 8
+b0:
+  addi a0, a1, 1
+.option pop
+
+#--- b1.s
+.option push
+.option norelax
+  .reloc ., R_RISCV_ALIGN, 6
+  addi x0, x0, 0
+  c.nop
+.balign 8
+b0:
+  addi a0, a1, 1
+.option pop
+
+#--- c.s
+.balign 2
+c0:
+  call foo
+
+#--- d.s
+## Needs synthesized ALIGN
+.balign 4
+d0:
+  addi a0, a1, 2
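
The filler bytes chosen in the `getFiller` change above are little-endian NOP encodings, which is why the test expects `c.nop` (0x0001) or `addi zero, zero, 0x0` (0x00000013) in the padding. The following sketch decodes them; `nopFiller`, `le16`, and `le32` are hypothetical helper names used only for illustration.

```cpp
#include <array>
#include <cstdint>

// Same byte patterns as the patched getFiller: two c.nop when RVC is
// enabled, otherwise one 4-byte NOP.
constexpr std::array<uint8_t, 4> nopFiller(bool hasRVC) {
  return hasRVC ? std::array<uint8_t, 4>{1, 0, 1, 0}
                : std::array<uint8_t, 4>{0x13, 0, 0, 0};
}

// Decode a 16-bit little-endian value.
constexpr uint16_t le16(uint8_t lo, uint8_t hi) {
  return static_cast<uint16_t>(lo | (hi << 8));
}

// Decode a 32-bit little-endian value.
constexpr uint32_t le32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3) {
  return uint32_t(b0) | (uint32_t(b1) << 8) | (uint32_t(b2) << 16) |
         (uint32_t(b3) << 24);
}

static_assert(le16(1, 0) == 0x0001, "c.nop encoding");
static_assert(le32(0x13, 0, 0, 0) == 0x00000013, "addi x0, x0, 0 encoding");
```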

@MaskRay MaskRay changed the title ELF: Synthesize R_RISCV_ALIGN at input section start [ELF] -r: Synthesize R_RISCV_ALIGN at input section start Aug 1, 2025
@MaskRay
Member Author

MaskRay commented Aug 1, 2025

.
Created using spr 1.3.5-bogner
@MaskRay
Member Author

MaskRay commented Aug 1, 2025

@Nelson1225

    if constexpr (RelTy::HasAddend)
      relas[i].r_addend = r.r_addend;
  }
  // Append synthesized ALIGN relocations to the buffer.
Member

Do the R_RISCV_ALIGN relocations need to be inserted at the "right" place for their offset, i.e., so that all R_RISCV_ALIGN relocations come in address-ascending order? I know we cannot fully sort the relocations (IIRC, we tried this before).

I guess we're not doing this, and we wouldn't expect the output of lld -r to be consumed by a different linker, so your tests are enough to show things are working.

Member Author

It doesn't. LLD sorts relocations by offset, so later ALIGN relocations are fine. GNU ld seems fine as well.
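
To illustrate why appending is harmless when the consumer orders relocations by offset, here is a minimal sketch. It does not use lld's data structures: `Rela` and `sortedByOffset` are hypothetical names, the relocation at offset 0x20 is made up for the example, and the type numbers are the psABI values for `R_RISCV_CALL_PLT` (19), `R_RISCV_ALIGN` (43), and `R_RISCV_RELAX` (51).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Simplified stand-in for an ELF RELA entry (hypothetical, not an lld type).
struct Rela {
  uint64_t offset;
  uint32_t type;
  int64_t addend;
};

constexpr uint32_t kCallPlt = 19; // R_RISCV_CALL_PLT
constexpr uint32_t kAlign = 43;   // R_RISCV_ALIGN
constexpr uint32_t kRelax = 51;   // R_RISCV_RELAX

// A stable sort by offset keeps the relative order of relocations that share
// an offset (e.g. CALL_PLT followed by RELAX) while moving appended entries
// to their address-ascending position.
std::vector<Rela> sortedByOffset(std::vector<Rela> rels) {
  std::stable_sort(
      rels.begin(), rels.end(),
      [](const Rela &a, const Rela &b) { return a.offset < b.offset; });
  return rels;
}

// Original relocations followed by one synthesized ALIGN appended at the end.
std::vector<Rela> exampleInput() {
  return {{0x0, kCallPlt, 0},
          {0x0, kRelax, 0},
          {0x20, kCallPlt, 0},
          {0x20, kRelax, 0},
          {0x8, kAlign, 6}};
}
```

After `sortedByOffset(exampleInput())`, the ALIGN entry sits between the two call sites, and each CALL_PLT still precedes its paired RELAX.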

MaskRay added 2 commits August 4, 2025 23:29
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
@MaskRay
Member Author

MaskRay commented Aug 7, 2025

Renamed the synthesizeAlign* functions. They still feel clunky but are probably better than the initial version.

Member

@lenary lenary left a comment

This looks right to me, but I am not confident enough in the LLD codebase to approve.

@MaskRay MaskRay requested review from mysterymath and smithp35 August 8, 2025 17:26
@MaskRay MaskRay merged commit 6f53f1c into main Aug 9, 2025
9 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/elf-synthesize-r_riscv_align-at-input-section-start branch August 9, 2025 01:40
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 9, 2025
Pull Request: llvm/llvm-project#151639
@nathanchance
Member

I don't have time to reduce at the moment but I am seeing an issue when linking the Linux kernel's arch/riscv/purgatory/purgatory.ro when ThinLTO is enabled after this change.

$ make -skj"$(nproc)" ARCH=riscv LLVM=1 O=/tmp/build clean defconfig

$ scripts/config --file /tmp/build/.config -d LTO_NONE -e LTO_CLANG_THIN

$ timeout 30s make -skj"$(nproc)" ARCH=riscv LLVM=1 O=/tmp/build olddefconfig arch/riscv/purgatory/
malloc(): corrupted top size
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -e purgatory_start -z nodefaultlib arch/riscv/purgatory/purgatory.ro -o arch/riscv/purgatory/purgatory.chk
make[5]: *** [arch/riscv/purgatory/Makefile:104: arch/riscv/purgatory/purgatory.chk] Terminated
make[4]: *** [scripts/Makefile.build:556: arch/riscv/purgatory] Terminated
make[3]: *** [scripts/Makefile.build:556: arch/riscv] Terminated
make[2]: *** [Makefile:2011: .] Terminated
make[1]: *** [Makefile:248: __sub-make] Terminated
make: *** [Makefile:248: __sub-make] Terminated
# bad: [72a1fd1b43ac4d267d986c87c4e38f91b5bd872d] [AVR][NFC] Add a test for fp16 support (#152708)
# good: [f61526971f9c62118090443c8b97fab07ae9499f] Revert "[WebAssembly] Constant fold wasm.dot" (#152382)
git bisect start '72a1fd1b43ac4d267d986c87c4e38f91b5bd872d' 'f61526971f9c62118090443c8b97fab07ae9499f'
# good: [e977b28c37c174c1b93ad78314650e03b545f560] [InstCombine] Match intrinsic recurrences when known to be hoisted
git bisect good e977b28c37c174c1b93ad78314650e03b545f560
# good: [3a4b351ba18492b990b10fe5401c3bbaabcf2f94] [IR] Introduce the `ptrtoaddr` instruction
git bisect good 3a4b351ba18492b990b10fe5401c3bbaabcf2f94
# good: [8bfb54bab4434ab4eed1398ef46847b30a087bf7] [gn build] Port 4d3feaea66f4
git bisect good 8bfb54bab4434ab4eed1398ef46847b30a087bf7
# bad: [d1827f040f6e056e62cf4158bdf90d0acdf3d287] Add `REQUIRES: riscv` to test added in 151639 to skip the test when riscv is not built. (#152858)
git bisect bad d1827f040f6e056e62cf4158bdf90d0acdf3d287
# bad: [10e146a7161065429629a13f99c179a61ffe7721] [AMDGPU] Fix out of bound physreg tuple condition. NFC. (#152777)
git bisect bad 10e146a7161065429629a13f99c179a61ffe7721
# bad: [97f0ff0c80407adee693436b44e55ededfcd5435] [AVR] Fix Avr indvar detection and strength reduction (missed optimization) (#152028)
git bisect bad 97f0ff0c80407adee693436b44e55ededfcd5435
# good: [1acb1018d2ad8db4aaf5686b4f749e632828e690] [flang][cuda] Set correct bind(c) name for __popc (#152795)
git bisect good 1acb1018d2ad8db4aaf5686b4f749e632828e690
# bad: [6f53f1c8d2bdd13e30da7d1b85ed6a3ae4c4a856] [ELF] -r: Synthesize R_RISCV_ALIGN at input section start
git bisect bad 6f53f1c8d2bdd13e30da7d1b85ed6a3ae4c4a856
# good: [0c139883f4c086444e816f607105a96b617eb4a7] [libc] Fix server code when GPU is acting as the server
git bisect good 0c139883f4c086444e816f607105a96b617eb4a7
# first bad commit: [6f53f1c8d2bdd13e30da7d1b85ed6a3ae4c4a856] [ELF] -r: Synthesize R_RISCV_ALIGN at input section start

@nathanchance
Member

Attached are the object files that link together to form purgatory.ro.

llvm-pr-151539-riscv-purgatory-issue.tgz

$ ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -r -e purgatory_start -z nodefaultlib *.o -o bad-purgatory.ro

$ ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -e purgatory_start -z nodefaultlib bad-purgatory.ro -o purgatory.chk
malloc(): corrupted top size
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -e purgatory_start -z nodefaultlib bad-purgatory.ro -o purgatory.chk
$ ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -r -e purgatory_start -z nodefaultlib *.o -o good-purgatory.ro

$ ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -e purgatory_start -z nodefaultlib good-purgatory.ro -o purgatory.chk

Linking bad-purgatory.ro (created by ld.lld with this change) using a prior version of ld.lld in the second step reproduces the same malloc(): corrupted top size.

MaskRay added a commit that referenced this pull request Aug 13, 2025
…151639)

This reverts commit 6f53f1c.

synthesizedAligns is not cleared, leading to stray relocations for
unrelated sections. Revert for now.
MaskRay added a commit that referenced this pull request Aug 13, 2025
Clear `synthesizedAligns` to prevent stray relocations to an unrelated
text section. Enhance the test to check llvm-readelf -r output.

@MaskRay
Member Author

MaskRay commented Aug 13, 2025

$ ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -r -e purgatory_start -z nodefaultlib *.o -o good-purgatory.ro

$ ld.lld -melf64lriscv -mllvm -mattr=+c -mllvm -mattr=+relax -mllvm -import-instr-limit=5 -z noexecstack -e purgatory_start -z nodefaultlib good-purgatory.ro -o purgatory.chk

Linking bad-purgatory.ro (created by ld.lld with this change) using a prior version of ld.lld in the second step reproduces the same malloc(): corrupted top size.

Thanks for the detailed reproducer. The patch was missing a synthesizedAligns.clear(). Fixed by the reland
94655dc
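
The failure mode can be sketched in isolation. This is an illustration of the bug class only, under the assumption that one accumulator is reused across several output sections; `AlignCollector` and `secondSectionCount` are hypothetical names, and where exactly the reland calls `clear()` is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical accumulator reused for every executable output section.
class AlignCollector {
public:
  // Remember one synthesized ALIGN (offset, addend) for the current section.
  void record(uint64_t offset, uint64_t addend) {
    pending.emplace_back(offset, addend);
  }

  // Emit the relocations for the current output section. Without the clear,
  // entries recorded for this section leak into the next one.
  std::vector<std::pair<uint64_t, uint64_t>> finish(bool clearAfter) {
    std::vector<std::pair<uint64_t, uint64_t>> out = pending;
    if (clearAfter)
      pending.clear();
    return out;
  }

private:
  std::vector<std::pair<uint64_t, uint64_t>> pending;
};

// Number of relocations emitted for a second, unrelated output section that
// recorded exactly one ALIGN of its own.
std::size_t secondSectionCount(bool clearAfter) {
  AlignCollector collector;
  collector.record(8, 6);       // first output section
  collector.finish(clearAfter); // flush the first section
  collector.record(4, 2);       // second output section
  return collector.finish(clearAfter).size();
}
```

With `clearAfter` set to false, the second section receives two relocations instead of one; the extra entry carries an offset that was only meaningful for the first section.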


6 participants