[LLD] Fix crash on parsing ':ALIGN' in linker script #146723

parth-07 · 2025-07-02T15:24:09Z

The linker was crashing due to stack overflow when parsing ':ALIGN' in an output section description. This commit fixes the linker script parser so that the crash does not happen.

The root cause of the stack overflow is how we parse expressions (readExpr) in linker script and the behavior of ScriptLexer::expect(...) utility. ScriptLexer::expect does not do anything if errors have already been encountered during linker script parsing. In particular, it never increments the current token position in the script file, even if the current token is the same as the expected token. This causes an infinite call cycle on parsing an expression such as '(4096)' when an error has already been encountered.

readExpr() calls readPrimary()
readPrimary() calls readParenExpr()

readParenExpr():

expect("("); // no-op, current token still points to '('
Expression *E = readExpr(); // The cycle continues...
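
The cycle above can be reproduced in a small toy model. This is only a sketch under stated assumptions, not lld's code: `ToyParser`, `makeParser`, `kMaxDepth`, and the `hasError`/`guard`/`hitDepthCap` fields are hypothetical names for illustration, and a call cap stands in for the real stack overflow so the demo terminates.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: names mirror the discussion, but none of this is lld code.
constexpr int kMaxDepth = 1000; // hypothetical cap so the demo terminates

struct ToyParser {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  bool hasError = false;    // an earlier diagnostic was already reported
  bool guard = false;       // apply the early return proposed in this patch
  int depth = 0;            // number of readExpr() calls so far
  bool hitDepthCap = false; // set where a real parser would overflow the stack

  // Reports end-of-input both at the real end and after any error.
  bool atEOF() const { return hasError || pos >= tokens.size(); }

  std::string peek() const {
    return pos < tokens.size() ? tokens[pos] : std::string();
  }

  // No-op once an error is recorded, so the position never advances.
  void expect(const std::string &tok) {
    if (hasError)
      return;
    if (peek() != tok) {
      hasError = true;
      return;
    }
    ++pos;
  }

  long readExpr() {
    if (guard && atEOF())
      return 0; // the guard: do not parse after an error
    if (++depth > kMaxDepth) {
      hitDepthCap = true;
      return 0;
    }
    return readPrimary();
  }

  long readPrimary() {
    if (peek() == "(")
      return readParenExpr();
    if (atEOF())
      return 0;
    long value = 0;
    for (char c : tokens[pos]) {
      if (c < '0' || c > '9') {
        hasError = true;
        return 0;
      }
      value = value * 10 + (c - '0');
    }
    ++pos;
    return value;
  }

  long readParenExpr() {
    expect("("); // no-op after an error: still looking at "("
    long value = readExpr();
    expect(")");
    return value;
  }
};

// Builds a parser over the tokens of "(4096)".
ToyParser makeParser(bool hasError, bool guard) {
  ToyParser p;
  p.tokens = {"(", "4096", ")"};
  p.hasError = hasError;
  p.guard = guard;
  return p;
}
```

With `makeParser(true, false)` the toy parser keeps re-entering `readExpr()` until the cap is hit; with `makeParser(true, true)` the guard returns immediately.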

Closes #146722

github-actions · 2025-07-02T15:24:28Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-07-02T15:24:59Z

@llvm/pr-subscribers-lld

Author: Parth (parth-07)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/146723.diff

2 Files Affected:

(modified) lld/ELF/ScriptParser.cpp (+3)
(modified) lld/test/ELF/linkerscript/align-section.test (+16-1)

diff --git a/lld/ELF/ScriptParser.cpp b/lld/ELF/ScriptParser.cpp
index 593d5636f2455..58b4239c4aefd 100644
--- a/lld/ELF/ScriptParser.cpp
+++ b/lld/ELF/ScriptParser.cpp
@@ -1229,6 +1229,9 @@ SymbolAssignment *ScriptParser::readSymbolAssignment(StringRef name) {
 // This is an operator-precedence parser to parse a linker
 // script expression.
 Expr ScriptParser::readExpr() {
+  // Do not try to read expression if an error has already been encountered.
+  if (atEOF())
+    return {};
   // Our lexer is context-aware. Set the in-expression bit so that
   // they apply different tokenization rules.
   SaveAndRestore saved(lexState, State::Expr);
diff --git a/lld/test/ELF/linkerscript/align-section.test b/lld/test/ELF/linkerscript/align-section.test
index 7a28fef2076ed..db99dc82f5514 100644
--- a/lld/test/ELF/linkerscript/align-section.test
+++ b/lld/test/ELF/linkerscript/align-section.test
@@ -1,7 +1,22 @@
 # REQUIRES: x86
+# RUN: rm -rf %t && split-file %s %t
+
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux /dev/null -o %t.o
-# RUN: ld.lld -o %t --script %s %t.o -shared
+# RUN: ld.lld -o %t.1.out --script %t/a.t %t.o -shared
 
 # lld shouldn't crash.
 
+#--- a.t
 SECTIONS { .foo : ALIGN(2M) {} }
+
+#--- b.t
+SECTIONS
+{
+  S :ALIGN(4096) {}
+}
+
+# RUN: not ld.lld -o /dev/null --script %t/b.t 2>&1 | FileCheck %s
+
+CHECK: error: {{.*}}b.t:3: malformed number: :
+CHECK: >>>   S :ALIGN(4096) {}
+CHECK: >>>

llvmbot · 2025-07-02T15:25:00Z

@llvm/pr-subscribers-lld-elf

Author: Parth (parth-07)

smithp35

Thanks for the fix. I've got a small suggestion that means we don't need the comment.

smithp35 · 2025-07-02T16:35:55Z

lld/ELF/ScriptParser.cpp

I think this will work, although I think it may be more idiomatic to use

if (errCount(ctx)) return 0;

That way you don't need the comment to explain that atEOF will return true if there's an error. The return 0 comes from other places where an ErrAlways has occurred, such as

ErrAlways(ctx) << loc << ": division by zero"; return 0;

Looking at the CI this may have caused a problem with
FAIL: lld :: ELF/linkerscript/custom-section-type.s (1500 of 3105)

2025-07-02T15:45:44.9280956Z ld.lld: warning: section type mismatch for progbits
2025-07-02T15:45:44.9282076Z >>> /home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/lld/test/ELF/linkerscript/Output/custom-section-type.s.tmp/mismatch.o:(progbits): SHT_NOTE
2025-07-02T15:45:44.9283202Z >>> output section progbits: SHT_PROGBITS
2025-07-02T15:45:44.9283407Z
2025-07-02T15:45:44.9283520Z ld.lld: warning: section type mismatch for expr
2025-07-02T15:45:44.9284363Z >>> /home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/lld/test/ELF/linkerscript/Output/custom-section-type.s.tmp/mismatch.o:(expr): Unknown
2025-07-02T15:45:44.9285740Z >>> output section expr: Unknown

Assuming this is related to the patch. It may be that we've terminated too early before enough context for the error message can be accumulated. Which may mean that the check needs to be put closer to the point where an infinite loop may occur. Or we need a different approach.

Thank you for your inputs @smithp35.

> although I think it may be more idiomatic to use
>
> if (errCount(ctx)) return 0;

I agree that errCount(ctx) would work as well, however, I think it helps to make the code flow easier to understand and more intuitive if the behavior of parsing expression is consistent for both the error-case and the actual end-of-file case, and atEOF() covers both these cases. Please let me know your thoughts on this.
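
The difference between the two guards can be sketched with two predicates. This is a minimal illustration, assuming (as the comments above suggest) that `atEOF()` is true both at the real end of input and after an error, while an `errCount`-style check is true only after an error. `ToyLexerState`, `guardAtEOF`, and `guardErrCount` are hypothetical names, not lld APIs.

```cpp
// Hypothetical lexer state: only the two facts the guards care about.
struct ToyLexerState {
  bool eof;            // all tokens have been consumed
  unsigned errorCount; // diagnostics reported so far
};

// Guard in the style of atEOF(): stops at end of input or after an error.
bool guardAtEOF(const ToyLexerState &s) { return s.eof || s.errorCount != 0; }

// Guard in the style of errCount(ctx): stops only after an error.
bool guardErrCount(const ToyLexerState &s) { return s.errorCount != 0; }
```

In this model the two guards agree after an error and differ only at a genuine end of input, which is the case the `atEOF()` variant also covers.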

> Assuming this is related to the patch. It may be that we've terminated too early before enough context for the error message can be accumulated. Which may mean that the check needs to be put closer to the point where an infinite loop may occur. Or we need a different approach.

Yes, it was related to the patch. Thank you for sharing your thoughts. It turned out the error was due to the return value of getExpr() in the error path. The return value was an empty function object. It should be a 0-value equivalent of the lld::elf::Expr. I have fixed the issue.
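
The distinction between an empty function object and a zero-valued expression can be sketched as follows. This assumes the expression type is a `std::function`-like callable that yields a value when invoked; `ToyExpr`, `emptyExpr`, `zeroExpr`, and `isCallable` are hypothetical stand-ins, not lld's definitions.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for an expression type: a callable producing a value.
using ToyExpr = std::function<std::uint64_t()>;

// What `return {};` produces: a function object with no target.
ToyExpr emptyExpr() { return {}; }

// A zero-valued equivalent: a callable that evaluates to 0.
ToyExpr zeroExpr() {
  return [] { return std::uint64_t{0}; };
}

// True if the expression holds something that can be invoked.
bool isCallable(const ToyExpr &e) { return static_cast<bool>(e); }
```

In standard C++, invoking an empty `std::function` throws `std::bad_function_call`, so code that later evaluates the returned expression needs the callable form.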

> I've got a small suggestion that means we don't need the comment.

I agree that the comment was a little redundant. I have removed the comment.

Thanks for the updates. This looks OK to me. I've added the maintainer MaskRay and MysteryMath, who has been involved in linker script parsing recently.

This looks ok to me as well. A few parser functions call atEOF() at the beginning, when they expect to consume at least one token.

lld/test/ELF/linkerscript/align-section.test

The linker was crashing due to stack overflow when parsing ':ALIGN' in an output section description. This commit fixes the linker script parser so that the crash does not happen.

The root cause of the stack overflow is how we parse expressions (readExpr) in linker script and the behavior of ScriptLexer::expect(...) utility. ScriptLexer::expect does not do anything if errors have already been encountered during linker script parsing. In particular, it never increments the current token position in the script file, even if the current token is the same as the expected token. This causes an infinite call cycle on parsing an expression such as '(4096)' when an error has already been encountered.

readExpr() calls readPrimary()
readPrimary() calls readParenExpr()

readParenExpr():

expect("("); // no-op, current token still points to '('
Expression *E = readExpr(); // The cycle continues...

Closes llvm#146722

Signed-off-by: Parth Arora <[email protected]>

github-actions · 2025-07-06T17:23:09Z

@parth-07 Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

llvmbot added lld lld:ELF labels Jul 2, 2025

smithp35 reviewed Jul 2, 2025

parth-07 force-pushed the AlignParseIssue branch from 3906ba2 to be4cae1 Compare July 2, 2025 19:56

parth-07 requested a review from smithp35 July 2, 2025 20:01

parth-07 force-pushed the AlignParseIssue branch 2 times, most recently from ddb5aaf to 11364e6 Compare July 2, 2025 20:19

smithp35 requested review from MaskRay and mysterymath July 3, 2025 09:23

MaskRay reviewed Jul 3, 2025

lld/test/ELF/linkerscript/align-section.test

MaskRay reviewed Jul 3, 2025

lld/test/ELF/linkerscript/align-section.test

parth-07 force-pushed the AlignParseIssue branch from 11364e6 to 1183bb5 Compare July 3, 2025 17:39

parth-07 requested a review from MaskRay July 3, 2025 17:40

MaskRay approved these changes Jul 5, 2025

lld/test/ELF/linkerscript/align-section.test

parth-07 force-pushed the AlignParseIssue branch from 1183bb5 to 2cbaacb Compare July 5, 2025 07:41

parth-07 requested a review from MaskRay July 5, 2025 11:20

MaskRay reviewed Jul 5, 2025

lld/test/ELF/linkerscript/align-section.test

parth-07 force-pushed the AlignParseIssue branch from 2cbaacb to 47b05f0 Compare July 5, 2025 17:21

parth-07 requested a review from MaskRay July 5, 2025 17:21

MaskRay approved these changes Jul 6, 2025

MaskRay merged commit 923a3cc into llvm:main Jul 6, 2025
9 checks passed
