-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[lld][ELF] filter out section symbols when use BP reorder #151685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-elf Author: Pengying Xu (Colibrow) ChangesWhen using Temporal Profiling with the BP algorithm, we encounter an issue with the internal function reorder. In cases where the symbol table contains entries like: The zero-sized section symbol .text.L1 gets stored in the secToSym map first. However, when the function lookup searches for L1 (as seen in BPSectionOrdererBase.inc:191), it fails to find the correct entry in rootSymbolToSectionIdxs because the section symbol has already claimed that slot. Full diff: https://github.com/llvm/llvm-project/pull/151685.diff 2 Files Affected:
diff --git a/lld/ELF/BPSectionOrderer.cpp b/lld/ELF/BPSectionOrderer.cpp
index 06152046d13d4..3a98ea58ec4c6 100644
--- a/lld/ELF/BPSectionOrderer.cpp
+++ b/lld/ELF/BPSectionOrderer.cpp
@@ -79,7 +79,8 @@ DenseMap<const InputSectionBase *, int> elf::runBalancedPartitioning(
// Skip empty, discarded, ICF folded sections, .bss. Skipping ICF folded
// sections reduces duplicate detection work in BPSectionOrderer.
if (!sec || sec->size == 0 || !sec->isLive() || sec->repl != sec ||
- !sec->content().data() || !orderer.secToSym.try_emplace(sec, d).second)
+ !sec->content().data() || d->size == 0 ||
+ !orderer.secToSym.try_emplace(sec, d).second)
return;
rootSymbolToSectionIdxs[CachedHashStringRef(
lld::utils::getRootSymbol(sym.getName()))]
diff --git a/lld/test/ELF/bp-section-orderer.s b/lld/test/ELF/bp-section-orderer.s
index 438d7c2da0f76..9a83c70dba315 100644
--- a/lld/test/ELF/bp-section-orderer.s
+++ b/lld/test/ELF/bp-section-orderer.s
@@ -21,38 +21,38 @@
# RUN: llvm-profdata merge a.proftext -o a.profdata
# RUN: ld.lld a.o --irpgo-profile=a.profdata --bp-startup-sort=function --verbose-bp-section-orderer --icf=all --gc-sections 2>&1 | FileCheck %s --check-prefix=STARTUP-FUNC-ORDER
-# STARTUP-FUNC-ORDER: Ordered 3 sections ([[#]] bytes) using balanced partitioning
-# STARTUP-FUNC-ORDER: Total area under the page fault curve: 3.
+# STARTUP-FUNC-ORDER: Ordered 4 sections ([[#]] bytes) using balanced partitioning
+# STARTUP-FUNC-ORDER: Total area under the page fault curve: 4.
# RUN: ld.lld -o out.s a.o --irpgo-profile=a.profdata --bp-startup-sort=function
# RUN: llvm-nm -jn out.s | tr '\n' , | FileCheck %s --check-prefix=STARTUP
-# STARTUP: s5,s4,s3,s2,s1,A,B,C,F,E,D,merged1,merged2,_start,d4,d3,d2,d1,g1,{{$}}
+# STARTUP: s5,s4,s3,s2,s1,A,B,C,L1,F,E,D,merged1,merged2,G,_start,d4,d3,d2,d1,g1,{{$}}
# RUN: ld.lld -o out.os a.o --irpgo-profile=a.profdata --bp-startup-sort=function --symbol-ordering-file a.txt
# RUN: llvm-nm -jn out.os | tr '\n' , | FileCheck %s --check-prefix=ORDER-STARTUP
-# ORDER-STARTUP: s2,s1,s5,s4,s3,A,F,E,D,B,C,merged1,merged2,_start,d3,d2,d4,d1,g1,{{$}}
+# ORDER-STARTUP: s2,s1,s5,s4,s3,A,F,E,D,B,C,L1,merged1,merged2,G,_start,d3,d2,d4,d1,g1,{{$}}
# RUN: ld.lld -o out.cf a.o --verbose-bp-section-orderer --bp-compression-sort=function 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-FUNC
# RUN: ld.lld -o out.cf.icf a.o --verbose-bp-section-orderer --bp-compression-sort=function --icf=all --gc-sections 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-ICF-FUNC
# RUN: llvm-nm -jn out.cf | tr '\n' , | FileCheck %s --check-prefix=CFUNC
-# CFUNC: s5,s4,s3,s2,s1,A,F,merged1,merged2,C,E,D,B,_start,d4,d3,d2,d1,g1,{{$}}
+# CFUNC: s5,s4,s3,s2,s1,C,E,D,B,G,F,merged1,merged2,A,_start,L1,d4,d3,d2,d1,g1,{{$}}
# RUN: ld.lld -o out.cd a.o --verbose-bp-section-orderer --bp-compression-sort=data 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-DATA
# RUN: llvm-nm -jn out.cd | tr '\n' , | FileCheck %s --check-prefix=CDATA
-# CDATA: s5,s3,s4,s2,s1,F,C,E,D,B,A,merged1,merged2,_start,d4,d1,d3,d2,g1,{{$}}
+# CDATA: s5,s3,s4,s2,s1,F,C,E,D,B,A,merged1,merged2,L1,G,_start,d4,d1,d3,d2,g1,{{$}}
# RUN: ld.lld -o out.cb a.o --verbose-bp-section-orderer --bp-compression-sort=both 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-BOTH
# RUN: llvm-nm -jn out.cb | tr '\n' , | FileCheck %s --check-prefix=CBOTH
-# CBOTH: s5,s3,s4,s2,s1,A,F,merged1,merged2,C,E,D,B,_start,d4,d1,d3,d2,g1,{{$}}
+# CBOTH: s5,s3,s4,s2,s1,C,E,D,B,G,F,merged1,merged2,A,_start,L1,d4,d1,d3,d2,g1,{{$}}
# RUN: ld.lld -o out.cbs a.o --verbose-bp-section-orderer --bp-compression-sort=both --irpgo-profile=a.profdata --bp-startup-sort=function 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-BOTH
# RUN: llvm-nm -jn out.cbs | tr '\n' , | FileCheck %s --check-prefix=CBOTH-STARTUP
-# CBOTH-STARTUP: s5,s3,s4,s2,s1,A,B,C,F,E,D,merged1,merged2,_start,d4,d1,d3,d2,g1,{{$}}
+# CBOTH-STARTUP: s5,s3,s4,s2,s1,A,B,C,L1,F,E,D,merged1,merged2,G,_start,d4,d1,d3,d2,g1,{{$}}
-# BP-COMPRESSION-FUNC: Ordered 9 sections ([[#]] bytes) using balanced partitioning
-# BP-COMPRESSION-ICF-FUNC: Ordered 8 sections ([[#]] bytes) using balanced partitioning
+# BP-COMPRESSION-FUNC: Ordered 11 sections ([[#]] bytes) using balanced partitioning
+# BP-COMPRESSION-ICF-FUNC: Ordered 9 sections ([[#]] bytes) using balanced partitioning
# BP-COMPRESSION-DATA: Ordered 9 sections ([[#]] bytes) using balanced partitioning
-# BP-COMPRESSION-BOTH: Ordered 18 sections ([[#]] bytes) using balanced partitioning
+# BP-COMPRESSION-BOTH: Ordered 20 sections ([[#]] bytes) using balanced partitioning
#--- a.proftext
:ir
@@ -63,7 +63,7 @@
1
# Weight
1
-A, B, C
+A, B, C, L1
A
# Func Hash:
@@ -97,6 +97,14 @@ D
# Counter Values:
1
+L1
+# Func Hash:
+5555
+# Num Counters:
+1
+# Counter Values:
+1
+
#--- a.txt
A
F
@@ -137,6 +145,9 @@ void A() {}
RETAIN int merged1(int a) { return F(a + 101); }
int merged2(int a) { return F(a + 101); }
+RETAIN static int L1(int a) { return a + 103; }
+int G(int a) { return L1(a); }
+
int _start() { return 0; }
#--- gen
@@ -148,7 +159,7 @@ clang --target=aarch64-linux-gnu -O0 -ffunction-sections -fdata-sections -fno-as
.p2align 2
.type F,@function
F: // @F
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -167,7 +178,7 @@ F: // @F
.p2align 2
.type C,@function
C: // @C
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -186,7 +197,7 @@ C: // @C
.p2align 2
.type E,@function
E: // @E
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -205,7 +216,7 @@ E: // @E
.p2align 2
.type D,@function
D: // @D
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -224,7 +235,7 @@ D: // @D
.p2align 2
.type B,@function
B: // @B
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -243,7 +254,7 @@ B: // @B
.p2align 2
.type A,@function
A: // @A
-// %bb.0: // %entry
+// %bb.0:
ret
.Lfunc_end5:
.size A, .Lfunc_end5-A
@@ -253,7 +264,7 @@ A: // @A
.p2align 2
.type merged1,@function
merged1: // @merged1
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -272,7 +283,7 @@ merged1: // @merged1
.p2align 2
.type merged2,@function
merged2: // @merged2
-// %bb.0: // %entry
+// %bb.0:
sub sp, sp, #32
stp x29, x30, [sp, #16] // 16-byte Folded Spill
add x29, sp, #16
@@ -286,16 +297,48 @@ merged2: // @merged2
.Lfunc_end7:
.size merged2, .Lfunc_end7-merged2
// -- End function
+ .section .text.L1,"axR",@progbits
+ .p2align 2 // -- Begin function L1
+ .type L1,@function
+L1: // @L1
+// %bb.0:
+ sub sp, sp, #16
+ str w0, [sp, #12]
+ ldr w8, [sp, #12]
+ add w0, w8, #103
+ add sp, sp, #16
+ ret
+.Lfunc_end8:
+ .size L1, .Lfunc_end8-L1
+ // -- End function
+ .section .text.G,"ax",@progbits
+ .globl G // -- Begin function G
+ .p2align 2
+ .type G,@function
+G: // @G
+// %bb.0:
+ sub sp, sp, #32
+ stp x29, x30, [sp, #16] // 16-byte Folded Spill
+ add x29, sp, #16
+ stur w0, [x29, #-4]
+ ldur w0, [x29, #-4]
+ bl L1
+ ldp x29, x30, [sp, #16] // 16-byte Folded Reload
+ add sp, sp, #32
+ ret
+.Lfunc_end9:
+ .size G, .Lfunc_end9-G
+ // -- End function
.section .text._start,"ax",@progbits
.globl _start // -- Begin function _start
.p2align 2
.type _start,@function
_start: // @_start
-// %bb.0: // %entry
+// %bb.0:
mov w0, wzr
ret
-.Lfunc_end8:
- .size _start, .Lfunc_end8-_start
+.Lfunc_end10:
+ .size _start, .Lfunc_end10-_start
// -- End function
.type s5,@object // @s5
.section .rodata.s5,"a",@progbits
@@ -395,3 +438,4 @@ g1:
.addrsig_sym B
.addrsig_sym A
.addrsig_sym merged1
+ .addrsig_sym L1
|
|
Even ignoring section symbols, it doesn't make sense to attempt to order zero sized symbols. I think this could also be applied to MachO. Could you add that too? |
lld/test/ELF/bp-section-orderer.s
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this section has zero size? Can you also explain why these sections would have zero size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix: Remove the d->size == 0 check and use sym.isSection() to skip reordering section symbols. I found a similar problem, see https://reviews.llvm.org/D59311?id=190644#inline-528590.
This piece of code creates three symbols:
.text.L1: aSTT_SECTIONsymbol where the symbol name is empty butd->section->sizeis not 0L1: aSTT_FUNCsymbol, which is the one we want in Temporal Profiling$x: a mapping symbol
These three symbols all match the section .text.L1, and sym.isSection() helps us filter out the unnamed section symbols in BP reorder.
@ellishg
c3c61f9 to
f722669
Compare
lld/ELF/BPSectionOrderer.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment: Skip section symbols.
Just use "section symbols" |
f722669 to
4395291
Compare
4395291 to
ee008c0
Compare
ee008c0 to
3cdef85
Compare
|
@MaskRay Thanks for the review. I've addressed all the review comments. Could you please take another look and merge it if everything looks good? |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/22289 Here is the relevant piece of the build log for the reference |
When using Temporal Profiling with the BP algorithm, we encounter an issue with the internal function reorder. In cases where the symbol table contains entries like:
The zero-sized section symbol .text.L1 gets stored in the secToSym map first. However, when the function lookup searches for L1 (as seen in BPSectionOrdererBase.inc:191), it fails to find the correct entry in rootSymbolToSectionIdxs because the section symbol has already claimed that slot.
This patch fixes the issue by skipping zero-sized symbols during the addSections process, ensuring that function symbols are properly registered for lookup.