Skip to content

Conversation

@aaupov
Copy link
Contributor

@aaupov aaupov commented Mar 3, 2025

We used to report PLT traces as invalid (mismatching disassembled
function contents) because PLT functions are marked as pseudo and
ignored, thus missing CFG. However, such traces are not mismatching
the function contents. Accept them without attaching the profile.

Test Plan: updated callcont-fallthru.s

aaupov added 2 commits March 2, 2025 22:24
Created using spr 1.3.4
@llvmbot
Copy link
Member

llvmbot commented Mar 3, 2025

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

We used to report PLT traces as invalid (mismatching disassembled
function contents) because PLT functions are marked as pseudo and
ignored, thus missing CFG. However, such traces are not mismatching
the function contents. Accept them without attaching the profile.

Test Plan: updated callcont-fallthru.s


Full diff: https://github.com/llvm/llvm-project/pull/129481.diff

3 Files Affected:

  • (modified) bolt/lib/Profile/DataAggregator.cpp (+11-5)
  • (modified) bolt/test/X86/callcont-fallthru.s (+10)
  • (modified) bolt/test/link_fdata.py (+1-1)
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index ae0355f09afc2..dec82a3c06083 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -845,11 +845,6 @@ DataAggregator::getFallthroughsInTrace(BinaryFunction &BF,
 
   BinaryContext &BC = BF.getBinaryContext();
 
-  if (BF.empty())
-    return std::nullopt;
-
-  assert(BF.hasCFG() && "can only record traces in CFG state");
-
   // Offsets of the trace within this function.
   const uint64_t From = FirstLBR.To - BF.getAddress();
   const uint64_t To = SecondLBR.From - BF.getAddress();
@@ -857,6 +852,17 @@ DataAggregator::getFallthroughsInTrace(BinaryFunction &BF,
   if (From > To)
     return std::nullopt;
 
+  // Accept fall-throughs inside pseudo functions (PLT/thunks). Such functions
+  // are marked as ignored and so lack CFG, which means fallthroughs in them are
+  // reported as mismatching traces which they are not.
+  if (BF.isPseudo())
+    return Branches;
+
+  if (BF.empty())
+    return std::nullopt;
+
+  assert(BF.hasCFG() && "can only record traces in CFG state");
+
   const BinaryBasicBlock *FromBB = BF.getBasicBlockContainingOffset(From);
   const BinaryBasicBlock *ToBB = BF.getBasicBlockContainingOffset(To);
 
diff --git a/bolt/test/X86/callcont-fallthru.s b/bolt/test/X86/callcont-fallthru.s
index 8df5ce8794e93..44e3bf21c14c0 100644
--- a/bolt/test/X86/callcont-fallthru.s
+++ b/bolt/test/X86/callcont-fallthru.s
@@ -6,6 +6,7 @@
 # RUN: %clangxx %cxxflags %s %t.so -o %t -Wl,-q -nostdlib
 # RUN: link_fdata %s %t %t.pat PREAGGT1
 # RUN: link_fdata %s %t %t.pat2 PREAGGT2
+# RUN: link_fdata %s %t %t.patplt PREAGGPLT
 
 # RUN: llvm-strip --strip-unneeded %t -o %t.strip
 # RUN: llvm-objcopy --remove-section=.eh_frame %t.strip %t.noeh
@@ -23,6 +24,12 @@
 # RUN: llvm-bolt %t.strip --pa -p %t.pat2 -o %t.out \
 # RUN:   --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK3
 
+## Check pre-aggregated traces don't report zero-sized PLT fall-through as
+## invalid trace
+# RUN: llvm-bolt %t.strip --pa -p %t.patplt -o %t.out | FileCheck %s \
+# RUN:   --check-prefix=CHECK-PLT
+# CHECK-PLT: traces mismatching disassembled function contents: 0
+
   .globl foo
   .type foo, %function
 foo:
@@ -46,7 +53,10 @@ main:
 	movl	$0x0, -0x4(%rbp)
 	movl	%edi, -0x8(%rbp)
 	movq	%rsi, -0x10(%rbp)
+Ltmp0_br:
 	callq	puts@PLT
+## Check PLT traces are accepted
+# PREAGGPLT: T #Ltmp0_br# #puts@plt# #puts@plt# 3
 ## Target is an external-origin call continuation
 # PREAGGT1: T X:0 #Ltmp1# #Ltmp4_br# 2
 # CHECK:      callq puts@PLT
diff --git a/bolt/test/link_fdata.py b/bolt/test/link_fdata.py
index 028823a69ce00..4555e6b0479ae 100755
--- a/bolt/test/link_fdata.py
+++ b/bolt/test/link_fdata.py
@@ -85,7 +85,7 @@
 
 # Read nm output: <symbol value> <symbol type> <symbol name>
 nm_output = subprocess.run(
-    [args.nmtool, "--defined-only", args.objfile], text=True, capture_output=True
+    [args.nmtool, "--defined-only", "--synthetic", args.objfile], text=True, capture_output=True
 ).stdout
 # Populate symbol map
 symbols = {}

@github-actions
Copy link

github-actions bot commented Mar 3, 2025

✅ With the latest revision this PR passed the Python code formatter.

Created using spr 1.3.4
Created using spr 1.3.4
V-FEXrt and others added 2 commits April 11, 2025 17:56
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@aaupov aaupov changed the base branch from users/aaupov/spr/main.bolt-accept-plt-fall-throughs-as-valid-traces to main April 12, 2025 01:32
@aaupov aaupov merged commit fa4ac19 into main Apr 12, 2025
13 of 16 checks passed
@aaupov aaupov deleted the users/aaupov/spr/bolt-accept-plt-fall-throughs-as-valid-traces branch April 12, 2025 04:26
Copy link
Member

@paschalis-mpeis paschalis-mpeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey folks,

Whilecallcont-fallthru.s is an x86-specific test, it can be attempted on non-x86 targets that do build the x86 target. And currently our AArch64 BOLT buildbot fails (it's internal for now, but we are working on making it public eventually).

@aaupov, can we REQUIRE this test to run only on x86_64-linux?


It's actually an assertion on link_fdata.py that causes the test to fail:

AssertionError: ERROR: symbol puts@plt is not defined in binary

Interestingly, on some other AArch64 host I've checked the assertion is not raised and somehow the tests passes. Still, it is not meaningful for non-x86 targets.

@aaupov
Copy link
Contributor Author

aaupov commented Apr 14, 2025

@paschalis-mpeis thanks for a heads up, I'll fix the test

@aaupov
Copy link
Contributor Author

aaupov commented Apr 15, 2025

@paschalis-mpeis please check if #135867 fixes the issue on your end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants