Merged
Changes from 1 commit
Commits
26 commits
ce56f84
pre-commit test
yafet-a Aug 21, 2025
1c27d89
[BOLT] documentation
yafet-a Aug 21, 2025
db353b7
[BOLT][AArch64] Implement safe size-aware memcpy inlining
yafet-a Aug 21, 2025
2e5b22b
test target fix for CI cross-compilation issue
yafet-a Aug 22, 2025
385fa23
moved inline-memcpy to avoid CI cross-compilation PIE conflicts
yafet-a Aug 22, 2025
4f9ef67
removed old test
yafet-a Aug 22, 2025
e83126e
response to review
yafet-a Aug 22, 2025
cf8279a
Update conditional formatting and move check for size into binaryPasses
yafet-a Aug 27, 2025
c317eb0
Negative Tests (live-in, register move, non-mov instruction)
yafet-a Aug 27, 2025
df97d61
memcpy8 redundant handling removed
yafet-a Aug 27, 2025
25cfb58
nit: comment clean up
yafet-a Aug 27, 2025
e308855
minor refactor
yafet-a Aug 28, 2025
365a0bf
NFC: Post-review refactor
yafet-a Aug 28, 2025
84c904a
NFC: Test for corner case with size 0
yafet-a Aug 28, 2025
0561bcc
Use temp instead of argument registers
yafet-a Aug 28, 2025
cc49db7
Update early return
yafet-a Aug 28, 2025
115606b
Update tests to be more specific about registers + negative test on e…
yafet-a Aug 28, 2025
1986bfa
Complex test + register aliasing
yafet-a Aug 29, 2025
bd990ea
NFC use if initializer
yafet-a Sep 1, 2025
ee5f859
[style] trailing whitespaces removed
yafet-a Sep 4, 2025
ad503a7
[test] CHECK-NEXT used
yafet-a Sep 4, 2025
267432a
[test] updated negative test to check for negative size
yafet-a Sep 4, 2025
198744d
[nfc] minor refactor
yafet-a Sep 4, 2025
62b871e
[bug] memcpy call removed for sizes>64
yafet-a Sep 4, 2025
dcab6ac
[nfc][test] reordered test
yafet-a Sep 5, 2025
875156e
[nfc] added assert for default case (future-proofing for changes to B…
yafet-a Sep 5, 2025
2 changes: 1 addition & 1 deletion bolt/lib/Passes/BinaryPasses.cpp
@@ -1871,7 +1871,7 @@ Error InlineMemcpy::runOnFunctions(BinaryContext &BC) {
         std::optional<uint64_t> KnownSize =
             BC.MIB->findMemcpySizeInBytes(BB, II);

-        if (BC.isAArch64() && !KnownSize.has_value())
+        if (BC.isAArch64() && (!KnownSize.has_value() || *KnownSize > 64))
           continue;

         const InstructionListType NewCode =
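The effect of the new condition can be sketched in isolation. The following is a minimal illustration, not BOLT code: `shouldSkipInlining` and `MaxInlineSize` are hypothetical names, and only the 64-byte threshold and the shape of the condition are taken from the hunk above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical constant: the 64-byte limit is the literal used in the diff.
constexpr uint64_t MaxInlineSize = 64;

// Hypothetical helper mirroring the updated condition: on AArch64 the call
// site is skipped (the original memcpy call is left in place) when the size
// is unknown or larger than 64 bytes.
bool shouldSkipInlining(bool IsAArch64, std::optional<uint64_t> KnownSize) {
  return IsAArch64 && (!KnownSize.has_value() || *KnownSize > MaxInlineSize);
}

// Small usage example covering the cases the condition distinguishes.
void exampleUsage() {
  assert(shouldSkipInlining(true, std::nullopt));    // unknown size: skip
  assert(!shouldSkipInlining(true, uint64_t{64}));   // at the limit: inline
  assert(shouldSkipInlining(true, uint64_t{128}));   // too large: skip
}
```

With the size check done here, an oversized copy never reaches the code generator, so the call is preserved rather than replaced by an empty instruction sequence.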
3 changes: 1 addition & 2 deletions bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -2706,8 +2706,7 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
             Remaining -= OpSize;
             Offset += OpSize;
           }
-      } else
-        Code.clear();
+      }
       break;
     }
     return Code;
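The `Remaining`/`Offset` bookkeeping visible in this hunk suggests a loop that covers the copy with progressively smaller operations. A minimal sketch of that idea follows, under stated assumptions: `planCopy` is a hypothetical name, and the operation widths (16, 8, 4, 2, 1 bytes) are assumed for illustration and are not shown in the hunk. It models only the offset arithmetic, not instruction emission.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns (offset, width) pairs that cover Size bytes,
// always using the widest assumed operation that still fits.
std::vector<std::pair<uint64_t, uint64_t>> planCopy(uint64_t Size) {
  // Assumed operation widths in bytes, widest first.
  constexpr uint64_t OpSizes[] = {16, 8, 4, 2, 1};

  std::vector<std::pair<uint64_t, uint64_t>> Plan;
  uint64_t Remaining = Size;
  uint64_t Offset = 0;
  for (uint64_t OpSize : OpSizes) {
    while (Remaining >= OpSize) {
      Plan.emplace_back(Offset, OpSize);
      // Same bookkeeping as in the hunk above.
      Remaining -= OpSize;
      Offset += OpSize;
    }
  }
  assert(Remaining == 0 && "every byte must be covered");
  return Plan;
}
```

Under these assumptions a 37-byte copy decomposes into 16 bytes at offset 0, 16 at offset 16, 4 at offset 32, and 1 at offset 36. A size of 0 yields an empty plan.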
9 changes: 4 additions & 5 deletions bolt/test/runtime/AArch64/inline-memcpy.s
@@ -7,8 +7,8 @@
 # RUN: llvm-bolt %t.exe --inline-memcpy -o %t.bolt 2>&1 | FileCheck %s --check-prefix=CHECK-INLINE
 # RUN: llvm-objdump -d %t.bolt | FileCheck %s --check-prefix=CHECK-ASM

-# Verify BOLT reports that it inlined memcpy calls (12 successful inlines out of 16 total calls)
-# CHECK-INLINE: BOLT-INFO: inlined 12 memcpy() calls
+# Verify BOLT reports that it inlined memcpy calls (11 successful inlines out of 16 total calls)
+# CHECK-INLINE: BOLT-INFO: inlined 11 memcpy() calls

 # Each function should use optimal size-specific instructions and NO memcpy calls

@@ -68,10 +68,9 @@
 # CHECK-ASM-NOT: str
 # CHECK-ASM-NOT: bl{{.*}}<memcpy

-# 128-byte copy should be "inlined" by removing the call entirely (too large for real inlining)
+# 128-byte copy should NOT be inlined (too large, original call preserved)
 # CHECK-ASM-LABEL: <test_128_byte_too_large>:
-# CHECK-ASM-NOT: bl{{.*}}<memcpy
-# CHECK-ASM-NOT: ldr{{.*}}q{{[0-9]+}}
+# CHECK-ASM: bl{{.*}}<memcpy

 # ADD immediate with non-zero source should NOT be inlined (can't track mov+add chain)
 # CHECK-ASM-LABEL: <test_4_byte_add_immediate>: