[BOLT] [PowerPC] Port #140894
✅ With the latest revision this PR passed the C/C++ code formatter. |
That's exciting to see BOLT getting PowerPC support! Could you share more details about the project? Were you able to see real gains or is this still WIP? |
Thank you for the feedback! |
Hi @maksfb, I recently force-pushed and noticed that the [BOLT] label disappeared. Could it be re-applied manually? Thanks! |
@llvm/pr-subscribers-bolt
Author: Kostas (kostasalv)
Changes
Scope
This patch represents an initial portion of the ongoing work to port [BOLT] for [PowerPC] architecture.
What does this patch do?
This pull request ports the [BOLT] createPushRegisters method to [PowerPC] target. That helps [BOLT] generate [PowerPC] instructions to save registers on the stack.
Motivation
Port [BOLT] for [PowerPC] architecture.
Testing
Full diff: https://github.com/llvm/llvm-project/pull/140894.diff 6 Files Affected:
diff --git a/bolt/CMakeLists.txt b/bolt/CMakeLists.txt
index 52c796518ac05..c18216d760808 100644
--- a/bolt/CMakeLists.txt
+++ b/bolt/CMakeLists.txt
@@ -58,7 +58,7 @@ endif() # standalone
# Determine default set of targets to build -- the intersection of
# those BOLT supports and those LLVM is targeting.
-set(BOLT_TARGETS_TO_BUILD_all "AArch64;X86;RISCV")
+set(BOLT_TARGETS_TO_BUILD_all "AArch64;X86;RISCV;PowerPC")
set(BOLT_TARGETS_TO_BUILD_default)
foreach (tgt ${BOLT_TARGETS_TO_BUILD_all})
if (tgt IN_LIST LLVM_TARGETS_TO_BUILD)
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 132d58f3f9f79..fa043a49bb87e 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -2293,6 +2293,11 @@ MCPlusBuilder *createRISCVMCPlusBuilder(const MCInstrAnalysis *,
const MCRegisterInfo *,
const MCSubtargetInfo *);
+MCPlusBuilder *createPowerPCMCPlusBuilder(const MCInstrAnalysis *,
+ const MCInstrInfo *,
+ const MCRegisterInfo *,
+ const MCSubtargetInfo *);
+
} // namespace bolt
} // namespace llvm
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index ad062ea3622d1..dbd9ecbaf1c66 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -304,6 +304,11 @@ MCPlusBuilder *createMCPlusBuilder(const Triple::ArchType Arch,
return createRISCVMCPlusBuilder(Analysis, Info, RegInfo, STI);
#endif
+#ifdef POWERPC_AVAILABLE
+ if (Arch == Triple::ppc64 || Arch == Triple::ppc64le)
+ return createPowerPCMCPlusBuilder(Analysis, Info, RegInfo, STI);
+#endif
+
llvm_unreachable("architecture unsupported by MCPlusBuilder");
}
diff --git a/bolt/lib/Target/CMakeLists.txt b/bolt/lib/Target/CMakeLists.txt
index eae8ebdddbf3f..38d423ac9483c 100644
--- a/bolt/lib/Target/CMakeLists.txt
+++ b/bolt/lib/Target/CMakeLists.txt
@@ -1,3 +1,5 @@
foreach (tgt ${BOLT_TARGETS_TO_BUILD})
add_subdirectory(${tgt})
-endforeach()
+ string(TOUPPER ${tgt} TGT_UPPER)
+ add_definitions(-D${TGT_UPPER}_AVAILABLE)
+endforeach()
\ No newline at end of file
diff --git a/bolt/lib/Target/PowerPC/CMakeLists.txt b/bolt/lib/Target/PowerPC/CMakeLists.txt
new file mode 100644
index 0000000000000..c1d2a054396d7
--- /dev/null
+++ b/bolt/lib/Target/PowerPC/CMakeLists.txt
@@ -0,0 +1,29 @@
+add_llvm_library(LLVMBOLTTargetPowerPC
+ PPCMCPlusBuilder.cpp
+)
+
+target_include_directories(LLVMBOLTTargetPowerPC PRIVATE
+ ${LLVM_BINARY_DIR}/include
+ ${LLVM_SOURCE_DIR}/include
+)
+
+file(MAKE_DIRECTORY "${LLVM_BINARY_DIR}/include/llvm/Target/PowerPC")
+
+foreach(incfile IN ITEMS
+ PPCGenInstrInfo.inc
+ PPCGenRegisterInfo.inc
+)
+ add_custom_command(
+ OUTPUT "${LLVM_BINARY_DIR}/include/llvm/Target/PowerPC/${incfile}"
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different
+ "${LLVM_BINARY_DIR}/lib/Target/PowerPC/${incfile}"
+ "${LLVM_BINARY_DIR}/include/llvm/Target/PowerPC/${incfile}"
+ DEPENDS "${LLVM_BINARY_DIR}/lib/Target/PowerPC/${incfile}"
+ COMMENT "Copying ${incfile} to include directory"
+ )
+ add_custom_target(
+ "BoltCopy${incfile}" ALL
+ DEPENDS "${LLVM_BINARY_DIR}/include/llvm/Target/PowerPC/${incfile}"
+ )
+ add_dependencies(LLVMBOLTTargetPowerPC "BoltCopy${incfile}")
+endforeach()
\ No newline at end of file
diff --git a/bolt/lib/Target/PowerPC/PPCMCPlusBuilder.cpp b/bolt/lib/Target/PowerPC/PPCMCPlusBuilder.cpp
new file mode 100644
index 0000000000000..39d5ed2d3e36e
--- /dev/null
+++ b/bolt/lib/Target/PowerPC/PPCMCPlusBuilder.cpp
@@ -0,0 +1,54 @@
+//===- bolt/Target/PowerPC/PPCMCPlusBuilder.cpp -----------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file provides PowerPC-specific MCPlus builder.
+//
+//===----------------------------------------------------------------------===//
+
+#include "bolt/Core/MCPlusBuilder.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#define GET_INSTRINFO_ENUM
+#include "llvm/Target/PowerPC/PPCGenInstrInfo.inc"
+#define GET_REGINFO_ENUM
+#include "llvm/Target/PowerPC/PPCGenRegisterInfo.inc"
+
+namespace llvm {
+namespace bolt {
+
+class PPCMCPlusBuilder : public MCPlusBuilder {
+public:
+ using MCPlusBuilder::MCPlusBuilder;
+
+ // Create instructions to push two registers onto the stack
+ static void createPushRegisters(MCInst &Inst1, MCInst &Inst2, MCPhysReg Reg1,
+ MCPhysReg /*Reg2*/) {
+
+ Inst1.clear();
+ Inst1.setOpcode(PPC::STDU);
+ Inst1.addOperand(MCOperand::createReg(PPC::R1)); // destination (SP)
+ Inst1.addOperand(MCOperand::createReg(PPC::R1)); // base (SP)
+ Inst1.addOperand(MCOperand::createImm(-16)); // offset
+
+ Inst2.clear();
+ Inst2.setOpcode(PPC::STD);
+ Inst2.addOperand(MCOperand::createReg(Reg1)); // source register
+ Inst2.addOperand(MCOperand::createReg(PPC::R1)); // base (SP)
+ Inst2.addOperand(MCOperand::createImm(0)); // offset
+ }
+};
+
+MCPlusBuilder *createPowerPCMCPlusBuilder(const MCInstrAnalysis *Analysis,
+ const MCInstrInfo *Info,
+ const MCRegisterInfo *RegInfo,
+ const MCSubtargetInfo *STI) {
+ return new PPCMCPlusBuilder(Analysis, Info, RegInfo, STI);
+}
+
+} // namespace bolt
+} // namespace llvm
|
Gentle ping @aaupov @maksfb @rafaelauler @paschalis-mpeis @yota9 @ayermolo — when you have a moment, could you please take a look at this PR? |
Hello @kostasalv! Thank you for your work. From my side, though, I would say this port may be too basic: I would expect at least a hello-world test to be compiled and optimized before this PR is submitted. This is just my opinion, and it's up to @maksfb to decide in the end, but would it be possible to add support for a couple of relocations, etc., so that at least a couple of very basic C (or even asm) tests can run with BOLT? Thank you! |
Hi, @kostasalv, The PR serves well as a skeleton for the backend. For a minimum implementation, I agree with @yota9 that it's reasonable to expect it to be able to process a simple program such as "Hello world". I suggest you add a simple symbolic disassembler for PowerPC similar to Regarding the current PR, what was the reason for copying |
Thank you very much for your guidance and for the review comments. I also plan to add a basic symbolic disassembler for PowerPC, based on X86MCSymbolizer, and start supporting some PowerPC-specific relocations as suggested. At this time, I don’t have any additional patches on top of this one, but I will be sure to share any updates as they become available. Thank you as well for the suggestion regarding PPCMCTargetDesc.h. I will update the code to include this header and remove the copying of *.inc files in the CMakeLists. |
@kostasalv, sounds good. You can take a look at other targets (x86, aarch64, RISC-V), they follow pretty much the same pattern in respective For the "Hello world" test, you can start with building CFG and verifying a couple of symbolic reference resolutions. x86-64 and AArch64 have their versions of symbolic disassembler ( |
Introduce a PowerPC64 implementation of PLT section disassembly in RewriteInstance. This enables BOLT to properly identify and skip PLT stubs on PPC64, mirroring existing handling for the other targets. With this change, BOLT no longer attempts to treat PPC64 PLT stubs as regular functions, avoiding crashes and mis-optimisation. This lays the groundwork for running BOLT on dynamically linked PPC64 binaries that depend on libc calls (e.g. 'printf').
BOLT relies on treating tail calls as calls to build correct CFGs and preserve interprocedural analysis. On PowerPC, the compiler often emits plain branches (b, ba, bctr) instead of explicit calls when lowering tail calls (tail call optimisation, TCO). This patch overrides convertJmpToTailCall in PPCMCPlusBuilder so that unconditional branches are converted into their link-bit variants (BL, BLA, BCTRL). This way, BOLT can consistently recognise them as calls/tail calls during disassembly and optimisation, including when skipping PLT (Procedure Linkage Table) stubs.
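The branch-to-link-variant mapping described in this commit can be sketched in isolation. This is a minimal illustration, not the code from the patch: the `Opcode` enum is a hypothetical stand-in for the `PPC::` opcode enumerators, and `toLinkVariant` is an illustrative name.

```cpp
// Hypothetical stand-in for the PPC:: opcode enumerators (assumption).
enum class Opcode { B, BA, BCTR, BL, BLA, BCTRL, Other };

// Map an unconditional branch to its link-bit (LK=1) variant.
// Opcodes without a link variant in this sketch are returned unchanged.
constexpr Opcode toLinkVariant(Opcode Opc) {
  switch (Opc) {
  case Opcode::B:
    return Opcode::BL;
  case Opcode::BA:
    return Opcode::BLA;
  case Opcode::BCTR:
    return Opcode::BCTRL;
  default:
    return Opc;
  }
}
```

In this sketch the conversion is idempotent: applying it to an opcode that is already a link variant leaves it untouched.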
…ion. This patch introduces initial PowerPC relocation handling in createRelocation() by mapping common fixup kinds to ELF relocation types:
* 24-bit PC-relative branches (B/BL) -> R_PPC64_REL24
* 14-bit conditional branches (BC/BCL) -> R_PPC64_REL14
Unrecognised fixup kinds return std::nullopt. Additionally, getPCRelOperandNum() is updated to correctly identify the operand index of branch targets.
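A self-contained sketch of the fixup-to-relocation mapping might look as follows (C++17). The `FixupKind` enum and the function name `relocTypeFor` are hypothetical placeholders for the LLVM fixup enumerators used by the patch; the numeric values are the `R_PPC64_REL24` and `R_PPC64_REL14` relocation type numbers from the 64-bit PowerPC ELF ABI.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical fixup kinds; the real patch uses LLVM's PPC fixup enumerators.
enum class FixupKind { Branch24, Branch14, Unknown };

// Relocation type numbers from the 64-bit PowerPC ELF ABI.
constexpr std::uint32_t R_PPC64_REL24 = 10;
constexpr std::uint32_t R_PPC64_REL14 = 11;

// Map a fixup kind to an ELF relocation type, or std::nullopt if the
// fixup kind is not recognised.
constexpr std::optional<std::uint32_t> relocTypeFor(FixupKind Kind) {
  switch (Kind) {
  case FixupKind::Branch24: // B/BL: 24-bit PC-relative branch
    return R_PPC64_REL24;
  case FixupKind::Branch14: // BC/BCL: 14-bit conditional branch
    return R_PPC64_REL14;
  default:
    return std::nullopt;
  }
}
```

Returning an empty optional rather than asserting lets a caller decide how to treat fixups that are not yet supported.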
This patch adds isIndirectBranch() implementation so that disassemble() can correctly distinguish between direct and indirect branches. This prevents replaceBranchTarget() from being called on instructions without PC-relative operands and avoids assertion failures.
…ranch target isn't at a fixed operand index. Make getPCRelOperandNum() search for the last MCOperand that is an expression instead of using hard-coded indices. Enhance createRelocation() to handle TOC relocations (not just REL24/REL14). This fixes the JITLink assert "NOP should be placed here for restoring r2": with mappings only for REL24/REL14, JITLink couldn't see TOC-related edges and couldn't patch sequences correctly.
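The "last expression operand" search can be illustrated without LLVM headers. In this sketch, `OperandKind` is a hypothetical stand-in for what `MCOperand::isReg()`, `isImm()` and `isExpr()` report, and `lastExprOperandIndex` is an illustrative name rather than the function in the patch.

```cpp
// Hypothetical operand kinds standing in for MCOperand's isReg()/isImm()/isExpr().
enum class OperandKind { Reg, Imm, Expr };

// Return the index of the last operand that is an expression,
// or -1 if the instruction has no expression operand.
constexpr int lastExprOperandIndex(const OperandKind *Ops, int NumOps) {
  for (int I = NumOps - 1; I >= 0; --I)
    if (Ops[I] == OperandKind::Expr)
      return I;
  return -1;
}
```

With an operand list shaped like a conditional branch (two non-expression operands followed by the target), the search yields index 2; for a single-operand branch it yields 0; and an instruction with no symbolic target yields -1.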
…restore. Detect calls without a following NOP and inject one during emission to prevent ppc64 JITLink assertions and preserve TOC (r2) handling.
… the emitter can enter the NOP-insertion block.
For other targets, evaluateBranch can index the MC operand info and compute the target reliably. On PPC64, however, a bl often arrives without an explicit MC operand when disassembled from an object file, so evaluateBranch() is called on an instruction shape it does not handle, leading to an assertion failure. In relocation mode, BOLT already resolves call targets via relocations, so calling `evaluateBranch` for PPC calls is unnecessary. Branches are still handled normally. This change skips evaluateBranch for PPC calls while preserving branch evaluation for conditional/unconditional jumps.
…tection Provide a PPC implementation of MCPlusBuilder::evaluateMemOperandTarget() that returns false (PPC64 with -mno-pcrel has no PC-relative memory operands). This avoids hitting the base-class llvm_unreachable during disassembly. Also stop reporting BA/BLA (absolute) as having a PC-relative operand. Keep PC-rel handling only for BL/B/BC where AA=0.
MCPlusBuilder::evaluateMemOperandTarget(...) BinaryFunction::handlePCRelOperand(...) BinaryFunction::disassemble() BOLT was crashing with: FATAL BOLT-ERROR: PC-relative operand can't be evaluated because `PPCMCPlusBuilder::hasPCRelOperand()` returned `true` for BL/BC, causing BOLT to try to evaluate the target directly from the MCInst. On PowerPC ELFv2, branch/call targets are typically resolved via relocations (e.g. R_PPC64_REL24) rather than embedded PC-rel immediates, so evaluation fails. This patch makes `hasPCRelOperand()` return `false` for PPC, and guards `replaceBranchTarget()` accordingly.
…nsert the NOP and avoid crashing llvm_unreachable.
This reverts commit 97c760b.
PowerPC branch instructions encode their displacement in different operand positions than other targets. The previous implementation incorrectly assumed operand #0 for all branches. This patch:
* Corrects getPCRelOperandNum() to return the proper operand index (e.g. operand #2 for 'bc').
* Updates hasPCRelOperand() and evaluateBranch() to use the correct index.
* Clarifies relative vs absolute (AA=0 vs AA=1) branches/calls.
This fixes branch target evaluation and rewriting on PPC64 ELFv2 when BOLT analyses control flow.
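The relative/absolute distinction can be shown on the raw I-form encoding (used by b, ba, bl, bla), where the AA bit selects between a PC-relative and an absolute target. This is only a sketch of the arithmetic: the patch works on MCInst operands rather than raw instruction words, and the function names here are hypothetical.

```cpp
#include <cstdint>

// Sign-extend a 26-bit value (LI field with the two low zero bits appended).
constexpr std::int64_t signExtend26(std::uint32_t V) {
  return (V & 0x02000000u) ? static_cast<std::int64_t>(V) - 0x04000000
                           : static_cast<std::int64_t>(V);
}

// Compute the target of an I-form branch. Precondition (not checked here):
// Insn has primary opcode 18.
constexpr std::uint64_t iFormBranchTarget(std::uint32_t Insn, std::uint64_t PC) {
  const std::int64_t Disp = signExtend26(Insn & 0x03FFFFFCu);
  const bool Absolute = (Insn & 0x2u) != 0; // AA bit
  return Absolute ? static_cast<std::uint64_t>(Disp)
                  : PC + static_cast<std::uint64_t>(Disp);
}
```

For example, the word 0x48000101 (a bl with displacement 0x100) at address 0x1000 targets 0x1100, while 0x48000102 (a ba to 0x100) targets 0x100 regardless of where the instruction sits.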
On PPC64 ELFv2, the instruction slot immediately after a 'bl' is reserved for either a NOP or a TOC restore ('ld r2, 24(r1)'). I noticed that my latest code was inserting a NOP in place of the TOC restore, which led to incorrect execution because the restored TOC value was lost. I implemented a new method, 'isTOCRestoreAfterCall()', to check whether a TOC restore follows the 'bl' and, if so, to skip adding the NOP.
Remove my marking code in the disassemble() method. More specifically, remove the fallback call to disassembleInstructionAtOffset(NextOff). I 'peeked' at the instruction following the bl call to see whether it was a NOP/TOC restore, but this inserted an additional ld 2, 24(1) into the internal structure.
…changed for PLT TOC restore. On PPC64 ELFv2, for PLT calls the linker may restore the TOC with 'ld 2, 24(1)' immediately after the 'bl'. I had previously injected a stack-pointer ('r1') adjustment (via createPushRegisters) between the 'bl' and the restore, which could cause a crash (segmentation fault). createPushRegisters now returns two NOPs ('ori r0, r0, 0') instead of touching r1, which still satisfies the generic two-instruction "save" contract expected by callers.
… to detect correctly TOC-restore. Fix: expect IMM=24 in slot 1 and the BASE register in slot 2. They were flipped before.
Normalise the PPC64 post-call slot so LLVM JITLink can apply the TOC restore. PPC64 JITLink expects to see a NOP after 'bl main'. The static linker had already inserted an ld after the bl call, which triggered an assertion in PPC64 JITLink ("NOP should be placed here for restoring r2"). Fix: if the instruction immediately after a call is the canonical TOC restore ('ld r2, 24(r1)'), emit a NOP instead.
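The normalisation rule can be sketched on raw 32-bit instruction words. This is an illustration under stated assumptions rather than the patch's implementation, which operates on MCInst: 0xE8410018 is the encoding of 'ld r2, 24(r1)', 0x60000000 is the encoding of 'ori r0, r0, 0' (the canonical NOP), and the function names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t TOCRestoreInsn = 0xE8410018u; // ld r2, 24(r1)
constexpr std::uint32_t NopInsn = 0x60000000u;        // ori r0, r0, 0

// True if the word is the canonical ELFv2 TOC restore.
constexpr bool isTOCRestore(std::uint32_t Insn) {
  return Insn == TOCRestoreInsn;
}

// Given the instruction word in the slot right after a call, return what
// should be emitted there: a NOP if the slot holds the canonical TOC
// restore, otherwise the original word unchanged.
constexpr std::uint32_t normalizePostCallSlot(std::uint32_t Insn) {
  return isTOCRestore(Insn) ? NopInsn : Insn;
}
```

Any other instruction in the post-call slot, such as 'addi r1, r1, 16' (0x38210010), passes through unmodified.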
…DS/LO_DS". Add support for R_PPC64_TOC16_DS and R_PPC64_TOC16_LO_DS.
According to the ELFv2 ABI, the TOC base register (r2) points to the address .got + 0x8000. This patch computes the correct TOC base when processing PPC64 binaries and uses it to anchor TOC-relative relocations. This resolves warnings such as: "ignoring symbol .TOC. at 0x10027f00, which lies outside .got".
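The TOC base arithmetic is small enough to sketch directly. The helper names below are hypothetical, and the .got address in the usage note is derived from the warning above under the stated ".got + 0x8000" rule, not taken from the PR.

```cpp
#include <cstdint>

// Bias between the start of .got and the TOC base, as described above.
constexpr std::uint64_t TOCBaseBias = 0x8000;

// Compute the TOC base (the value expected in r2) from the .got address.
constexpr std::uint64_t tocBase(std::uint64_t GotAddr) {
  return GotAddr + TOCBaseBias;
}

// True if Target is reachable from the TOC base with a signed 16-bit offset.
constexpr bool fitsSigned16FromTOC(std::uint64_t TOC, std::uint64_t Target) {
  const std::int64_t Off = static_cast<std::int64_t>(Target - TOC);
  return Off >= -0x8000 && Off <= 0x7FFF;
}
```

Under this rule, a .TOC. symbol at 0x10027f00 would correspond to a .got starting at 0x1001ff00, which is why the symbol's address falls beyond the start of the section. The bias also makes the signed 16-bit offset range begin exactly at the start of .got.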
… *' has incompatible initializer of type 'ErrorOr<BinarySection &>", unwrap ErrorOr<BinarySection&> before using it.
Scope
This patch represents an initial portion of the ongoing work to port [BOLT] for [PowerPC] architecture.
Additional changes will follow to complete the port.
What does this patch do?
This pull request ports the [BOLT] createPushRegisters method to [PowerPC] target. That helps [BOLT] generate
[PowerPC] instructions to save registers on the stack.
Motivation
Port [BOLT] for [PowerPC] architecture.
Testing