Open
Labels: backend:RISC-V, enhancement, llvm:codegen, llvm:codesize, metaissue
Description
The Machine Outliner can yield significant code-size savings, even on 32-bit embedded targets; see https://www.linaro.org/blog/reducing-code-size-with-llvm-machine-outliner-on-32-bit-arm-targets/
Both the Arm and AArch64 backends support a variety of outlining strategies:
llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
Lines 8099 to 8178 in 89c95ef
/// Constants defining how certain sequences should be outlined.
/// This encompasses how an outlined function should be called, and what kind of
/// frame should be emitted for that outlined function.
///
/// \p MachineOutlinerDefault implies that the function should be called with
/// a save and restore of LR to the stack.
///
/// That is,
///
/// I1     Save LR                    OUTLINED_FUNCTION:
/// I2 --> BL OUTLINED_FUNCTION       I1
/// I3     Restore LR                 I2
///                                   I3
///                                   RET
///
/// * Call construction overhead: 3 (save + BL + restore)
/// * Frame construction overhead: 1 (ret)
/// * Requires stack fixups? Yes
///
/// \p MachineOutlinerTailCall implies that the function is being created from
/// a sequence of instructions ending in a return.
///
/// That is,
///
/// I1                                OUTLINED_FUNCTION:
/// I2 --> B OUTLINED_FUNCTION        I1
/// RET                               I2
///                                   RET
///
/// * Call construction overhead: 1 (B)
/// * Frame construction overhead: 0 (Return included in sequence)
/// * Requires stack fixups? No
///
/// \p MachineOutlinerNoLRSave implies that the function should be called using
/// a BL instruction, but doesn't require LR to be saved and restored. This
/// happens when LR is known to be dead.
///
/// That is,
///
/// I1                                OUTLINED_FUNCTION:
/// I2 --> BL OUTLINED_FUNCTION       I1
/// I3                                I2
///                                   I3
///                                   RET
///
/// * Call construction overhead: 1 (BL)
/// * Frame construction overhead: 1 (RET)
/// * Requires stack fixups? No
///
/// \p MachineOutlinerThunk implies that the function is being created from
/// a sequence of instructions ending in a call. The outlined function is
/// called with a BL instruction, and the outlined function tail-calls the
/// original call destination.
///
/// That is,
///
/// I1                                OUTLINED_FUNCTION:
/// I2 --> BL OUTLINED_FUNCTION       I1
/// BL f                              I2
///                                   B f
/// * Call construction overhead: 1 (BL)
/// * Frame construction overhead: 0
/// * Requires stack fixups? No
///
/// \p MachineOutlinerRegSave implies that the function should be called with a
/// save and restore of LR to an available register. This allows us to avoid
/// stack fixups. Note that this outlining variant is compatible with the
/// NoLRSave case.
///
/// That is,
///
/// I1     Save LR                    OUTLINED_FUNCTION:
/// I2 --> BL OUTLINED_FUNCTION       I1
/// I3     Restore LR                 I2
///                                   I3
///                                   RET
///
/// * Call construction overhead: 3 (save + BL + restore)
/// * Frame construction overhead: 1 (ret)
/// * Requires stack fixups? No
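To make the comparison concrete, the overheads listed in the AArch64 comment can be tabulated in a small stand-alone model. This is only a sketch: `Strategy`, `StrategyCost`, `costOf`, `addedInsts` and `addedBytesAArch64` are hypothetical names invented for illustration and are not part of LLVM. The only facts used are the instruction counts from the comment and the fixed 4-byte AArch64 instruction size.

```cpp
#include <cassert>

// Hypothetical stand-alone model of the five strategies documented above.
// These types are illustrative only and are not LLVM's data structures.
enum class Strategy { Default, TailCall, NoLRSave, Thunk, RegSave };

struct StrategyCost {
  unsigned CallInsts;    // instructions added at every call site
  unsigned FrameInsts;   // instructions added to the outlined function
  bool NeedsStackFixups; // "Requires stack fixups?" from the comment
};

// Overheads copied from the AArch64 comment block.
constexpr StrategyCost costOf(Strategy S) {
  switch (S) {
  case Strategy::Default:  return {3, 1, true};
  case Strategy::TailCall: return {1, 0, false};
  case Strategy::NoLRSave: return {1, 1, false};
  case Strategy::Thunk:    return {1, 0, false};
  case Strategy::RegSave:  return {3, 1, false};
  }
  return {0, 0, false}; // not reached for valid enumerators
}

// Instructions added by outlining a sequence that occurs `Occurrences` times.
inline unsigned addedInsts(Strategy S, unsigned Occurrences) {
  assert(Occurrences >= 2 && "outlining needs at least two occurrences");
  const StrategyCost C = costOf(S);
  return C.CallInsts * Occurrences + C.FrameInsts;
}

// Same overhead in bytes, using the fixed 4-byte AArch64 instruction size.
inline unsigned addedBytesAArch64(Strategy S, unsigned Occurrences) {
  return 4 * addedInsts(S, Occurrences);
}
```

Under this model, a sequence outlined from two sites costs 7 extra instructions with the default strategy but only 2 as a tail call, which is why the cheaper strategies matter for short sequences.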
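While the RISC-V backend has some outlining support, it is limited compared with the Arm and AArch64 backends: it implements only the MachineOutlinerDefault strategy and drops any candidate for which t0 (X5) is not available; see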
llvm-project/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
Lines 2533 to 2570 in 89c95ef
std::optional<outliner::OutlinedFunction>
RISCVInstrInfo::getOutliningCandidateInfo(
    std::vector<outliner::Candidate> &RepeatedSequenceLocs) const {
  // First we need to filter out candidates where the X5 register (IE t0) can't
  // be used to setup the function call.
  auto CannotInsertCall = [](outliner::Candidate &C) {
    const TargetRegisterInfo *TRI = C.getMF()->getSubtarget().getRegisterInfo();
    return !C.isAvailableAcrossAndOutOfSeq(RISCV::X5, *TRI);
  };
  llvm::erase_if(RepeatedSequenceLocs, CannotInsertCall);
  // If the sequence doesn't have enough candidates left, then we're done.
  if (RepeatedSequenceLocs.size() < 2)
    return std::nullopt;
  unsigned SequenceSize = 0;
  for (auto &MI : RepeatedSequenceLocs[0])
    SequenceSize += getInstSizeInBytes(MI);
  // call t0, function = 8 bytes.
  unsigned CallOverhead = 8;
  for (auto &C : RepeatedSequenceLocs)
    C.setCallInfo(MachineOutlinerDefault, CallOverhead);
  // jr t0 = 4 bytes, 2 bytes if compressed instructions are enabled.
  unsigned FrameOverhead = 4;
  if (RepeatedSequenceLocs[0]
          .getMF()
          ->getSubtarget<RISCVSubtarget>()
          .hasStdExtCOrZca())
    FrameOverhead = 2;
  return outliner::OutlinedFunction(RepeatedSequenceLocs, SequenceSize,
                                    FrameOverhead, MachineOutlinerDefault);
}
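The fixed 8-byte call overhead sets a fairly high break-even point, which a rough calculation can illustrate. The sketch below is not LLVM code: `riscvOutliningBenefit` is a hypothetical helper, the byte costs are taken from the snippet above, and the benefit formula (bytes without outlining minus bytes with outlining, clamped at zero) is my assumption about how the generic outliner weighs a candidate.

```cpp
#include <cassert>

// Hypothetical stand-alone helper. The byte costs come from the RISC-V
// snippet above; the benefit formula is an assumption about the generic
// outliner's cost model, not a copy of it.
inline unsigned riscvOutliningBenefit(unsigned SequenceSize,
                                      unsigned Occurrences, bool HasCOrZca) {
  assert(SequenceSize % 2 == 0 && "RISC-V code size is a multiple of 2 bytes");
  // Mirrors the early exit above: fewer than two candidates means no outlining.
  if (Occurrences < 2)
    return 0;
  const unsigned CallOverhead = 8;                  // call t0, function
  const unsigned FrameOverhead = HasCOrZca ? 2 : 4; // jr t0
  const unsigned NotOutlinedCost = Occurrences * SequenceSize;
  const unsigned OutlinedCost =
      Occurrences * CallOverhead + SequenceSize + FrameOverhead;
  return NotOutlinedCost < OutlinedCost ? 0 : NotOutlinedCost - OutlinedCost;
}
```

Under these assumptions, a 16-byte sequence repeated three times saves only 4 bytes (6 with compressed instructions), and a 12-byte sequence repeated twice saves nothing. A cheaper call form, such as a tail call or a call that does not need to preserve the return address, would lower that threshold.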
We should improve the RISC-V Machine Outliner so that it is on par with the Arm and AArch64 ones.
Off the top of my head, we should take the following steps:
- Perform a gap analysis between the Arm/AArch64 and RISC-V machine outliners
- Discuss the findings with the RISC-V backend maintainers: @topperc @preames @MaskRay @asb @jrtc27
- Create a design and implementation plan to improve support
- Implement the improved support upstream
CC: @petrhosek