Description
This issue tracks remaining work to remove any redundant or suboptimally canonicalised instructions found in the output of llvm-test-suite (including SPEC). None are known to be particularly high in dynamic instruction count, but as they can be removed/improved with relatively little effort and the generated code is unambiguously better, it seems worth stepping through. This scrappy script can be used for searching binaries in a directory (there may be false positives). After spotting a few of these issues by eye, I thought it was probably worth being a bit more thorough (and additionally, some of them are helpful for anyone looking to minimise code size).
There are ~8k static instances in llvm-test-suite (note: hasn't been carefully checked for false positives).
Redundant operations
- No-op moves (`mv`) left after MachineCopyPropagation
- No-op moves in the form of xor/or/sub/sh*add[.uw]
  - xor/or/sub addressed in [RISCV] Add OR/XOR/SUB to RISCVInstrInfo::isCopyInstrImpl #132002, sh*add addressed in [RISCV] Add shift-add (SH1ADD, ...) to isCopyInstrImpl #133443
- (TODO: link to strand of work from Mikhail, Philip and others on branches)
- Always-false or always-true branches, e.g. `bltu zero, t1, ...` or `bgeu zero, a2, ...`
- `j` to the next instruction (within the same function)
- `j` to the next instruction (falls through to the first instruction of the next function)
  - Deleting the jump would be correct, but may be more hassle with the linker than desirable for minimal gain.
- No-op `mv` (i.e. with both operands equal)
  - Remaining cases may be due to lui/addi pairs where the addi immediate is resolved to 0 by the linker, but the addi isn't removed.
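As a rough illustration of the no-op move forms listed above, here is a minimal Python sketch that classifies a single already-split instruction. It is not the script referenced in the description: the function name `is_noop_move` and the input shape (a mnemonic plus a list of operand strings) are assumptions made for this example, and it only covers the `mv`, xor/or/sub and sh*add[.uw] patterns.

```python
import re

# Both spellings of the hard-wired zero register.
ZERO = {"zero", "x0"}
# Matches sh1add/sh2add/sh3add and their .uw variants.
SHADD = re.compile(r"sh[123]add(\.uw)?$")


def is_noop_move(mnemonic, operands):
    """Return True if the instruction just rewrites rd with its own value.

    Hypothetical helper for illustration; operands are register names
    as printed by a disassembler, e.g. ["a0", "a0", "zero"].
    """
    ops = [o.strip() for o in operands]
    if mnemonic == "mv" and len(ops) == 2:
        # mv rd, rd
        return ops[0] == ops[1]
    if len(ops) != 3:
        return False
    rd, rs1, rs2 = ops
    if mnemonic in ("or", "xor"):
        # Commutative: either source may be the zero register.
        return (rs1 == rd and rs2 in ZERO) or (rs2 == rd and rs1 in ZERO)
    if mnemonic == "sub":
        # rd = rd - 0
        return rs1 == rd and rs2 in ZERO
    if SHADD.match(mnemonic):
        # rd = (rs1 << n) + rs2, so rs1 must be zero and rs2 must be rd.
        return rs1 in ZERO and rs2 == rd
    return False
```

A caller would still need to split disassembler output into mnemonic and operands, and would likely see false positives and negatives depending on how aliases are printed.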
Suboptimally canonicalised operations
- Reg-reg moves encoded as e.g. `sh1add` with `rs1=zero`.
  - Produced after MachineCopyPropagation. Not compressible, while the plain `mv` is.
  - Likely best solved by late stage canonicalisation. This could only be done when compression is enabled, but given it's never worse than neutral, prefer to keep codegen the same for compressed vs non-compressed.
- `or rd, zero, zero`
  - Not compressible, want to use `c.li`. Also, a number of instances of this in 502.gcc_r immediately followed by `bnez` on the loaded value.
- `andi` with `rs1==zero`, `zext.w` of `zero`
- `beq zero, rs2, ...` and `bne zero, rs2, ...`
  - Should compress to `c.beqz`/`c.bnez` and print an appropriate alias
- `seqz`/`snez` with `zero` operand
- `sll`/`srl`/... with `zero` operand for `rs`
- `addw rd, zero, rt`
  - Should be `addiw` for better compressibility
- Other minor variants of the above
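To make a few of the rewrites above concrete, here is a minimal Python sketch that maps an instruction to a preferred spelling. The function name `suggest_canonical` and the exact replacement strings are assumptions for illustration, based on my reading of the list. It covers only the sh*add, `or rd, zero, zero`, `beq`/`bne` and `addw` cases, and it is not how the fix would be implemented inside LLVM.

```python
# Both spellings of the hard-wired zero register.
ZERO = {"zero", "x0"}
SHADD = ("sh1add", "sh2add", "sh3add",
         "sh1add.uw", "sh2add.uw", "sh3add.uw")


def suggest_canonical(mnemonic, operands):
    """Return a suggested replacement string, or None if nothing applies.

    Hypothetical helper; handles three-operand forms only.
    """
    ops = [o.strip() for o in operands]
    if len(ops) != 3:
        return None
    a, b, c = ops
    if mnemonic in SHADD and b in ZERO:
        # rd = (zero << n) + rs2 is a plain register move.
        return f"mv {a}, {c}"
    if mnemonic == "or" and b in ZERO and c in ZERO:
        # Loads the constant zero; li can be compressed to c.li.
        return f"li {a}, 0"
    if mnemonic in ("beq", "bne") and a in ZERO and b not in ZERO:
        # Equality is symmetric, so the zero operand can move to rs2.
        alias = "beqz" if mnemonic == "beq" else "bnez"
        return f"{alias} {b}, {c}"
    if mnemonic == "addw" and b in ZERO:
        # Sign-extending add of zero equals addiw with immediate 0.
        return f"addiw {a}, {c}, 0"
    return None
```

For example, `suggest_canonical("beq", ["zero", "a2", ".LBB0_3"])` yields `beqz a2, .LBB0_3`, where `.LBB0_3` is just a placeholder label.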