[RISCV] Add tune info for postra scheduling direction #115864
Conversation
@llvm/pr-subscribers-tablegen @llvm/pr-subscribers-llvm-globalisel

Author: Pengcheng Wang (wangpc-pp)

Changes: This helps improve the scheduling result (more bubbles can be filled). There are two commits in this PR: […]

Patch is 15.68 MiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/115864.diff

1029 Files Affected:
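Background for readers outside the scheduler: the post-RA machine scheduler consults the subtarget's `overridePostRASchedPolicy` hook for each scheduling region, and the direction is encoded by two booleans on `llvm::MachineSchedPolicy`. Below is a minimal sketch of that mapping; `OnlyTopDown`/`OnlyBottomUp` are the real policy fields, while the enum and helper are illustrative and not part of this patch.

```cpp
// Sketch: how MachineSchedPolicy encodes the scheduling direction.
// OnlyTopDown/OnlyBottomUp are real fields of llvm::MachineSchedPolicy;
// the enum and helper are illustrative only.
#include "llvm/CodeGen/MachineScheduler.h"

enum class SchedDirection { TopDown, BottomUp, Bidirectional };

static void applyDirection(llvm::MachineSchedPolicy &Policy,
                           SchedDirection Dir) {
  switch (Dir) {
  case SchedDirection::TopDown:       // issue from the top of the region
    Policy.OnlyTopDown = true;
    Policy.OnlyBottomUp = false;
    break;
  case SchedDirection::BottomUp:      // issue from the bottom of the region
    Policy.OnlyTopDown = false;
    Policy.OnlyBottomUp = true;
    break;
  case SchedDirection::Bidirectional: // consider both boundaries each step
    Policy.OnlyTopDown = false;
    Policy.OnlyBottomUp = false;
    break;
  }
}
```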
diff --git a/llvm/lib/Target/RISCV/RISCVSubtarget.cpp b/llvm/lib/Target/RISCV/RISCVSubtarget.cpp
index e7db1ededf383b..3fb756b0fab170 100644
--- a/llvm/lib/Target/RISCV/RISCVSubtarget.cpp
+++ b/llvm/lib/Target/RISCV/RISCVSubtarget.cpp
@@ -16,6 +16,7 @@
#include "RISCV.h"
#include "RISCVFrameLowering.h"
#include "RISCVTargetMachine.h"
+#include "llvm/CodeGen/MachineScheduler.h"
#include "llvm/CodeGen/MacroFusion.h"
#include "llvm/CodeGen/ScheduleDAGMutation.h"
#include "llvm/MC/TargetRegistry.h"
@@ -199,3 +200,11 @@ unsigned RISCVSubtarget::getMinimumJumpTableEntries() const {
? RISCVMinimumJumpTableEntries
: TuneInfo->MinimumJumpTableEntries;
}
+
+void RISCVSubtarget::overridePostRASchedPolicy(MachineSchedPolicy &Policy,
+ unsigned NumRegionInstrs) const {
+ // Do bidirectional scheduling since it provides more balanced scheduling,
+ // leading to better performance. This will increase compile time.
+ Policy.OnlyTopDown = false;
+ Policy.OnlyBottomUp = false;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVSubtarget.h b/llvm/lib/Target/RISCV/RISCVSubtarget.h
index f59a3737ae76f9..5d1d64f5694243 100644
--- a/llvm/lib/Target/RISCV/RISCVSubtarget.h
+++ b/llvm/lib/Target/RISCV/RISCVSubtarget.h
@@ -124,7 +124,10 @@ class RISCVSubtarget : public RISCVGenSubtargetInfo {
}
bool enableMachineScheduler() const override { return true; }
- bool enablePostRAScheduler() const override { return UsePostRAScheduler; }
+ bool enablePostRAScheduler() const override {
+ // FIXME: Just for tests, will revert this change when landing.
+ return true;
+ }
Align getPrefFunctionAlignment() const {
return Align(TuneInfo->PrefFunctionAlignment);
@@ -327,6 +330,9 @@ class RISCVSubtarget : public RISCVGenSubtargetInfo {
unsigned getTailDupAggressiveThreshold() const {
return TuneInfo->TailDupAggressiveThreshold;
}
+
+ void overridePostRASchedPolicy(MachineSchedPolicy &Policy,
+ unsigned NumRegionInstrs) const override;
};
} // End llvm namespace
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll b/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
index 330f8b16065f13..45eb3478eef739 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
@@ -25,18 +25,18 @@ define i32 @add_i8_signext_i32(i8 %a, i8 %b) {
; RV32IM-LABEL: add_i8_signext_i32:
; RV32IM: # %bb.0: # %entry
; RV32IM-NEXT: slli a0, a0, 24
-; RV32IM-NEXT: srai a0, a0, 24
; RV32IM-NEXT: slli a1, a1, 24
; RV32IM-NEXT: srai a1, a1, 24
+; RV32IM-NEXT: srai a0, a0, 24
; RV32IM-NEXT: add a0, a0, a1
; RV32IM-NEXT: ret
;
; RV64IM-LABEL: add_i8_signext_i32:
; RV64IM: # %bb.0: # %entry
; RV64IM-NEXT: slli a0, a0, 56
-; RV64IM-NEXT: srai a0, a0, 56
; RV64IM-NEXT: slli a1, a1, 56
; RV64IM-NEXT: srai a1, a1, 56
+; RV64IM-NEXT: srai a0, a0, 56
; RV64IM-NEXT: add a0, a0, a1
; RV64IM-NEXT: ret
entry:
@@ -49,15 +49,15 @@ entry:
define i32 @add_i8_zeroext_i32(i8 %a, i8 %b) {
; RV32IM-LABEL: add_i8_zeroext_i32:
; RV32IM: # %bb.0: # %entry
-; RV32IM-NEXT: andi a0, a0, 255
; RV32IM-NEXT: andi a1, a1, 255
+; RV32IM-NEXT: andi a0, a0, 255
; RV32IM-NEXT: add a0, a0, a1
; RV32IM-NEXT: ret
;
; RV64IM-LABEL: add_i8_zeroext_i32:
; RV64IM: # %bb.0: # %entry
-; RV64IM-NEXT: andi a0, a0, 255
; RV64IM-NEXT: andi a1, a1, 255
+; RV64IM-NEXT: andi a0, a0, 255
; RV64IM-NEXT: add a0, a0, a1
; RV64IM-NEXT: ret
entry:
@@ -404,8 +404,8 @@ define i64 @add_i64(i64 %a, i64 %b) {
; RV32IM-LABEL: add_i64:
; RV32IM: # %bb.0: # %entry
; RV32IM-NEXT: add a0, a0, a2
-; RV32IM-NEXT: sltu a2, a0, a2
; RV32IM-NEXT: add a1, a1, a3
+; RV32IM-NEXT: sltu a2, a0, a2
; RV32IM-NEXT: add a1, a1, a2
; RV32IM-NEXT: ret
;
@@ -439,8 +439,8 @@ define i64 @sub_i64(i64 %a, i64 %b) {
; RV32IM-LABEL: sub_i64:
; RV32IM: # %bb.0: # %entry
; RV32IM-NEXT: sub a4, a0, a2
-; RV32IM-NEXT: sltu a0, a0, a2
; RV32IM-NEXT: sub a1, a1, a3
+; RV32IM-NEXT: sltu a0, a0, a2
; RV32IM-NEXT: sub a1, a1, a0
; RV32IM-NEXT: mv a0, a4
; RV32IM-NEXT: ret
@@ -460,8 +460,8 @@ define i64 @subi_i64(i64 %a) {
; RV32IM-NEXT: lui a2, 1048275
; RV32IM-NEXT: addi a2, a2, -1548
; RV32IM-NEXT: add a0, a0, a2
-; RV32IM-NEXT: sltu a2, a0, a2
; RV32IM-NEXT: addi a1, a1, -1
+; RV32IM-NEXT: sltu a2, a0, a2
; RV32IM-NEXT: add a1, a1, a2
; RV32IM-NEXT: ret
;
@@ -480,8 +480,8 @@ define i64 @neg_i64(i64 %a) {
; RV32IM-LABEL: neg_i64:
; RV32IM: # %bb.0: # %entry
; RV32IM-NEXT: neg a2, a0
-; RV32IM-NEXT: snez a0, a0
; RV32IM-NEXT: neg a1, a1
+; RV32IM-NEXT: snez a0, a0
; RV32IM-NEXT: sub a1, a1, a0
; RV32IM-NEXT: mv a0, a2
; RV32IM-NEXT: ret
@@ -500,8 +500,8 @@ entry:
define i64 @and_i64(i64 %a, i64 %b) {
; RV32IM-LABEL: and_i64:
; RV32IM: # %bb.0: # %entry
-; RV32IM-NEXT: and a0, a0, a2
; RV32IM-NEXT: and a1, a1, a3
+; RV32IM-NEXT: and a0, a0, a2
; RV32IM-NEXT: ret
;
; RV64IM-LABEL: and_i64:
@@ -516,8 +516,8 @@ entry:
define i64 @andi_i64(i64 %a) {
; RV32IM-LABEL: andi_i64:
; RV32IM: # %bb.0: # %entry
-; RV32IM-NEXT: andi a0, a0, 1234
; RV32IM-NEXT: li a1, 0
+; RV32IM-NEXT: andi a0, a0, 1234
; RV32IM-NEXT: ret
;
; RV64IM-LABEL: andi_i64:
@@ -532,8 +532,8 @@ entry:
define i64 @or_i64(i64 %a, i64 %b) {
; RV32IM-LABEL: or_i64:
; RV32IM: # %bb.0: # %entry
-; RV32IM-NEXT: or a0, a0, a2
; RV32IM-NEXT: or a1, a1, a3
+; RV32IM-NEXT: or a0, a0, a2
; RV32IM-NEXT: ret
;
; RV64IM-LABEL: or_i64:
@@ -563,8 +563,8 @@ entry:
define i64 @xor_i64(i64 %a, i64 %b) {
; RV32IM-LABEL: xor_i64:
; RV32IM: # %bb.0: # %entry
-; RV32IM-NEXT: xor a0, a0, a2
; RV32IM-NEXT: xor a1, a1, a3
+; RV32IM-NEXT: xor a0, a0, a2
; RV32IM-NEXT: ret
;
; RV64IM-LABEL: xor_i64:
@@ -597,8 +597,8 @@ define i64 @mul_i64(i64 %a, i64 %b) {
; RV32IM-NEXT: mul a4, a0, a2
; RV32IM-NEXT: mul a1, a1, a2
; RV32IM-NEXT: mul a3, a0, a3
-; RV32IM-NEXT: mulhu a0, a0, a2
; RV32IM-NEXT: add a1, a1, a3
+; RV32IM-NEXT: mulhu a0, a0, a2
; RV32IM-NEXT: add a1, a1, a0
; RV32IM-NEXT: mv a0, a4
; RV32IM-NEXT: ret
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll b/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
index f33ba1d7a302ef..acd32cff21cad3 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
@@ -6,18 +6,18 @@ define i2 @bitreverse_i2(i2 %x) {
; RV32-LABEL: bitreverse_i2:
; RV32: # %bb.0:
; RV32-NEXT: slli a1, a0, 1
-; RV32-NEXT: andi a1, a1, 2
; RV32-NEXT: andi a0, a0, 3
; RV32-NEXT: srli a0, a0, 1
+; RV32-NEXT: andi a1, a1, 2
; RV32-NEXT: or a0, a1, a0
; RV32-NEXT: ret
;
; RV64-LABEL: bitreverse_i2:
; RV64: # %bb.0:
; RV64-NEXT: slli a1, a0, 1
-; RV64-NEXT: andi a1, a1, 2
; RV64-NEXT: andi a0, a0, 3
; RV64-NEXT: srli a0, a0, 1
+; RV64-NEXT: andi a1, a1, 2
; RV64-NEXT: or a0, a1, a0
; RV64-NEXT: ret
%rev = call i2 @llvm.bitreverse.i2(i2 %x)
@@ -31,8 +31,8 @@ define i3 @bitreverse_i3(i3 %x) {
; RV32-NEXT: andi a1, a1, 4
; RV32-NEXT: andi a0, a0, 7
; RV32-NEXT: andi a2, a0, 2
-; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: srli a0, a0, 2
+; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: or a0, a1, a0
; RV32-NEXT: ret
;
@@ -42,8 +42,8 @@ define i3 @bitreverse_i3(i3 %x) {
; RV64-NEXT: andi a1, a1, 4
; RV64-NEXT: andi a0, a0, 7
; RV64-NEXT: andi a2, a0, 2
-; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: srli a0, a0, 2
+; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: or a0, a1, a0
; RV64-NEXT: ret
%rev = call i3 @llvm.bitreverse.i3(i3 %x)
@@ -61,8 +61,8 @@ define i4 @bitreverse_i4(i4 %x) {
; RV32-NEXT: andi a0, a0, 15
; RV32-NEXT: srli a2, a0, 1
; RV32-NEXT: andi a2, a2, 2
-; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: srli a0, a0, 3
+; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: or a0, a1, a0
; RV32-NEXT: ret
;
@@ -76,8 +76,8 @@ define i4 @bitreverse_i4(i4 %x) {
; RV64-NEXT: andi a0, a0, 15
; RV64-NEXT: srli a2, a0, 1
; RV64-NEXT: andi a2, a2, 2
-; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: srli a0, a0, 3
+; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: or a0, a1, a0
; RV64-NEXT: ret
%rev = call i4 @llvm.bitreverse.i4(i4 %x)
@@ -103,8 +103,8 @@ define i7 @bitreverse_i7(i7 %x) {
; RV32-NEXT: srli a3, a0, 4
; RV32-NEXT: andi a3, a3, 2
; RV32-NEXT: or a2, a2, a3
-; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: srli a0, a0, 6
+; RV32-NEXT: or a1, a1, a2
; RV32-NEXT: or a0, a1, a0
; RV32-NEXT: ret
;
@@ -126,8 +126,8 @@ define i7 @bitreverse_i7(i7 %x) {
; RV64-NEXT: srli a3, a0, 4
; RV64-NEXT: andi a3, a3, 2
; RV64-NEXT: or a2, a2, a3
-; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: srli a0, a0, 6
+; RV64-NEXT: or a1, a1, a2
; RV64-NEXT: or a0, a1, a0
; RV64-NEXT: ret
%rev = call i7 @llvm.bitreverse.i7(i7 %x)
@@ -163,9 +163,9 @@ define i24 @bitreverse_i24(i24 %x) {
; RV32-NEXT: addi a1, a1, -1366
; RV32-NEXT: and a2, a1, a2
; RV32-NEXT: and a2, a0, a2
-; RV32-NEXT: srli a2, a2, 1
; RV32-NEXT: slli a0, a0, 1
; RV32-NEXT: and a0, a0, a1
+; RV32-NEXT: srli a2, a2, 1
; RV32-NEXT: or a0, a2, a0
; RV32-NEXT: ret
;
@@ -197,9 +197,9 @@ define i24 @bitreverse_i24(i24 %x) {
; RV64-NEXT: addiw a1, a1, -1366
; RV64-NEXT: and a2, a1, a2
; RV64-NEXT: and a2, a0, a2
-; RV64-NEXT: srli a2, a2, 1
; RV64-NEXT: slli a0, a0, 1
; RV64-NEXT: and a0, a0, a1
+; RV64-NEXT: srli a2, a2, 1
; RV64-NEXT: or a0, a2, a0
; RV64-NEXT: ret
%rev = call i24 @llvm.bitreverse.i24(i24 %x)
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
index 70d1b25309c844..9bea20efb3eccd 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv32.ll
@@ -46,10 +46,10 @@ define void @constant_fold_barrier_i128(ptr %p) {
; RV32-NEXT: or a1, a4, a1
; RV32-NEXT: add a5, a5, zero
; RV32-NEXT: add a1, a5, a1
-; RV32-NEXT: sw a2, 0(a0)
-; RV32-NEXT: sw a6, 4(a0)
-; RV32-NEXT: sw a3, 8(a0)
; RV32-NEXT: sw a1, 12(a0)
+; RV32-NEXT: sw a3, 8(a0)
+; RV32-NEXT: sw a6, 4(a0)
+; RV32-NEXT: sw a2, 0(a0)
; RV32-NEXT: ret
entry:
%x = load i128, ptr %p
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
index 51e8b6da39d099..be4ade025b413f 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/constbarrier-rv64.ll
@@ -25,8 +25,8 @@ define i128 @constant_fold_barrier_i128(i128 %x) {
; RV64-NEXT: and a0, a0, a2
; RV64-NEXT: and a1, a1, zero
; RV64-NEXT: add a0, a0, a2
-; RV64-NEXT: sltu a2, a0, a2
; RV64-NEXT: add a1, a1, zero
+; RV64-NEXT: sltu a2, a0, a2
; RV64-NEXT: add a1, a1, a2
; RV64-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll b/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll
index a4f92640697bc7..33ac5cc5a07443 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll
@@ -43,8 +43,8 @@ define i32 @fcvt_wu_d(double %a) nounwind {
define i32 @fcvt_wu_d_multiple_use(double %x, ptr %y) nounwind {
; RV32IFD-LABEL: fcvt_wu_d_multiple_use:
; RV32IFD: # %bb.0:
-; RV32IFD-NEXT: fcvt.wu.d a1, fa0, rtz
; RV32IFD-NEXT: li a0, 1
+; RV32IFD-NEXT: fcvt.wu.d a1, fa0, rtz
; RV32IFD-NEXT: beqz a1, .LBB4_2
; RV32IFD-NEXT: # %bb.1:
; RV32IFD-NEXT: mv a0, a1
@@ -156,8 +156,8 @@ define i64 @fmv_x_d(double %a, double %b) nounwind {
; RV32IFD-NEXT: addi sp, sp, -16
; RV32IFD-NEXT: fadd.d fa5, fa0, fa1
; RV32IFD-NEXT: fsd fa5, 8(sp)
-; RV32IFD-NEXT: lw a0, 8(sp)
; RV32IFD-NEXT: lw a1, 12(sp)
+; RV32IFD-NEXT: lw a0, 8(sp)
; RV32IFD-NEXT: addi sp, sp, 16
; RV32IFD-NEXT: ret
;
@@ -214,8 +214,8 @@ define double @fmv_d_x(i64 %a, i64 %b) nounwind {
; RV32IFD-NEXT: sw a0, 8(sp)
; RV32IFD-NEXT: sw a1, 12(sp)
; RV32IFD-NEXT: fld fa5, 8(sp)
-; RV32IFD-NEXT: sw a2, 8(sp)
; RV32IFD-NEXT: sw a3, 12(sp)
+; RV32IFD-NEXT: sw a2, 8(sp)
; RV32IFD-NEXT: fld fa4, 8(sp)
; RV32IFD-NEXT: fadd.d fa0, fa5, fa4
; RV32IFD-NEXT: addi sp, sp, 16
@@ -223,8 +223,8 @@ define double @fmv_d_x(i64 %a, i64 %b) nounwind {
;
; RV64IFD-LABEL: fmv_d_x:
; RV64IFD: # %bb.0:
-; RV64IFD-NEXT: fmv.d.x fa5, a0
; RV64IFD-NEXT: fmv.d.x fa4, a1
+; RV64IFD-NEXT: fmv.d.x fa5, a0
; RV64IFD-NEXT: fadd.d fa0, fa5, fa4
; RV64IFD-NEXT: ret
%1 = bitcast i64 %a to double
@@ -330,17 +330,17 @@ define signext i16 @fcvt_w_s_i16(double %a) nounwind {
define zeroext i16 @fcvt_wu_s_i16(double %a) nounwind {
; RV32IFD-LABEL: fcvt_wu_s_i16:
; RV32IFD: # %bb.0:
-; RV32IFD-NEXT: fcvt.wu.d a0, fa0, rtz
; RV32IFD-NEXT: lui a1, 16
; RV32IFD-NEXT: addi a1, a1, -1
+; RV32IFD-NEXT: fcvt.wu.d a0, fa0, rtz
; RV32IFD-NEXT: and a0, a0, a1
; RV32IFD-NEXT: ret
;
; RV64IFD-LABEL: fcvt_wu_s_i16:
; RV64IFD: # %bb.0:
-; RV64IFD-NEXT: fcvt.wu.d a0, fa0, rtz
; RV64IFD-NEXT: lui a1, 16
; RV64IFD-NEXT: addiw a1, a1, -1
+; RV64IFD-NEXT: fcvt.wu.d a0, fa0, rtz
; RV64IFD-NEXT: and a0, a0, a1
; RV64IFD-NEXT: ret
%1 = fptoui double %a to i16
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll b/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll
index 7e96d529af36ff..6ccef58d488108 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll
@@ -27,8 +27,8 @@ define i32 @fcvt_wu_s(float %a) nounwind {
define i32 @fcvt_wu_s_multiple_use(float %x, ptr %y) nounwind {
; RV32IF-LABEL: fcvt_wu_s_multiple_use:
; RV32IF: # %bb.0:
-; RV32IF-NEXT: fcvt.wu.s a1, fa0, rtz
; RV32IF-NEXT: li a0, 1
+; RV32IF-NEXT: fcvt.wu.s a1, fa0, rtz
; RV32IF-NEXT: beqz a1, .LBB2_2
; RV32IF-NEXT: # %bb.1:
; RV32IF-NEXT: mv a0, a1
@@ -120,8 +120,8 @@ define float @fcvt_s_wu_load(ptr %p) nounwind {
define float @fmv_w_x(i32 %a, i32 %b) nounwind {
; CHECKIF-LABEL: fmv_w_x:
; CHECKIF: # %bb.0:
-; CHECKIF-NEXT: fmv.w.x fa5, a0
; CHECKIF-NEXT: fmv.w.x fa4, a1
+; CHECKIF-NEXT: fmv.w.x fa5, a0
; CHECKIF-NEXT: fadd.s fa0, fa5, fa4
; CHECKIF-NEXT: ret
; Ensure fmv.w.x is generated even for a soft float calling convention
@@ -302,17 +302,17 @@ define signext i16 @fcvt_w_s_i16(float %a) nounwind {
define zeroext i16 @fcvt_wu_s_i16(float %a) nounwind {
; RV32IF-LABEL: fcvt_wu_s_i16:
; RV32IF: # %bb.0:
-; RV32IF-NEXT: fcvt.wu.s a0, fa0, rtz
; RV32IF-NEXT: lui a1, 16
; RV32IF-NEXT: addi a1, a1, -1
+; RV32IF-NEXT: fcvt.wu.s a0, fa0, rtz
; RV32IF-NEXT: and a0, a0, a1
; RV32IF-NEXT: ret
;
; RV64IF-LABEL: fcvt_wu_s_i16:
; RV64IF: # %bb.0:
-; RV64IF-NEXT: fcvt.wu.s a0, fa0, rtz
; RV64IF-NEXT: lui a1, 16
; RV64IF-NEXT: addiw a1, a1, -1
+; RV64IF-NEXT: fcvt.wu.s a0, fa0, rtz
; RV64IF-NEXT: and a0, a0, a1
; RV64IF-NEXT: ret
%1 = fptoui float %a to i16
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv32.ll b/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv32.ll
index 1757e5550f81ae..250e8edafa836f 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv32.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv32.ll
@@ -9,8 +9,8 @@
define float @fadd(float %x, float %y) {
; RV32I-LABEL: fadd:
; RV32I: # %bb.0:
-; RV32I-NEXT: fmv.w.x fa5, a0
; RV32I-NEXT: fmv.w.x fa4, a1
+; RV32I-NEXT: fmv.w.x fa5, a0
; RV32I-NEXT: fadd.s fa5, fa5, fa4
; RV32I-NEXT: fmv.x.w a0, fa5
; RV32I-NEXT: ret
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv64.ll b/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv64.ll
index 287bbbad6d52d7..717ecac7300b1b 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv64.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/fpr-gpr-copy-rv64.ll
@@ -9,8 +9,8 @@
define double @fadd_f64(double %x, double %y) {
; RV64I-LABEL: fadd_f64:
; RV64I: # %bb.0:
-; RV64I-NEXT: fmv.d.x fa5, a0
; RV64I-NEXT: fmv.d.x fa4, a1
+; RV64I-NEXT: fmv.d.x fa5, a0
; RV64I-NEXT: fadd.d fa5, fa5, fa4
; RV64I-NEXT: fmv.x.d a0, fa5
; RV64I-NEXT: ret
@@ -30,6 +30,13 @@ define float @fadd_f32(float %x, float %y) {
; RV32I-NEXT: fadd.d fa5, fa5, fa4
; RV32I-NEXT: fmv.x.d a0, fa5
; RV32I-NEXT: ret
+; RV64I-LABEL: fadd_f32:
+; RV64I: # %bb.0:
+; RV64I-NEXT: fmv.w.x fa4, a1
+; RV64I-NEXT: fmv.w.x fa5, a0
+; RV64I-NEXT: fadd.s fa5, fa5, fa4
+; RV64I-NEXT: fmv.x.w a0, fa5
+; RV64I-NEXT: ret
%a = fadd float %x, %y
ret float %a
}
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll b/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll
index 05989c310541b8..82540a3976f357 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/iabs.ll
@@ -120,8 +120,8 @@ define i64 @abs64(i64 %x) {
; RV32I-NEXT: sltu a3, a0, a2
; RV32I-NEXT: add a1, a1, a2
; RV32I-NEXT: add a1, a1, a3
-; RV32I-NEXT: xor a0, a0, a2
; RV32I-NEXT: xor a1, a1, a2
+; RV32I-NEXT: xor a0, a0, a2
; RV32I-NEXT: ret
;
; RV32ZBB-LABEL: abs64:
@@ -131,8 +131,8 @@ define i64 @abs64(i64 %x) {
; RV32ZBB-NEXT: sltu a3, a0, a2
; RV32ZBB-NEXT: add a1, a1, a2
; RV32ZBB-NEXT: add a1, a1, a3
-; RV32ZBB-NEXT: xor a0, a0, a2
; RV32ZBB-NEXT: xor a1, a1, a2
+; RV32ZBB-NEXT: xor a0, a0, a2
; RV32ZBB-NEXT: ret
;
; RV64I-LABEL: abs64:
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/jumptable.ll b/llvm/test/CodeGen/RISCV/GlobalISel/jumptable.ll
index 9dda1a241e042b..018c135cc8626c 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/jumptable.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/jumptable.ll
@@ -15,13 +15,13 @@
define void @above_threshold(i32 signext %in, ptr %out) nounwind {
; RV32I-SMALL-LABEL: above_threshold:
; RV32I-SMALL: # %bb.0: # %entry
-; RV32I-SMALL-NEXT: li a2, 5
; RV32I-SMALL-NEXT: addi a0, a0, -1
+; RV32I-SMALL-NEXT: li a2, 5
; RV32I-SMALL-NEXT: bltu a2, a0, .LBB0_9
; RV32I-SMALL-NEXT: # %bb.1: # %entry
; RV32I-SMALL-NEXT: lui a2, %hi(.LJTI0_0)
-; RV32I-SMALL-NEXT: addi a2, a2, %lo(.LJTI0_0)
; RV32I-SMALL-NEXT: slli a0, a0, 2
+; RV32I-SMALL-NEXT: addi a2, a2, %lo(.LJTI0_0)
; RV32I-SMALL-NEXT: add a0, a2, a0
; RV32I-SMALL-NEXT: lw a0, 0(a0)
; RV32I-SMALL-NEXT: jr a0
@@ -49,14 +49,14 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
;
; RV32I-MEDIUM-LABEL: above_threshold:
; RV32I-MEDIUM: # %bb.0: # %entry
-; RV32I-MEDIUM-NEXT: li a2, 5
; RV32I-MEDIUM-NEXT: addi a0, a0, -1
+; RV32I-MEDIUM-NEXT: li a2, 5
; RV32I-MEDIUM-NEXT: bltu a2, a0, .LBB0_9
; RV32I-MEDIUM-NEXT: # %bb.1: # %entry
; RV32I-MEDIUM-NEXT: .Lpcrel_hi0:
; RV32I-MEDIUM-NEXT: auipc a2, %pcrel_hi(.LJTI0_0)
-; RV32I-MEDIUM-NEXT: addi a2, a2, %pcrel_lo(.Lpcrel_hi0)
; RV32I-MEDIUM-NEXT: slli a0, a0, 2
+; RV32I-MEDIUM-NEXT: addi a2, a2, %pcrel_lo(.Lpcrel_hi0)
; RV32I-MEDIUM-NEXT: add a0, a2, a0
; RV32I-MEDIUM-NEXT: lw a0, 0(a0)
; RV32I-MEDIUM-NEXT: jr a0
@@ -84,14 +84,14 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
;
; RV32I-PIC-LABEL: above_threshold:
; RV32I-PIC: # %bb.0: # %entry
-; RV32I-PIC-NEXT: li a2, 5
; RV32I-PIC-NEXT: addi a0, a0, -1
+; RV32I-PIC-NEXT: li a2, 5
; RV32I-PIC-NEXT: bltu a2, a0, .LBB0_9
; RV32I-PIC-NEXT: # %bb.1: # %entry
; RV32I-PIC-NEXT: .Lpcrel_hi0:
; RV32I-PIC-NEXT: auipc a2, %pcrel_hi(.LJTI0_0)
-; RV32I-PIC-NEXT: addi a2, a2, %pcrel_lo(.Lpcrel_hi0)
; RV32I-PIC-NEXT: slli a0, a0, 2
+; RV32I-PIC-NEXT: addi a2, a2, %pcrel_lo(.Lpcrel_hi0)
; RV32I-PIC-NEXT: add a0, a2, a0
; RV32I-PIC-NEXT: lw a0, 0(a0)
; RV32I-PIC-NEXT: add a0, a0, a2
@@ -120,13 +120,13 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
;
; RV64I-SMALL-LABEL: above_threshold:
; RV64...
[truncated]
preames left a comment:
At least for me, the diff here is too large to meaningfully review. I think we need an alternate form of justification that this is profitable. Can you present either some performance numbers or at minimum some kind of statistic that demonstrates value?
For context, I'm not particularly skeptical of the patch - it seems to make sense - the diffs are just way too big to be meaningfully skimmed.
How are we evaluating this change? On spills? On impact to dynamic IC? On impact to runtime on real hardware?
IIUC, PostRA scheduling basically won't impact spills and dynamic instruction count. I will show some cycles/IPC numbers from GEM5.
I got some results on different platforms.
The results differ across platforms, so it is really hard to determine a default value. I will make it a target feature and leave the default value to be […]
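One plausible shape for such a tuning hook, sketched under assumptions: the struct, field, and accessor names below are hypothetical illustrations; the patch's real plumbing goes through the TableGen-generated tune info (compare the `TuneInfo->` lookups in the diff above). Top-down is used as the fallback purely for illustration.

```cpp
// Hypothetical sketch: extend the per-CPU tune info with a direction field.
// All names here are illustrative, not the patch's actual identifiers.
enum class SchedDirection { TopDown, BottomUp, Bidirectional };

struct RISCVTuneInfoSketch {
  // ... existing fields such as PrefFunctionAlignment ...
  SchedDirection PostRASchedDirection = SchedDirection::TopDown;
};

// Mirrors accessors like getMinimumJumpTableEntries() in RISCVSubtarget.
SchedDirection getPostRASchedDirection(const RISCVTuneInfoSketch *TuneInfo) {
  // Fall back to top-down when no CPU-specific tune info is available.
  return TuneInfo ? TuneInfo->PostRASchedDirection : SchedDirection::TopDown;
}
```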
I have no objection to making it a tuning feature.
force-pushed from f618de6 to 4237baf
force-pushed from 4237baf to ae40c24
Let me make sure I understand. BottomUp and bidirectional RA scheduling were added last year by @michaelmaitland in 9106b58. I don't think we ended up enabling it inside SiFive. Have any other targets adopted it yet?
Not yet. And I just found some issues when enabling postra bidirectional scheduling: #116592 and #116584.
force-pushed from ae40c24 to c3841ef
Let's go with top-down by default since that's what other targets use.
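Combining the two sketches above, the tune-driven override would then reduce to a simple dispatch with top-down as the default. Again, everything here except `MachineSchedPolicy` and its two fields is hypothetical naming, not the patch's final code.

```cpp
// Hypothetical final shape: route the per-CPU tune value into the policy.
// applyDirection and getPostRASchedDirection are the sketches shown earlier.
void RISCVSubtarget::overridePostRASchedPolicy(MachineSchedPolicy &Policy,
                                               unsigned NumRegionInstrs) const {
  // Returns TopDown unless a CPU's tune info overrides it.
  applyDirection(Policy, getPostRASchedDirection());
}
```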
The results differ across platforms, so it is really hard to determine a common default value. Tune info for the post-RA scheduling direction is added, and CPUs can set their own preferred direction.
force-pushed from c3841ef to 4989eb0
Done.
michaelmaitland left a comment:
LGTM
topperc left a comment:
LGTM