24 changes: 20 additions & 4 deletions llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -4796,8 +4796,22 @@ unsigned RISCV::getDestLog2EEW(const MCInstrDesc &Desc, unsigned Log2SEW) {
return Scaled;
}

/// Given two VL operands, do we know that LHS <= RHS?
bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS) {
static std::optional<int64_t> getEffectiveImm(const MachineOperand &MO,
const MachineRegisterInfo *MRI) {
assert(MO.isImm() || MO.getReg().isVirtual());
if (MO.isImm())
return MO.getImm();
const MachineInstr *Def = MRI->getVRegDef(MO.getReg());
int64_t Imm;
if (isLoadImm(Def, Imm))
return Imm;
return std::nullopt;
}

/// Given two VL operands, do we know that LHS <= RHS? Must be used in SSA form.
bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS,
const MachineRegisterInfo *MRI) {
assert(MRI->isSSA());
if (LHS.isReg() && RHS.isReg() && LHS.getReg().isVirtual() &&
LHS.getReg() == RHS.getReg())
return true;
@@ -4807,9 +4821,11 @@ bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS) {
return true;
if (LHS.isImm() && LHS.getImm() == RISCV::VLMaxSentinel)
return false;
if (!LHS.isImm() || !RHS.isImm())
std::optional<int64_t> LHSImm = getEffectiveImm(LHS, MRI),
Collaborator:
Are you sure the new logic works properly with VLMaxSentinel? In particular, what if you get an ADDI which happens to encode -1?

I think this is correct, just flagging it for extra consideration.

lukel97 (Contributor, Author), Sep 3, 2025:
I originally checked for VLMaxSentinel in ADDI, but undid this in 8072205. The places where we check for VLMaxSentinel don't seem to look through ADDIs; they only consider immediate operands. E.g. in RISCVInsertVSETVLI:

    if (VLOp.isImm()) {
      int64_t Imm = VLOp.getImm();
      // Convert the VLMax sentintel to X0 register.
      if (Imm == RISCV::VLMaxSentinel) { ...

Member:
> In particular, what if you get an ADDI which happens to encode -1?

I guess the question here is whether we ever generate ADDI with -1. Because if we do, then I think even places that do not check for VLMaxSentinel might behave incorrectly. For instance, VLOpt aggregates user VLs and picks the largest one; without checking whether the effective immediate is VLMaxSentinel, an ADDI of -1 will never be picked when it should be.

Collaborator:
> In particular, what if you get an ADDI which happens to encode -1?

I think we're okay here, and it would in fact be wrong to special case ADDI xN, x0, -1. The current immediate field is essentially a union of two states: an actual immediate, which is in the range 0-31, or VLMaxSentinel, which represents the symbolic VLMAX value. We just happen to use the value -1 (which doesn't correspond to a valid immediate) for this purpose.

Now, if we did special case ADDI -1 as VLMaxSentinel, we'd probably never notice. -1 interpreted as an unsigned XLEN value is going to be way larger than any possible VLMAX, and given the rules of vsetvli it would result in VL=VLMAX anyway. So the difference is probably not visible.

(In case it's not clear, I'm agreeing with Luke, just explaining my reasoning on how I got there.)

RHSImm = getEffectiveImm(RHS, MRI);
if (!LHSImm || !RHSImm)
return false;
return LHS.getImm() <= RHS.getImm();
return LHSImm <= RHSImm;
}

namespace {
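The review thread above turns on how the new register lookup interacts with the VLMaxSentinel encoding. The following standalone sketch is a toy model of the extended check, not LLVM code: the types and the map standing in for MachineRegisterInfo are invented for illustration, and the elided early return for a VLMaxSentinel on the RHS is assumed to behave as in the existing code.

    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <optional>

    constexpr int64_t VLMaxSentinel = -1;

    struct VLOperand {
      bool IsImm;
      int64_t Imm = 0;   // valid when IsImm
      unsigned Reg = 0;  // valid when !IsImm
    };

    // Stand-in for MRI->getVRegDef + isLoadImm: a register resolves to a constant
    // only when its single SSA definition materializes one (e.g. li / ADDI from x0).
    using DefMap = std::map<unsigned, int64_t>;

    std::optional<int64_t> getEffectiveImm(const VLOperand &MO, const DefMap &Defs) {
      if (MO.IsImm)
        return MO.Imm;
      auto It = Defs.find(MO.Reg);
      if (It != Defs.end())
        return It->second;
      return std::nullopt;
    }

    // Do we know LHS <= RHS? Conservatively answers false when unsure.
    bool isVLKnownLE(const VLOperand &LHS, const VLOperand &RHS, const DefMap &Defs) {
      if (!LHS.IsImm && !RHS.IsImm && LHS.Reg == RHS.Reg)
        return true;                                  // same register
      if (RHS.IsImm && RHS.Imm == VLMaxSentinel)
        return true;                                  // anything <= VLMAX
      if (LHS.IsImm && LHS.Imm == VLMaxSentinel)
        return false;                                 // VLMAX <= x not provable
      std::optional<int64_t> L = getEffectiveImm(LHS, Defs);
      std::optional<int64_t> R = getEffectiveImm(RHS, Defs);
      if (!L || !R)
        return false;
      return *L <= *R;
    }

    int main() {
      // %1 = li 42 and %2 = li 128: constants materialized into registers, which
      // the previous immediate-only comparison could not relate.
      DefMap Defs = {{1, 42}, {2, 128}};
      VLOperand R1{false, 0, 1}, R2{false, 0, 2}, Imm16{true, 16};
      std::cout << isVLKnownLE(R1, R2, Defs) << '\n';    // 1: 42 <= 128
      std::cout << isVLKnownLE(R2, R1, Defs) << '\n';    // 0: 128 > 42
      std::cout << isVLKnownLE(Imm16, R1, Defs) << '\n'; // 1: 16 <= 42
    }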
3 changes: 2 additions & 1 deletion llvm/lib/Target/RISCV/RISCVInstrInfo.h
@@ -365,7 +365,8 @@ unsigned getDestLog2EEW(const MCInstrDesc &Desc, unsigned Log2SEW);
static constexpr int64_t VLMaxSentinel = -1LL;

/// Given two VL operands, do we know that LHS <= RHS?
bool isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS);
bool isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS,
Collaborator:
This change adds the assumption that this routine is only called in SSA form. Is that true? If so, update the comment.

const MachineRegisterInfo *MRI);

// Mask assignments for floating-point
static constexpr unsigned FPMASK_Negative_Infinity = 0x001;
10 changes: 5 additions & 5 deletions llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
@@ -1379,7 +1379,7 @@ RISCVVLOptimizer::getMinimumVLForUser(const MachineOperand &UserOp) const {
assert(UserOp.getOperandNo() == UserMI.getNumExplicitDefs() &&
RISCVII::isFirstDefTiedToFirstUse(UserMI.getDesc()));
auto DemandedVL = DemandedVLs.lookup(&UserMI);
if (!DemandedVL || !RISCV::isVLKnownLE(*DemandedVL, VLOp)) {
if (!DemandedVL || !RISCV::isVLKnownLE(*DemandedVL, VLOp, MRI)) {
LLVM_DEBUG(dbgs() << " Abort because user is passthru in "
"instruction with demanded tail\n");
return std::nullopt;
@@ -1397,7 +1397,7 @@ RISCVVLOptimizer::getMinimumVLForUser(const MachineOperand &UserOp) const {
// requires.
if (auto DemandedVL = DemandedVLs.lookup(&UserMI)) {
assert(isCandidate(UserMI));
if (RISCV::isVLKnownLE(*DemandedVL, VLOp))
if (RISCV::isVLKnownLE(*DemandedVL, VLOp, MRI))
return DemandedVL;
}

@@ -1505,10 +1505,10 @@ RISCVVLOptimizer::checkUsers(const MachineInstr &MI) const {

// Use the largest VL among all the users. If we cannot determine this
// statically, then we cannot optimize the VL.
if (!CommonVL || RISCV::isVLKnownLE(*CommonVL, *VLOp)) {
if (!CommonVL || RISCV::isVLKnownLE(*CommonVL, *VLOp, MRI)) {
CommonVL = *VLOp;
LLVM_DEBUG(dbgs() << " User VL is: " << VLOp << "\n");
} else if (!RISCV::isVLKnownLE(*VLOp, *CommonVL)) {
} else if (!RISCV::isVLKnownLE(*VLOp, *CommonVL, MRI)) {
LLVM_DEBUG(dbgs() << " Abort because cannot determine a common VL\n");
return std::nullopt;
}
@@ -1570,7 +1570,7 @@ bool RISCVVLOptimizer::tryReduceVL(MachineInstr &MI) const {
CommonVL = VLMI->getOperand(RISCVII::getVLOpNum(VLMI->getDesc()));
}

if (!RISCV::isVLKnownLE(*CommonVL, VLOp)) {
if (!RISCV::isVLKnownLE(*CommonVL, VLOp, MRI)) {
LLVM_DEBUG(dbgs() << " Abort due to CommonVL not <= VLOp.\n");
return false;
}
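One of the comments above notes that VLOpt aggregates the users' VLs and picks the largest. As a rough illustration only, here is a heavily simplified standalone sketch of that aggregation over already-resolved constant VLs; treating an unresolved VL as an immediate abort is a simplification of the pass's actual isVLKnownLE-based logic, and the names are invented.

    #include <cstdint>
    #include <iostream>
    #include <optional>
    #include <vector>

    // An effective VL: a known constant, or nullopt when it could not be resolved
    // (e.g. a register whose definition is not a load-immediate).
    using EffectiveVL = std::optional<int64_t>;

    // Walk the users' demanded VLs and keep the largest; bail out when a VL is
    // unknown, mirroring the "cannot determine a common VL" abort above.
    std::optional<int64_t> pickCommonVL(const std::vector<EffectiveVL> &UserVLs) {
      std::optional<int64_t> CommonVL;
      for (const EffectiveVL &VL : UserVLs) {
        if (!VL)
          return std::nullopt;
        if (!CommonVL || *CommonVL <= *VL)
          CommonVL = *VL;
      }
      return CommonVL;
    }

    int main() {
      // Users demanding VL 4 and VL 8 (now resolvable even when materialized into
      // registers by li/ADDI): the common VL is 8.
      std::cout << pickCommonVL({int64_t(4), int64_t(8)}).value_or(-1) << '\n';  // 8
      std::cout << pickCommonVL({int64_t(4), std::nullopt}).has_value() << '\n'; // 0
    }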
16 changes: 8 additions & 8 deletions llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
@@ -177,7 +177,7 @@ bool RISCVVectorPeephole::tryToReduceVL(MachineInstr &MI) const {

MachineOperand &SrcVL =
Src->getOperand(RISCVII::getVLOpNum(Src->getDesc()));
if (VL.isIdenticalTo(SrcVL) || !RISCV::isVLKnownLE(VL, SrcVL))
if (VL.isIdenticalTo(SrcVL) || !RISCV::isVLKnownLE(VL, SrcVL, MRI))
continue;

if (!ensureDominates(VL, *Src))
@@ -440,7 +440,7 @@ bool RISCVVectorPeephole::convertSameMaskVMergeToVMv(MachineInstr &MI) {
const MachineOperand &MIVL = MI.getOperand(RISCVII::getVLOpNum(MI.getDesc()));
const MachineOperand &TrueVL =
True->getOperand(RISCVII::getVLOpNum(True->getDesc()));
if (!RISCV::isVLKnownLE(MIVL, TrueVL))
if (!RISCV::isVLKnownLE(MIVL, TrueVL, MRI))
return false;

// True's passthru needs to be equivalent to False
@@ -611,7 +611,7 @@ bool RISCVVectorPeephole::foldUndefPassthruVMV_V_V(MachineInstr &MI) {
MachineOperand &SrcPolicy =
Src->getOperand(RISCVII::getVecPolicyOpNum(Src->getDesc()));

if (RISCV::isVLKnownLE(MIVL, SrcVL))
if (RISCV::isVLKnownLE(MIVL, SrcVL, MRI))
SrcPolicy.setImm(SrcPolicy.getImm() | RISCVVType::TAIL_AGNOSTIC);
}

@@ -663,7 +663,7 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) {
// so we don't need to handle a smaller source VL here. However, the
// user's VL may be larger
MachineOperand &SrcVL = Src->getOperand(RISCVII::getVLOpNum(Src->getDesc()));
if (!RISCV::isVLKnownLE(SrcVL, MI.getOperand(3)))
if (!RISCV::isVLKnownLE(SrcVL, MI.getOperand(3), MRI))
return false;

// If the new passthru doesn't dominate Src, try to move Src so it does.
@@ -684,7 +684,7 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) {
// If MI was tail agnostic and the VL didn't increase, preserve it.
int64_t Policy = RISCVVType::TAIL_UNDISTURBED_MASK_UNDISTURBED;
if ((MI.getOperand(5).getImm() & RISCVVType::TAIL_AGNOSTIC) &&
RISCV::isVLKnownLE(MI.getOperand(3), SrcVL))
RISCV::isVLKnownLE(MI.getOperand(3), SrcVL, MRI))
Policy |= RISCVVType::TAIL_AGNOSTIC;
Src->getOperand(RISCVII::getVecPolicyOpNum(Src->getDesc())).setImm(Policy);
}
@@ -775,9 +775,9 @@ bool RISCVVectorPeephole::foldVMergeToMask(MachineInstr &MI) const {
True.getOperand(RISCVII::getVLOpNum(True.getDesc()));

MachineOperand MinVL = MachineOperand::CreateImm(0);
if (RISCV::isVLKnownLE(TrueVL, VMergeVL))
if (RISCV::isVLKnownLE(TrueVL, VMergeVL, MRI))
MinVL = TrueVL;
else if (RISCV::isVLKnownLE(VMergeVL, TrueVL))
else if (RISCV::isVLKnownLE(VMergeVL, TrueVL, MRI))
MinVL = VMergeVL;
else
return false;
@@ -797,7 +797,7 @@ bool RISCVVectorPeephole::foldVMergeToMask(MachineInstr &MI) const {
// to the tail. In that case we always need to use tail undisturbed to
// preserve them.
uint64_t Policy = RISCVVType::TAIL_UNDISTURBED_MASK_UNDISTURBED;
if (!PassthruReg && RISCV::isVLKnownLE(VMergeVL, MinVL))
if (!PassthruReg && RISCV::isVLKnownLE(VMergeVL, MinVL, MRI))
Policy |= RISCVVType::TAIL_AGNOSTIC;

assert(RISCVII::hasVecPolicyOp(True.getDesc().TSFlags) &&
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll
@@ -413,9 +413,10 @@ define <256 x i8> @vadd_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
define <256 x i8> @vadd_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
; CHECK-LABEL: vadd_vi_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a0)
; CHECK-NEXT: li a0, 128
; CHECK-NEXT: vsetvli zero, a0, e8, m8, ta, ma
; CHECK-NEXT: vadd.vi v8, v8, -1, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll
@@ -321,9 +321,10 @@ define <256 x i8> @vmax_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %e
define <256 x i8> @vmax_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
; CHECK-LABEL: vmax_vx_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a2, 128
; CHECK-NEXT: vsetvli zero, a2, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a1)
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vmax.vx v8, v8, a0, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll
@@ -320,9 +320,10 @@ define <256 x i8> @vmaxu_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %
define <256 x i8> @vmaxu_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
; CHECK-LABEL: vmaxu_vx_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a2, 128
; CHECK-NEXT: vsetvli zero, a2, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a1)
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vmaxu.vx v8, v8, a0, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll
@@ -321,9 +321,10 @@ define <256 x i8> @vmin_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %e
define <256 x i8> @vmin_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
; CHECK-LABEL: vmin_vx_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a2, 128
; CHECK-NEXT: vsetvli zero, a2, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a1)
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vmin.vx v8, v8, a0, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll
@@ -320,9 +320,10 @@ define <256 x i8> @vminu_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %
define <256 x i8> @vminu_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
; CHECK-LABEL: vminu_vx_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a2, 128
; CHECK-NEXT: vsetvli zero, a2, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a1)
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vminu.vx v8, v8, a0, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll
@@ -422,9 +422,10 @@ define <256 x i8> @vsadd_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
define <256 x i8> @vsadd_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
; CHECK-LABEL: vsadd_vi_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a0)
; CHECK-NEXT: li a0, 128
; CHECK-NEXT: vsetvli zero, a0, e8, m8, ta, ma
; CHECK-NEXT: vsadd.vi v8, v8, -1, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
5 changes: 3 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll
@@ -418,9 +418,10 @@ define <256 x i8> @vsaddu_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
define <256 x i8> @vsaddu_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
; CHECK-LABEL: vsaddu_vi_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a0)
; CHECK-NEXT: li a0, 128
; CHECK-NEXT: vsetvli zero, a0, e8, m8, ta, ma
; CHECK-NEXT: vsaddu.vi v8, v8, -1, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
31 changes: 7 additions & 24 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll
@@ -203,38 +203,21 @@ define <256 x i8> @select_v256i8(<256 x i1> %a, <256 x i8> %b, <256 x i8> %c, i3
define <256 x i8> @select_evl_v256i8(<256 x i1> %a, <256 x i8> %b, <256 x i8> %c) {
; CHECK-LABEL: select_evl_v256i8:
; CHECK: # %bb.0:
; CHECK-NEXT: addi sp, sp, -16
; CHECK-NEXT: .cfi_def_cfa_offset 16
; CHECK-NEXT: csrr a2, vlenb
; CHECK-NEXT: slli a2, a2, 3
; CHECK-NEXT: sub sp, sp, a2
; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
; CHECK-NEXT: addi a2, sp, 16
; CHECK-NEXT: vs8r.v v16, (a2) # vscale x 64-byte Folded Spill
; CHECK-NEXT: vsetivli zero, 1, e8, m1, ta, ma
; CHECK-NEXT: vmv1r.v v7, v8
; CHECK-NEXT: vmv1r.v v6, v0
; CHECK-NEXT: li a2, 128
; CHECK-NEXT: addi a3, a1, 128
; CHECK-NEXT: vsetvli zero, a2, e8, m8, ta, ma
; CHECK-NEXT: vle8.v v24, (a0)
; CHECK-NEXT: addi a0, a1, 128
; CHECK-NEXT: vle8.v v8, (a0)
; CHECK-NEXT: vle8.v v16, (a1)
; CHECK-NEXT: vmv1r.v v6, v0
; CHECK-NEXT: vle8.v v24, (a3)
; CHECK-NEXT: vle8.v v8, (a1)
; CHECK-NEXT: vmv1r.v v0, v7
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vmerge.vvm v24, v8, v24, v0
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, mu
; CHECK-NEXT: vle8.v v24, (a0), v0.t
; CHECK-NEXT: vmv1r.v v0, v6
; CHECK-NEXT: addi a0, sp, 16
; CHECK-NEXT: vl8r.v v8, (a0) # vscale x 64-byte Folded Reload
; CHECK-NEXT: vsetvli zero, a2, e8, m8, ta, ma
; CHECK-NEXT: vmerge.vvm v8, v16, v8, v0
; CHECK-NEXT: vmerge.vvm v8, v8, v16, v0
; CHECK-NEXT: vmv8r.v v16, v24
; CHECK-NEXT: csrr a0, vlenb
; CHECK-NEXT: slli a0, a0, 3
; CHECK-NEXT: add sp, sp, a0
; CHECK-NEXT: .cfi_def_cfa sp, 16
; CHECK-NEXT: addi sp, sp, 16
; CHECK-NEXT: .cfi_def_cfa_offset 0
; CHECK-NEXT: ret
%v = call <256 x i8> @llvm.vp.select.v256i8(<256 x i1> %a, <256 x i8> %b, <256 x i8> %c, i32 129)
ret <256 x i8> %v
11 changes: 6 additions & 5 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll
@@ -436,14 +436,15 @@ define <256 x i8> @vssub_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
define <256 x i8> @vssub_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
; CHECK-LABEL: vssub_vi_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a0)
; CHECK-NEXT: li a0, -1
; CHECK-NEXT: vssub.vx v8, v8, a0, v0.t
; CHECK-NEXT: li a0, 128
; CHECK-NEXT: li a1, -1
; CHECK-NEXT: vsetvli zero, a0, e8, m8, ta, ma
; CHECK-NEXT: vssub.vx v8, v8, a1, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vssub.vx v16, v16, a0, v0.t
; CHECK-NEXT: vssub.vx v16, v16, a1, v0.t
; CHECK-NEXT: ret
%v = call <256 x i8> @llvm.vp.ssub.sat.v258i8(<256 x i8> %va, <256 x i8> splat (i8 -1), <256 x i1> %m, i32 129)
ret <256 x i8> %v
11 changes: 6 additions & 5 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll
@@ -431,14 +431,15 @@ define <256 x i8> @vssubu_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
define <256 x i8> @vssubu_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
; CHECK-LABEL: vssubu_vi_v258i8_evl129:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, ma
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vlm.v v24, (a0)
; CHECK-NEXT: li a0, -1
; CHECK-NEXT: vssubu.vx v8, v8, a0, v0.t
; CHECK-NEXT: li a0, 128
; CHECK-NEXT: li a1, -1
; CHECK-NEXT: vsetvli zero, a0, e8, m8, ta, ma
; CHECK-NEXT: vssubu.vx v8, v8, a1, v0.t
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: vsetivli zero, 1, e8, m8, ta, ma
; CHECK-NEXT: vssubu.vx v16, v16, a0, v0.t
; CHECK-NEXT: vssubu.vx v16, v16, a1, v0.t
; CHECK-NEXT: ret
%v = call <256 x i8> @llvm.vp.usub.sat.v258i8(<256 x i8> %va, <256 x i8> splat (i8 -1), <256 x i1> %m, i32 129)
ret <256 x i8> %v
12 changes: 12 additions & 0 deletions llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
@@ -82,6 +82,18 @@ define <vscale x 4 x i32> @diff_avl_vlmax(<vscale x 4 x i32> %passthru, <vscale
ret <vscale x 4 x i32> %w
}

define <vscale x 4 x i32> @diff_avl_non_uimm5(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
; CHECK-LABEL: diff_avl_non_uimm5:
; CHECK: # %bb.0:
; CHECK-NEXT: li a0, 42
; CHECK-NEXT: vsetvli zero, a0, e32, m2, tu, ma
; CHECK-NEXT: vadd.vv v8, v10, v12
; CHECK-NEXT: ret
%v = call <vscale x 4 x i32> @llvm.riscv.vadd.nxv4i32.nxv4i32(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b, iXLen 42)
%w = call <vscale x 4 x i32> @llvm.riscv.vmv.v.v.nxv4i32(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %v, iXLen 123)
ret <vscale x 4 x i32> %w
}

define <vscale x 4 x i32> @vadd_mask_ma(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %mask, iXLen %vl) {
; CHECK-LABEL: vadd_mask_ma:
; CHECK: # %bb.0: