Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 15 additions & 8 deletions llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1449,11 +1449,14 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
return Register();
};

if (isFromLoadImm(MRI, LHS, C0) && MRI.hasOneUse(LHS.getReg())) {
if (isFromLoadImm(MRI, LHS, C0) && LHS.getReg().isVirtual() &&
MRI.hasOneUse(LHS.getReg())) {
assert(isInt<12>(C0) && "Unexpected immediate");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was having the same question with Philip as where did we prevent sign overflow from happening, so perhaps we can leave a comment here?

// Might be case 1.
// Signed integer overflow is UB. (UINT64_MAX is bigger so we don't need
// to worry about unsigned overflow here)
if (C0 < INT64_MAX)
// Don't change 0 to -1 since we can use x0.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems off? We'd convert it to +1 wouldn't we?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Don't change 0 to -1 since we can use x0.
// Don't change 0 to 1 since we can use x0.

// For unsigned cases changing -1U to 0 would be incorrect.
if (C0 &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style wise, folding the C0 is non-zero check into the prior if-clause might make the code easier to read in both cases.

((CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT) || C0 != -1)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(A || B) || C is just A || B || C isn't it?

if (Register RegZ = searchConst(C0 + 1)) {
reverseBranchCondition(Cond);
Cond[1] = MachineOperand::CreateReg(RHS.getReg(), /*isDef=*/false);
Expand All @@ -1464,11 +1467,14 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
modifyBranch();
return true;
}
} else if (isFromLoadImm(MRI, RHS, C0) && MRI.hasOneUse(RHS.getReg())) {
}
} else if (isFromLoadImm(MRI, RHS, C0) && RHS.getReg().isVirtual() &&
MRI.hasOneUse(RHS.getReg())) {
assert(isInt<12>(C0) && "Unexpected immediate");
// Might be case 2.
// For unsigned cases, we don't want C1 to wrap back to UINT64_MAX
// when C0 is zero.
if ((CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT) || C0)
// For signed cases we don't want to change 0 since we can use x0.
// For unsigned cases changing 0 to -1U would be incorrect.
if (C0) {
if (Register RegZ = searchConst(C0 - 1)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, staring at this, I'm questioning my sanity. Isn't this just completely unsound?

Given X <=u 22, this produces 21 >u X doesn't it?

Isn't that just complete wrong for X = 22?

Copy link
Collaborator Author

@topperc topperc Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can only be x < 22 or x >= 22 here. LT/GE are the only valid condition codes. Which will become 21 >= x or 21 < x.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is that precondition established?

Copy link
Collaborator Author

@topperc topperc Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RISC-V ISA. There are 6 branch instructions. EQ, NE, BLT, BGE, BLTU, BGEU.

EQ/NE are excluded earlier. This code handles the remaining 4.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, revised example.

BLT X, INT_MIN ==> BGE INT_MAX, X

The former is statically false. The later is statically true.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protected by the assert(isInt<12>(C0)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! That was the bit I was missing. We can't have INT_MIN here until we handle LUI/ADDI sequences.

reverseBranchCondition(Cond);
Cond[1] = MachineOperand::CreateReg(RegZ, /*isDef=*/false);
Expand All @@ -1479,6 +1485,7 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
modifyBranch();
return true;
}
}
}

return false;
Expand Down
308 changes: 308 additions & 0 deletions llvm/test/CodeGen/RISCV/branch-opt.mir
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,74 @@
ret void
}

define void @ule_negone(ptr %a, i32 signext %b, ptr %c, ptr %d) {
store i32 0, ptr %a
%p = icmp ule i32 %b, -1
br i1 %p, label %block1, label %block2

block1: ; preds = %0
store i32 %b, ptr %c
br label %end_block

block2: ; preds = %0
store i32 87, ptr %d
br label %end_block

end_block: ; preds = %block2, %block1
ret void
}

define void @ult_zero(ptr %a, i32 signext %b, ptr %c, ptr %d) {
store i32 -1, ptr %a
%p = icmp ult i32 %b, 0
br i1 %p, label %block1, label %block2

block1: ; preds = %0
store i32 %b, ptr %c
br label %end_block

block2: ; preds = %0
store i32 87, ptr %d
br label %end_block

end_block: ; preds = %block2, %block1
ret void
}

define void @sle_zero(ptr %a, i32 signext %b, ptr %c, ptr %d) {
store i32 1, ptr %a
%p = icmp sle i32 %b, 0
br i1 %p, label %block1, label %block2

block1: ; preds = %0
store i32 %b, ptr %c
br label %end_block

block2: ; preds = %0
store i32 87, ptr %d
br label %end_block

end_block: ; preds = %block2, %block1
ret void
}

define void @slt_zero(ptr %a, i32 signext %b, ptr %c, ptr %d) {
store i32 -1, ptr %a
%p = icmp slt i32 %b, 0
br i1 %p, label %block1, label %block2

block1: ; preds = %0
store i32 %b, ptr %c
br label %end_block

block2: ; preds = %0
store i32 87, ptr %d
br label %end_block

end_block: ; preds = %block2, %block1
ret void
}

declare void @bar(...)

...
Expand Down Expand Up @@ -66,3 +134,243 @@ body: |
PseudoRET

...
---
name: ule_negone
tracksRegLiveness: true
body: |
; CHECK-LABEL: name: ule_negone
; CHECK: bb.0:
; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
; CHECK-NEXT: liveins: $x10, $x11, $x12, $x13
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr = COPY $x13
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x12
; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr = COPY $x11
; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr = COPY $x10
; CHECK-NEXT: [[ADDI:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, -1
; CHECK-NEXT: BLTU killed [[ADDI1]], [[COPY2]], %bb.2
; CHECK-NEXT: PseudoBR %bb.1
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
; CHECK-NEXT: SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
; CHECK-NEXT: PseudoBR %bb.3
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.2:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
; CHECK-NEXT: SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.3:
; CHECK-NEXT: PseudoRET
bb.0:
successors: %bb.1, %bb.2
liveins: $x10, $x11, $x12, $x13

%3:gpr = COPY $x13
%2:gpr = COPY $x12
%1:gpr = COPY $x11
%0:gpr = COPY $x10
%5:gpr = ADDI $x0, 0
SW killed %5, %0, 0 :: (store (s32))
%6:gpr = ADDI $x0, -1
BLTU killed %6, %1, %bb.2
PseudoBR %bb.1

bb.1:
%4:gpr = COPY %1
SW %4, %2, 0 :: (store (s32))
PseudoBR %bb.3

bb.2:
%7:gpr = ADDI $x0, 87
SW killed %7, %3, 0 :: (store (s32))

bb.3:
PseudoRET
...
---
name: ult_zero
tracksRegLiveness: true
body: |
; CHECK-LABEL: name: ult_zero
; CHECK: bb.0:
; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
; CHECK-NEXT: liveins: $x10, $x11, $x12, $x13
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr = COPY $x13
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x12
; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr = COPY $x11
; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr = COPY $x10
; CHECK-NEXT: [[ADDI:%[0-9]+]]:gpr = ADDI $x0, -1
; CHECK-NEXT: SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: BLTU [[COPY2]], killed [[ADDI1]], %bb.2
; CHECK-NEXT: PseudoBR %bb.1
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
; CHECK-NEXT: SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
; CHECK-NEXT: PseudoBR %bb.3
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.2:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
; CHECK-NEXT: SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.3:
; CHECK-NEXT: PseudoRET
bb.0:
successors: %bb.1, %bb.2
liveins: $x10, $x11, $x12, $x13

%3:gpr = COPY $x13
%2:gpr = COPY $x12
%1:gpr = COPY $x11
%0:gpr = COPY $x10
%5:gpr = ADDI $x0, -1
SW killed %5, %0, 0 :: (store (s32))
%6:gpr = ADDI $x0, 0
BLTU %1, killed %6, %bb.2
PseudoBR %bb.1

bb.1:
%4:gpr = COPY %1
SW %4, %2, 0 :: (store (s32))
PseudoBR %bb.3

bb.2:
%7:gpr = ADDI $x0, 87
SW killed %7, %3, 0 :: (store (s32))

bb.3:
PseudoRET
...
---
name: sle_zero
tracksRegLiveness: true
body: |
; CHECK-LABEL: name: sle_zero
; CHECK: bb.0:
; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
; CHECK-NEXT: liveins: $x10, $x11, $x12, $x13
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr = COPY $x13
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x12
; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr = COPY $x11
; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr = COPY $x10
; CHECK-NEXT: [[ADDI:%[0-9]+]]:gpr = ADDI $x0, 1
; CHECK-NEXT: SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: BLT killed [[ADDI1]], [[COPY2]], %bb.2
; CHECK-NEXT: PseudoBR %bb.1
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
; CHECK-NEXT: SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
; CHECK-NEXT: PseudoBR %bb.3
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.2:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
; CHECK-NEXT: SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.3:
; CHECK-NEXT: PseudoRET
bb.0:
successors: %bb.1, %bb.2
liveins: $x10, $x11, $x12, $x13

%3:gpr = COPY $x13
%2:gpr = COPY $x12
%1:gpr = COPY $x11
%0:gpr = COPY $x10
%5:gpr = ADDI $x0, 1
SW killed %5, %0, 0 :: (store (s32))
%6:gpr = ADDI $x0, 0
BLT killed %6, %1, %bb.2
PseudoBR %bb.1

bb.1:
%4:gpr = COPY %1
SW %4, %2, 0 :: (store (s32))
PseudoBR %bb.3

bb.2:
%7:gpr = ADDI $x0, 87
SW killed %7, %3, 0 :: (store (s32))

bb.3:
PseudoRET
...
---
name: slt_zero
tracksRegLiveness: true
body: |
; CHECK-LABEL: name: slt_zero
; CHECK: bb.0:
; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
; CHECK-NEXT: liveins: $x10, $x11, $x12, $x13
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr = COPY $x13
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x12
; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr = COPY $x11
; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr = COPY $x10
; CHECK-NEXT: [[ADDI:%[0-9]+]]:gpr = ADDI $x0, -1
; CHECK-NEXT: SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
; CHECK-NEXT: [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
; CHECK-NEXT: BLT [[COPY2]], killed [[ADDI1]], %bb.2
; CHECK-NEXT: PseudoBR %bb.1
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
; CHECK-NEXT: SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
; CHECK-NEXT: PseudoBR %bb.3
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.2:
; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
; CHECK-NEXT: SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.3:
; CHECK-NEXT: PseudoRET
bb.0:
successors: %bb.1, %bb.2
liveins: $x10, $x11, $x12, $x13

%3:gpr = COPY $x13
%2:gpr = COPY $x12
%1:gpr = COPY $x11
%0:gpr = COPY $x10
%5:gpr = ADDI $x0, -1
SW killed %5, %0, 0 :: (store (s32))
%6:gpr = ADDI $x0, 0
BLT %1, killed %6, %bb.2
PseudoBR %bb.1

bb.1:
%4:gpr = COPY %1
SW %4, %2, 0 :: (store (s32))
PseudoBR %bb.3

bb.2:
%7:gpr = ADDI $x0, 87
SW killed %7, %3, 0 :: (store (s32))

bb.3:
PseudoRET
...
Loading