This is a performance issue.
For the following code:
int compute1(int);
int compute2(int);
int test();

int branch(int check1, int check2, int check3, int check4, int in) {
  int doadd = 1;
  if (check1) {
    in = compute1(in);
    doadd = 1; // We do the add.
  } else if (check2) {
    in = compute2(in);
    doadd = test(); // Unknown if we should do the add.
  } else if (check3) {
    in *= 7;
    doadd = 0; // Do not do the add.
  }
  if (doadd)
    in += 5;
  // Do a multiply by 7 here to confuse the compiler.
  if (check4) {
    in *= 7;
  }
  return in;
}
We can compile for RISC-V:
clang -O3 -S branch.c -target riscv64-arc-linux-gnu
In the final assembly we end up with basic blocks that look like this:
< ... more code up here ... >
64 .LBB0_8: # %if.end9
65 slli a0, a4, 3
66 subw a0, a0, a4
67 bnez a2, .LBB0_11
68 # %bb.9: # %if.end9
69 bnez a2, .LBB0_2
70 j .LBB0_12
71 .LBB0_10:
72 mv a0, s1
73 j .LBB0_2
74 .LBB0_11: # %if.end9
75 mv a4, a0
76 bnez a2, .LBB0_2
77 .LBB0_12:
78 addiw a0, a4, 5
79 j .LBB0_2
In the above, we generate an unconditional jump (line 70) to .LBB0_12, and from there we jump right back out to .LBB0_2 after a single instruction (the addiw on line 78). I feel like .LBB0_12 should be duplicated into line 70, i.e. the jump on line 70 would be replaced by a copy of the addiw followed by a jump straight to .LBB0_2. This would save a jump on that path at the cost of only a slight code size increase.
There should be a way to clean up situations where we have:
    jump A
    <...>
A:
    <very few instructions, maybe 1 - 3>
    jump B
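To make the pattern above concrete, here is a rough sketch in C of the cleanup I have in mind. This is only a toy model, not LLVM code: the Block struct, the DUP_THRESHOLD limit, and the duplicate_small_target function are all hypothetical names made up for illustration, and instructions are just strings.

```c
#include <stddef.h>

#define MAX_INSTRS 8     /* capacity of the toy block (assumption) */
#define DUP_THRESHOLD 3  /* "very few instructions, maybe 1 - 3" (assumption) */

/* Hypothetical toy basic block: some straight-line instructions
   followed by at most one unconditional jump. */
typedef struct {
    const char *label;
    const char *instrs[MAX_INSTRS];
    int num_instrs;
    int jump_target; /* index of the block the final jump goes to, -1 if none */
} Block;

/* If blocks[from] ends in "jump A", and A is a small block that itself
   ends in "jump B", copy A's instructions into blocks[from] and retarget
   the jump to B. Returns 1 if the block was changed, 0 otherwise. */
static int duplicate_small_target(Block *blocks, int from) {
    Block *src = &blocks[from];
    if (src->jump_target < 0 || src->jump_target == from)
        return 0; /* no unconditional jump, or a self loop */

    const Block *dst = &blocks[src->jump_target];
    if (dst->jump_target < 0)
        return 0; /* A does not end in an unconditional jump */
    if (dst->num_instrs > DUP_THRESHOLD)
        return 0; /* A is too big to be worth duplicating */
    if (src->num_instrs + dst->num_instrs > MAX_INSTRS)
        return 0; /* would overflow the toy block */

    /* Duplicate A's body at the end of the jumping block. */
    for (int i = 0; i < dst->num_instrs; i++)
        src->instrs[src->num_instrs++] = dst->instrs[i];

    /* Jump directly to B, skipping A. */
    src->jump_target = dst->jump_target;
    return 1;
}
```

Applied to the assembly above, the block ending in "j .LBB0_12" would pick up a copy of "addiw a0, a4, 5" and jump directly to .LBB0_2. The original .LBB0_12 stays in place for any other predecessors, which is where the slight code size increase comes from.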
I'm happy to provide a patch for this, but I don't know how to approach the issue because machine block placement runs very late in the pipeline. Should this be implemented as an improvement to machine block placement?