-
Notifications
You must be signed in to change notification settings - Fork 15.2k
Description
| Bugzilla Link | 30950 |
| Version | trunk |
| OS | Linux |
| Reporter | LLVM Bugzilla Contributor |
| CC | @vns-mn,@hfinkel,@jmolloy,@rotateright |
Extended Description
r282453 | rsmith | 2016-09-26 16:49:47 -0700 (Mon, 26 Sep 2016) | 4 lines
P0145R3 (C++17 evaluation order tweaks): consistently emit the LHS of array
subscripting before the RHS, regardless of which is the base and which is the
index.
--------------------- 1.c -----------------------
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
long foo(long *maxarray, long k, long size) {
return MAX(maxarray[k], maxarray[k+size+1]);
}
For the testcase 1.c above, with and without r282453 llvm generates different code.
code generated by r282452:
.cfi_startproc
BB#0: # %entry
movq (%rdi,%rsi,8), %rcx
addq %rsi, %rdx
movq 8(%rdi,%rdx,8), %rax
cmpq %rax, %rcx
cmovgeq %rcx, %rax =====> select from the vals of two array accesses.
retq
.Lfunc_end0:
.size foo, .Lfunc_end0-foo
.cfi_endproc
code generated by r282453:
.cfi_startproc
BB#0: # %entry
movq (%rdi,%rsi,8), %rax
leaq (%rsi,%rdx), %rcx
leaq 1(%rsi,%rdx), %rdx
cmpq 8(%rdi,%rcx,8), %rax
cmovgq %rsi, %rdx ====> select from the indexes of two array accesses.
movq (%rdi,%rdx,8), %rax
retq
.Lfunc_end0:
.size foo, .Lfunc_end0-foo
.cfi_endproc
However, the cause of the change above is CFG simplify sinking. The algorithm to decide InstructionsToSink in CFG simplify sinking depends on the instruction sequence, i.e. during the reverse traversal of BB, once an unsinkable instruction is seen, the traversal will be stopped and remaining sinkable instruction will not be added into InstructionsToSink. That is why when layout is changed by r282453, CFG simplify sinking may generate different code.
How the different sinking causes the final code to be different:
Without r282453, CFG simplify sinking generates a select which will select from the addresses of two array accesses: %arrayidx and %arrayidx2.
define i64 @foo(i64* %maxarray, i64 %k, i64 %size) local_unnamed_addr #0 {
entry:
%arrayidx = getelementptr inbounds i64, i64* %maxarray, i64 %k
%0 = load i64, i64* %arrayidx, align 8, !tbaa !1
%add = add nsw i64 %k, %size
%add1 = add nsw i64 %add, 1
%arrayidx2 = getelementptr inbounds i64, i64* %maxarray, i64 %add1
%1 = load i64, i64* %arrayidx2, align 8, !tbaa !1
%cmp = icmp sgt i64 %0, %1
%arrayidx.arrayidx2 = select i1 %cmp, i64* %arrayidx, i64* %arrayidx2
%2 = load i64, i64* %arrayidx.arrayidx2, align 8, !tbaa !1
ret i64 %2
}
For the IR above, Instcombine can change load(select %arrayidx, %arrayidx2) to select(%0, %1) and we can remove the last load.
With r282453, CFG simplify sinking generates a select which will select from the indexes of two array accesses: %k, %add1.
define i64 @foo(i64* %maxarray, i64 %k, i64 %size) local_unnamed_addr #0 {
entry:
%arrayidx = getelementptr inbounds i64, i64* %maxarray, i64 %k
%0 = load i64, i64* %arrayidx, align 8, !tbaa !1
%add = add nsw i64 %k, %size
%add1 = add nsw i64 %add, 1
%arrayidx2 = getelementptr inbounds i64, i64* %maxarray, i64 %add1
%1 = load i64, i64* %arrayidx2, align 8, !tbaa !1
%cmp = icmp sgt i64 %0, %1
%k.add1 = select i1 %cmp, i64 %k, i64 %add1
%arrayidx6 = getelementptr inbounds i64, i64* %maxarray, i64 %k.add1
%2 = load i64, i64* %arrayidx6, align 8, !tbaa !1
ret i64 %2
}
For the IR above, Instcombine can not do the load(select) => select(load) transformation because the select may be deeply embedded inside of an array expr like load(...select...), so we cannot eliminate the last load.
In summary, two things to be addressed here:
- It is better for cfg simplify sinking to generate consistent code for unrelated changes.
- Either limit cfg simplify sinking to stop sinking at load address, or extend instcombine to handle the case load(...select(index1, index2)...) and transform it to select(load1, load2).