Skip to content
3 changes: 2 additions & 1 deletion llvm/lib/CodeGen/CodeGenPrepare.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5790,7 +5790,8 @@ static BasicBlock::iterator findInsertPos(Value *Addr, Instruction *MemoryInst,
// instruction after it.
if (SunkAddr) {
if (Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr))
return std::next(AddrInst->getIterator());
return AddrInst->isTerminator() ? MemoryInst->getIterator()
: std::next(AddrInst->getIterator());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found Invoke instruction at this position is already wrong. The problem is at

      if (!ResultIndex) {
        SunkAddr = ResultPtr;
      } else {
  1. We should make sure ResultPtr is in current BB before directly using it as SunkAddr.
  2. If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.
  3. Invoke instruction is not a good candidate for SunkAddr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make sure ResultPtr is in current BB before directly using it as SunkAddr.

We use global pointers safely because we can't get iterator in a BB for them. Do we want to handle them similarly to instructions in other blocks?

If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.

Clone -- do you mean create GEP with 0?

Invoke instruction is not a good candidate for SunkAddr.

Not obvious to me, why is it different from instructions in other blocks or from globals?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @weiguozhi . From looking at the test case, why are we trying to sink the GEP upwards toward the entry block? It seems like something has gone wrong earlier.

The proposed fix is like adding a null check when the correct fix was to avoid passing null at a higher level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed fix is like adding a null check when the correct fix was to avoid passing null at a higher level.

Yeah, I agree but the same we can say about checking on Instruction to bypass arguments and globals.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of optimizeMemoryInst is to find address computation in PrevBB, and the address is used in a memory access instruction in a SuccBB, it then duplicate the address computation in SuccBB, SelectionDAG can fold the address computation into memory access.

We should make sure ResultPtr is in current BB before directly using it as SunkAddr.

We use global pointers safely because we can't get iterator in a BB for them. Do we want to handle them similarly to instructions in other blocks?

Global pointers is absolutely OK in a memory access instruction, but it doesn't fit into this optimization.

If ResultPtr is not in current BB we can clone it as the SunkAddr in current BB.

Clone -- do you mean create GEP with 0?

Duplicate the address computation instruction. (thus SunkAddr)

Invoke instruction is not a good candidate for SunkAddr.

Not obvious to me, why is it different from instructions in other blocks or from globals?

We can't fold Invoke instruction into a memory access instruction.

}

// Find the first user of Addr in current BB.
Expand Down
35 changes: 35 additions & 0 deletions llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
Original file line number Diff line number Diff line change
Expand Up @@ -42,3 +42,38 @@ bb3:
bb7:
ret void
}

; Test a scenario when the address is the last instruction in the basic block.

define void @adress_terminator() personality ptr null {
; CHECK-LABEL: define void @adress_terminator() personality ptr null {
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: [[PTR:%.*]] = invoke ptr null(i64 0)
; CHECK-NEXT: to label %[[BODY_1:.*]] unwind label %[[EHCLEANUP:.*]]
; CHECK: [[EHCLEANUP]]:
; CHECK-NEXT: [[TMP0:%.*]] = cleanuppad within none []
; CHECK-NEXT: cleanupret from [[TMP0]] unwind to caller
; CHECK: [[BODY_1]]:
; CHECK-NEXT: [[TMP1:%.*]] = bitcast ptr [[PTR]] to ptr
; CHECK-NEXT: store <4 x i32> zeroinitializer, ptr [[TMP1]], align 4
; CHECK-NEXT: [[V0:%.*]] = load <4 x i32>, ptr [[PTR]], align 4
; CHECK-NEXT: store <4 x i32> zeroinitializer, ptr [[PTR]], align 4
; CHECK-NEXT: ret void
;
entry:
%ptr = invoke ptr null(i64 0) to label %body.1 unwind label %ehcleanup

body.2:
%v0 = load <4 x i32>, ptr %2, align 4
store <4 x i32> zeroinitializer, ptr %2, align 4
ret void

ehcleanup:
%1 = cleanuppad within none []
cleanupret from %1 unwind to caller

body.1:
%2 = getelementptr { i32 }, ptr %ptr, i64 0, i32 0
store <4 x i32> zeroinitializer, ptr %2, align 4
br label %body.2
}
Loading