Closed
7 changes: 6 additions & 1 deletion llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -2518,7 +2518,12 @@ bool SelectionDAG::expandMultipleResultFPLibCall(
 SDValue StoreValue = ST->getValue();
 unsigned ResNo = StoreValue.getResNo();
 Type *StoreType = StoreValue.getValueType().getTypeForEVT(Ctx);
-if (CallRetResNo == ResNo || !ST->isSimple() ||
+// If the pointer value does not come from the IR, it could come from ABI
+// lowering and may alias with the arguments of the library call if they are
+// passed via the stack.
Collaborator

@efriedma-quic efriedma-quic Nov 7, 2024

If I'm understanding correctly, the issue here is that the store destination could be part of the argument list of another call? The correct way to handle that case should involve callseq_start/callseq_end. Assuming the libcall isn't nested inside another callseq_start/end, there should be a callseq_end and a callseq_start between the libcall and the store. And those nodes should prevent the transform because they "alias" the store.

Blanket ignoring all non-IR values maybe works for this particular testcase, but I don't think it solves aliasing in general.

Member Author

@MacDue MacDue Nov 7, 2024

My understanding (possibly wrong) is you have two calls:

sincos(x, ptr, ptr)
g(double, double)

All the arguments are passed on the stack, and the call to sincos is rewritten so that it is passed the stack locations of g's arguments directly. The issue is that the stack locations of g's arguments alias the stack locations of sincos's own arguments, so when sincos writes to one of the pointers it clobbers its own arguments.

The current code already checks that the two pointers for sin and cos don't alias each other, but it does not expect that the pointers could alias the call's own arguments.
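
The clobbering described above can be sketched with a toy model. This is not LLVM code: the outgoing-argument area is modelled as an array of slots addressed by index, and `toySinCos`, `foldedIntoArgSlots`, and `viaTemporaries` are hypothetical names. The sketch also assumes the callee reloads its input from its stack slot after writing the first result, which a callee is allowed to do because it owns its argument area.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Toy model (not LLVM code): the outgoing-argument area of the stack is an
// array of double-sized slots, addressed by index instead of by pointer.
using ArgArea = std::array<double, 4>;

// Hypothetical callee: it owns its argument area, so it may reload x from
// its slot after it has already written through the sin output "pointer".
inline void toySinCos(ArgArea &area, std::size_t xSlot, std::size_t sinSlot,
                      std::size_t cosSlot) {
  assert(xSlot < area.size() && sinSlot < area.size() &&
         cosSlot < area.size());
  area[sinSlot] = std::sin(area[xSlot]); // overwrites x if sinSlot == xSlot
  area[cosSlot] = std::cos(area[xSlot]); // reloads x, possibly clobbered
}

// Folded form: the outputs point directly at g's argument slots (0 and 1),
// which overlap the slot holding sincos's own input.
inline ArgArea foldedIntoArgSlots(double x) {
  ArgArea area{};
  area[0] = x;
  toySinCos(area, /*xSlot=*/0, /*sinSlot=*/0, /*cosSlot=*/1);
  return area;
}

// Unfolded form: the outputs go to separate temporaries (slots 2 and 3) and
// are copied into g's argument slots only after the call has returned.
inline ArgArea viaTemporaries(double x) {
  ArgArea area{};
  area[0] = x;
  toySinCos(area, /*xSlot=*/0, /*sinSlot=*/2, /*cosSlot=*/3);
  area[0] = area[2];
  area[1] = area[3];
  return area;
}
```

In this model `foldedIntoArgSlots` leaves cos(sin(x)) in g's second argument slot, while `viaTemporaries` leaves cos(x) there.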

Member Author

@MacDue MacDue Nov 7, 2024

> The correct way to handle that case should involve callseq_start/callseq_end. Assuming the libcall isn't nested inside another callseq_start/end, there should be a callseq_end and a callseq_start between the libcall and the store.

It looks like this is the case. Would a potential fix be to check that a CALLSEQ_START does not occur in the chain of the store, or have I misunderstood?

Contributor

You shouldn't have to special-case CALLSEQ_START. This restriction should be implicitly present in the chain.

Member Author

Could you clarify that? Before this expansion you have an frexp or fsincos node whose users are stores within the call sequence of another function (g in the previous example). There are no aliasing issues at this point. The issue occurs when the node is replaced with a library call. It takes its input chains from the previous stores, so when expanded you have a call sequence for the node nested inside a call sequence for g 😬. This is where the issues come from, as both calls use the same area for stack arguments.
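
To make the nesting concrete, here is a minimal sketch that flags a nested call sequence in a linearized order of operations. It is a toy: the `Op` enum and `hasNestedCallSequence` are hypothetical names that only loosely mirror `ISD::CALLSEQ_START` / `ISD::CALLSEQ_END`, and a real SelectionDAG is a graph rather than a flat list.

```cpp
#include <cassert>
#include <vector>

// Toy opcodes (hypothetical; not the real ISD opcodes).
enum class Op { CallSeqStart, CallSeqEnd, Store, Call };

// Returns true if a call sequence is opened while another one is still open.
inline bool hasNestedCallSequence(const std::vector<Op> &order) {
  int depth = 0;
  for (Op op : order) {
    if (op == Op::CallSeqStart) {
      if (++depth > 1)
        return true; // a second sequence started inside the first
    } else if (op == Op::CallSeqEnd) {
      assert(depth > 0 && "CallSeqEnd without a matching CallSeqStart");
      --depth;
    }
  }
  return false;
}
```

With this model, a libcall sequence emitted between g's argument stores and the call to g is reported as nested, whereas two back-to-back sequences are not.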

Member Author

Having thought about the issue with folding stores into sincos while expanding it to a library call, the possible solutions I came up with are:

  1. Check the stores are not within a CALLSEQ_START-CALLSEQ_END pair
    • As far as I can tell from grepping the codebase, nobody has done this before (and it is possibly a costly graph search)
  2. Disallow stores to pointers that don't exist in the IR
    • Much simpler, and I think it could solve the issue, but it assumes a store to an IR pointer won't occur within a CALLSEQ_START-CALLSEQ_END pair
    • Maybe this is a valid assumption (the stores I've seen have always been to special stack slots for the call), but it could be broken.
  3. Be very conservative and restrict this to trivial cases (e.g. the case where the store input chains are the entry node)

The issue this is trying to avoid is that, if the stores to be folded into sincos are within a CALLSEQ_START-CALLSEQ_END pair, the expansion will result in nested call sequences, which is where the problems come from.
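
Option 1 could look roughly like the following chain walk. This is only a sketch on a toy graph: `Node`, `Kind`, and `isInsideCallSequence` are hypothetical and are not the `SDNode` API, and it assumes that stores inside a call sequence are chained (directly or transitively) after that sequence's start node. The visited set hints at why the search may be costly on large DAGs.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy chain node (hypothetical; not llvm::SDNode).
enum class Kind { Entry, CallSeqStart, CallSeqEnd, Store, Other };

struct Node {
  Kind kind;
  std::vector<const Node *> chains; // chain predecessors
};

// Walks the chain predecessors of `store`. Returns true if some path reaches
// a CallSeqStart before it reaches a CallSeqEnd or the entry node.
inline bool isInsideCallSequence(const Node &store) {
  std::vector<const Node *> worklist(store.chains.begin(), store.chains.end());
  std::unordered_set<const Node *> visited;
  while (!worklist.empty()) {
    const Node *n = worklist.back();
    worklist.pop_back();
    assert(n && "chain operands must be non-null");
    if (!visited.insert(n).second)
      continue; // already explored via another path
    if (n->kind == Kind::CallSeqStart)
      return true; // an open call sequence precedes the store
    if (n->kind == Kind::CallSeqEnd || n->kind == Kind::Entry)
      continue; // this path is closed off; do not look further up
    worklist.insert(worklist.end(), n->chains.begin(), n->chains.end());
  }
  return false;
}
```

A store chained only to the entry node, or to a completed sequence's end node, is reported as outside any call sequence.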

+const Value *PointerValue =
+dyn_cast_or_null<const Value *>(ST->getPointerInfo().V);
+if (!PointerValue || CallRetResNo == ResNo || !ST->isSimple() ||
 ST->getAddressSpace() != 0 ||
 ST->getAlign() <
 getDataLayout().getABITypeAlign(StoreType->getScalarType()) ||
192 changes: 112 additions & 80 deletions llvm/test/CodeGen/RISCV/llvm.frexp.ll
@@ -543,42 +543,50 @@ define i32 @test_frexp_f32_i32_only_use_exp(float %a) nounwind {
 define { <4 x float>, <4 x i32> } @test_frexp_v4f32_v4i32(<4 x float> %a) nounwind {
 ; RV32IFD-LABEL: test_frexp_v4f32_v4i32:
 ; RV32IFD: # %bb.0:
-; RV32IFD-NEXT: addi sp, sp, -48
-; RV32IFD-NEXT: sw ra, 44(sp) # 4-byte Folded Spill
-; RV32IFD-NEXT: sw s0, 40(sp) # 4-byte Folded Spill
-; RV32IFD-NEXT: fsd fs0, 32(sp) # 8-byte Folded Spill
-; RV32IFD-NEXT: fsd fs1, 24(sp) # 8-byte Folded Spill
-; RV32IFD-NEXT: fsd fs2, 16(sp) # 8-byte Folded Spill
-; RV32IFD-NEXT: fsd fs3, 8(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: addi sp, sp, -64
+; RV32IFD-NEXT: sw ra, 60(sp) # 4-byte Folded Spill
+; RV32IFD-NEXT: sw s0, 56(sp) # 4-byte Folded Spill
+; RV32IFD-NEXT: fsd fs0, 48(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: fsd fs1, 40(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: fsd fs2, 32(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: fsd fs3, 24(sp) # 8-byte Folded Spill
 ; RV32IFD-NEXT: fmv.s fs0, fa3
 ; RV32IFD-NEXT: fmv.s fs1, fa2
 ; RV32IFD-NEXT: fmv.s fs2, fa1
 ; RV32IFD-NEXT: mv s0, a0
-; RV32IFD-NEXT: addi a0, a0, 16
+; RV32IFD-NEXT: addi a0, sp, 8
 ; RV32IFD-NEXT: call frexpf
 ; RV32IFD-NEXT: fmv.s fs3, fa0
-; RV32IFD-NEXT: addi a0, s0, 20
+; RV32IFD-NEXT: addi a0, sp, 12
 ; RV32IFD-NEXT: fmv.s fa0, fs2
 ; RV32IFD-NEXT: call frexpf
 ; RV32IFD-NEXT: fmv.s fs2, fa0
-; RV32IFD-NEXT: addi a0, s0, 24
+; RV32IFD-NEXT: addi a0, sp, 16
 ; RV32IFD-NEXT: fmv.s fa0, fs1
 ; RV32IFD-NEXT: call frexpf
 ; RV32IFD-NEXT: fmv.s fs1, fa0
-; RV32IFD-NEXT: addi a0, s0, 28
+; RV32IFD-NEXT: addi a0, sp, 20
 ; RV32IFD-NEXT: fmv.s fa0, fs0
 ; RV32IFD-NEXT: call frexpf
+; RV32IFD-NEXT: lw a0, 8(sp)
+; RV32IFD-NEXT: lw a1, 12(sp)
+; RV32IFD-NEXT: lw a2, 16(sp)
+; RV32IFD-NEXT: lw a3, 20(sp)
+; RV32IFD-NEXT: sw a0, 16(s0)
+; RV32IFD-NEXT: sw a1, 20(s0)
+; RV32IFD-NEXT: sw a2, 24(s0)
+; RV32IFD-NEXT: sw a3, 28(s0)
 ; RV32IFD-NEXT: fsw fs3, 0(s0)
 ; RV32IFD-NEXT: fsw fs2, 4(s0)
 ; RV32IFD-NEXT: fsw fs1, 8(s0)
 ; RV32IFD-NEXT: fsw fa0, 12(s0)
-; RV32IFD-NEXT: lw ra, 44(sp) # 4-byte Folded Reload
-; RV32IFD-NEXT: lw s0, 40(sp) # 4-byte Folded Reload
-; RV32IFD-NEXT: fld fs0, 32(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: fld fs1, 24(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: fld fs2, 16(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: fld fs3, 8(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: addi sp, sp, 48
+; RV32IFD-NEXT: lw ra, 60(sp) # 4-byte Folded Reload
+; RV32IFD-NEXT: lw s0, 56(sp) # 4-byte Folded Reload
+; RV32IFD-NEXT: fld fs0, 48(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: fld fs1, 40(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: fld fs2, 32(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: fld fs3, 24(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: addi sp, sp, 64
 ; RV32IFD-NEXT: ret
 ;
 ; RV64IFD-LABEL: test_frexp_v4f32_v4i32:
@@ -631,44 +639,52 @@ define { <4 x float>, <4 x i32> } @test_frexp_v4f32_v4i32(<4 x float> %a) nounwi
 ;
 ; RV32IZFINXZDINX-LABEL: test_frexp_v4f32_v4i32:
 ; RV32IZFINXZDINX: # %bb.0:
-; RV32IZFINXZDINX-NEXT: addi sp, sp, -32
-; RV32IZFINXZDINX-NEXT: sw ra, 28(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s0, 24(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s1, 20(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s2, 16(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s3, 12(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s4, 8(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: addi sp, sp, -48
+; RV32IZFINXZDINX-NEXT: sw ra, 44(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s0, 40(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s1, 36(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s2, 32(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s3, 28(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s4, 24(sp) # 4-byte Folded Spill
 ; RV32IZFINXZDINX-NEXT: mv s0, a4
 ; RV32IZFINXZDINX-NEXT: mv s1, a3
 ; RV32IZFINXZDINX-NEXT: mv s2, a2
 ; RV32IZFINXZDINX-NEXT: mv a2, a1
 ; RV32IZFINXZDINX-NEXT: mv s3, a0
-; RV32IZFINXZDINX-NEXT: addi a1, a0, 16
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 8
 ; RV32IZFINXZDINX-NEXT: mv a0, a2
 ; RV32IZFINXZDINX-NEXT: call frexpf
 ; RV32IZFINXZDINX-NEXT: mv s4, a0
-; RV32IZFINXZDINX-NEXT: addi a1, s3, 20
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 12
 ; RV32IZFINXZDINX-NEXT: mv a0, s2
 ; RV32IZFINXZDINX-NEXT: call frexpf
 ; RV32IZFINXZDINX-NEXT: mv s2, a0
-; RV32IZFINXZDINX-NEXT: addi a1, s3, 24
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 16
 ; RV32IZFINXZDINX-NEXT: mv a0, s1
 ; RV32IZFINXZDINX-NEXT: call frexpf
 ; RV32IZFINXZDINX-NEXT: mv s1, a0
-; RV32IZFINXZDINX-NEXT: addi a1, s3, 28
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 20
 ; RV32IZFINXZDINX-NEXT: mv a0, s0
 ; RV32IZFINXZDINX-NEXT: call frexpf
+; RV32IZFINXZDINX-NEXT: lw a1, 8(sp)
+; RV32IZFINXZDINX-NEXT: lw a2, 12(sp)
+; RV32IZFINXZDINX-NEXT: lw a3, 16(sp)
+; RV32IZFINXZDINX-NEXT: lw a4, 20(sp)
+; RV32IZFINXZDINX-NEXT: sw a1, 16(s3)
+; RV32IZFINXZDINX-NEXT: sw a2, 20(s3)
+; RV32IZFINXZDINX-NEXT: sw a3, 24(s3)
+; RV32IZFINXZDINX-NEXT: sw a4, 28(s3)
 ; RV32IZFINXZDINX-NEXT: sw s4, 0(s3)
 ; RV32IZFINXZDINX-NEXT: sw s2, 4(s3)
 ; RV32IZFINXZDINX-NEXT: sw s1, 8(s3)
 ; RV32IZFINXZDINX-NEXT: sw a0, 12(s3)
-; RV32IZFINXZDINX-NEXT: lw ra, 28(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s0, 24(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s1, 20(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s2, 16(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s3, 12(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s4, 8(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: addi sp, sp, 32
+; RV32IZFINXZDINX-NEXT: lw ra, 44(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s0, 40(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s1, 36(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s2, 32(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s3, 28(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s4, 24(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: addi sp, sp, 48
 ; RV32IZFINXZDINX-NEXT: ret
 ;
 ; RV64IZFINXZDINX-LABEL: test_frexp_v4f32_v4i32:
@@ -1080,34 +1096,41 @@ define <4 x float> @test_frexp_v4f32_v4i32_only_use_fract(<4 x float> %a) nounwi
 define <4 x i32> @test_frexp_v4f32_v4i32_only_use_exp(<4 x float> %a) nounwind {
 ; RV32IFD-LABEL: test_frexp_v4f32_v4i32_only_use_exp:
 ; RV32IFD: # %bb.0:
-; RV32IFD-NEXT: addi sp, sp, -32
-; RV32IFD-NEXT: sw ra, 28(sp) # 4-byte Folded Spill
-; RV32IFD-NEXT: sw s0, 24(sp) # 4-byte Folded Spill
-; RV32IFD-NEXT: fsd fs0, 16(sp) # 8-byte Folded Spill
-; RV32IFD-NEXT: fsd fs1, 8(sp) # 8-byte Folded Spill
-; RV32IFD-NEXT: fsd fs2, 0(sp) # 8-byte Folded Spill
-; RV32IFD-NEXT: fmv.s fs0, fa2
-; RV32IFD-NEXT: fmv.s fs1, fa1
-; RV32IFD-NEXT: fmv.s fs2, fa0
+; RV32IFD-NEXT: addi sp, sp, -48
+; RV32IFD-NEXT: sw ra, 44(sp) # 4-byte Folded Spill
+; RV32IFD-NEXT: sw s0, 40(sp) # 4-byte Folded Spill
+; RV32IFD-NEXT: fsd fs0, 32(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: fsd fs1, 24(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: fsd fs2, 16(sp) # 8-byte Folded Spill
+; RV32IFD-NEXT: fmv.s fs0, fa3
+; RV32IFD-NEXT: fmv.s fs1, fa2
+; RV32IFD-NEXT: fmv.s fs2, fa1
 ; RV32IFD-NEXT: mv s0, a0
-; RV32IFD-NEXT: addi a0, a0, 12
-; RV32IFD-NEXT: fmv.s fa0, fa3
+; RV32IFD-NEXT: mv a0, sp
 ; RV32IFD-NEXT: call frexpf
-; RV32IFD-NEXT: addi a0, s0, 8
-; RV32IFD-NEXT: fmv.s fa0, fs0
+; RV32IFD-NEXT: addi a0, sp, 4
+; RV32IFD-NEXT: fmv.s fa0, fs2
 ; RV32IFD-NEXT: call frexpf
-; RV32IFD-NEXT: addi a0, s0, 4
+; RV32IFD-NEXT: addi a0, sp, 8
 ; RV32IFD-NEXT: fmv.s fa0, fs1
 ; RV32IFD-NEXT: call frexpf
-; RV32IFD-NEXT: fmv.s fa0, fs2
-; RV32IFD-NEXT: mv a0, s0
+; RV32IFD-NEXT: addi a0, sp, 12
+; RV32IFD-NEXT: fmv.s fa0, fs0
 ; RV32IFD-NEXT: call frexpf
-; RV32IFD-NEXT: lw ra, 28(sp) # 4-byte Folded Reload
-; RV32IFD-NEXT: lw s0, 24(sp) # 4-byte Folded Reload
-; RV32IFD-NEXT: fld fs0, 16(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: fld fs1, 8(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: fld fs2, 0(sp) # 8-byte Folded Reload
-; RV32IFD-NEXT: addi sp, sp, 32
+; RV32IFD-NEXT: lw a0, 0(sp)
+; RV32IFD-NEXT: lw a1, 4(sp)
+; RV32IFD-NEXT: lw a2, 8(sp)
+; RV32IFD-NEXT: lw a3, 12(sp)
+; RV32IFD-NEXT: sw a0, 0(s0)
+; RV32IFD-NEXT: sw a1, 4(s0)
+; RV32IFD-NEXT: sw a2, 8(s0)
+; RV32IFD-NEXT: sw a3, 12(s0)
+; RV32IFD-NEXT: lw ra, 44(sp) # 4-byte Folded Reload
+; RV32IFD-NEXT: lw s0, 40(sp) # 4-byte Folded Reload
+; RV32IFD-NEXT: fld fs0, 32(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: fld fs1, 24(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: fld fs2, 16(sp) # 8-byte Folded Reload
+; RV32IFD-NEXT: addi sp, sp, 48
 ; RV32IFD-NEXT: ret
 ;
 ; RV64IFD-LABEL: test_frexp_v4f32_v4i32_only_use_exp:
@@ -1151,34 +1174,43 @@ define <4 x i32> @test_frexp_v4f32_v4i32_only_use_exp(<4 x float> %a) nounwind {
 ;
 ; RV32IZFINXZDINX-LABEL: test_frexp_v4f32_v4i32_only_use_exp:
 ; RV32IZFINXZDINX: # %bb.0:
-; RV32IZFINXZDINX-NEXT: addi sp, sp, -32
-; RV32IZFINXZDINX-NEXT: sw ra, 28(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s0, 24(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s1, 20(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s2, 16(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: sw s3, 12(sp) # 4-byte Folded Spill
-; RV32IZFINXZDINX-NEXT: mv s0, a3
-; RV32IZFINXZDINX-NEXT: mv s1, a2
-; RV32IZFINXZDINX-NEXT: mv s2, a1
+; RV32IZFINXZDINX-NEXT: addi sp, sp, -48
+; RV32IZFINXZDINX-NEXT: sw ra, 44(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s0, 40(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s1, 36(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s2, 32(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: sw s3, 28(sp) # 4-byte Folded Spill
+; RV32IZFINXZDINX-NEXT: mv s0, a4
+; RV32IZFINXZDINX-NEXT: mv s1, a3
+; RV32IZFINXZDINX-NEXT: mv s2, a2
+; RV32IZFINXZDINX-NEXT: mv a2, a1
 ; RV32IZFINXZDINX-NEXT: mv s3, a0
-; RV32IZFINXZDINX-NEXT: addi a1, a0, 12
-; RV32IZFINXZDINX-NEXT: mv a0, a4
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 12
+; RV32IZFINXZDINX-NEXT: mv a0, a2
 ; RV32IZFINXZDINX-NEXT: call frexpf
-; RV32IZFINXZDINX-NEXT: addi a1, s3, 8
-; RV32IZFINXZDINX-NEXT: mv a0, s0
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 16
+; RV32IZFINXZDINX-NEXT: mv a0, s2
 ; RV32IZFINXZDINX-NEXT: call frexpf
-; RV32IZFINXZDINX-NEXT: addi a1, s3, 4
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 20
 ; RV32IZFINXZDINX-NEXT: mv a0, s1
 ; RV32IZFINXZDINX-NEXT: call frexpf
-; RV32IZFINXZDINX-NEXT: mv a0, s2
-; RV32IZFINXZDINX-NEXT: mv a1, s3
+; RV32IZFINXZDINX-NEXT: addi a1, sp, 24
+; RV32IZFINXZDINX-NEXT: mv a0, s0
 ; RV32IZFINXZDINX-NEXT: call frexpf
-; RV32IZFINXZDINX-NEXT: lw ra, 28(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s0, 24(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s1, 20(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s2, 16(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: lw s3, 12(sp) # 4-byte Folded Reload
-; RV32IZFINXZDINX-NEXT: addi sp, sp, 32
+; RV32IZFINXZDINX-NEXT: lw a0, 12(sp)
+; RV32IZFINXZDINX-NEXT: lw a1, 16(sp)
+; RV32IZFINXZDINX-NEXT: lw a2, 20(sp)
+; RV32IZFINXZDINX-NEXT: lw a3, 24(sp)
+; RV32IZFINXZDINX-NEXT: sw a0, 0(s3)
+; RV32IZFINXZDINX-NEXT: sw a1, 4(s3)
+; RV32IZFINXZDINX-NEXT: sw a2, 8(s3)
+; RV32IZFINXZDINX-NEXT: sw a3, 12(s3)
+; RV32IZFINXZDINX-NEXT: lw ra, 44(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s0, 40(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s1, 36(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s2, 32(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: lw s3, 28(sp) # 4-byte Folded Reload
+; RV32IZFINXZDINX-NEXT: addi sp, sp, 48
 ; RV32IZFINXZDINX-NEXT: ret
 ;
 ; RV64IZFINXZDINX-LABEL: test_frexp_v4f32_v4i32_only_use_exp:
45 changes: 33 additions & 12 deletions llvm/test/CodeGen/X86/llvm.frexp.ll
@@ -325,27 +325,28 @@ define { <4 x float>, <4 x i32> } @test_frexp_v4f32_v4i32(<4 x float> %a) {
 ;
 ; WIN32-LABEL: test_frexp_v4f32_v4i32:
 ; WIN32: # %bb.0:
+; WIN32-NEXT: pushl %edi
 ; WIN32-NEXT: pushl %esi
-; WIN32-NEXT: subl $44, %esp
+; WIN32-NEXT: subl $60, %esp
 ; WIN32-NEXT: movl {{[0-9]+}}(%esp), %esi
-; WIN32-NEXT: leal 24(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
 ; WIN32-NEXT: fstpl {{[-0-9]+}}(%e{{[sb]}}p) # 8-byte Folded Spill
-; WIN32-NEXT: leal 20(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
 ; WIN32-NEXT: fstpl {{[-0-9]+}}(%e{{[sb]}}p) # 8-byte Folded Spill
-; WIN32-NEXT: leal 16(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
-; WIN32-NEXT: leal 28(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
@@ -360,13 +361,22 @@ define { <4 x float>, <4 x i32> } @test_frexp_v4f32_v4i32(<4 x float> %a) {
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %eax
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %edx
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %edi
+; WIN32-NEXT: movl %edi, 28(%esi)
+; WIN32-NEXT: movl %edx, 24(%esi)
+; WIN32-NEXT: movl %ecx, 20(%esi)
+; WIN32-NEXT: movl %eax, 16(%esi)
 ; WIN32-NEXT: fstps 12(%esi)
 ; WIN32-NEXT: fstps 8(%esi)
 ; WIN32-NEXT: fstps 4(%esi)
 ; WIN32-NEXT: fstps (%esi)
 ; WIN32-NEXT: movl %esi, %eax
-; WIN32-NEXT: addl $44, %esp
+; WIN32-NEXT: addl $60, %esp
 ; WIN32-NEXT: popl %esi
+; WIN32-NEXT: popl %edi
 ; WIN32-NEXT: retl
 %result = call { <4 x float>, <4 x i32> } @llvm.frexp.v4f32.v4i32(<4 x float> %a)
 ret { <4 x float>, <4 x i32> } %result
@@ -489,35 +499,46 @@ define <4 x i32> @test_frexp_v4f32_v4i32_only_use_exp(<4 x float> %a) {
 ;
 ; WIN32-LABEL: test_frexp_v4f32_v4i32_only_use_exp:
 ; WIN32: # %bb.0:
+; WIN32-NEXT: pushl %edi
 ; WIN32-NEXT: pushl %esi
-; WIN32-NEXT: subl $12, %esp
+; WIN32-NEXT: subl $28, %esp
 ; WIN32-NEXT: movl {{[0-9]+}}(%esp), %esi
-; WIN32-NEXT: leal 8(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
 ; WIN32-NEXT: fstp %st(0)
-; WIN32-NEXT: leal 4(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
 ; WIN32-NEXT: fstp %st(0)
-; WIN32-NEXT: leal 12(%esi), %eax
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
 ; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
 ; WIN32-NEXT: fstp %st(0)
-; WIN32-NEXT: movl %esi, {{[0-9]+}}(%esp)
+; WIN32-NEXT: leal {{[0-9]+}}(%esp), %eax
+; WIN32-NEXT: movl %eax, {{[0-9]+}}(%esp)
 ; WIN32-NEXT: flds {{[0-9]+}}(%esp)
 ; WIN32-NEXT: fstpl (%esp)
 ; WIN32-NEXT: calll _frexp
 ; WIN32-NEXT: fstp %st(0)
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %eax
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %edx
+; WIN32-NEXT: movl {{[0-9]+}}(%esp), %edi
+; WIN32-NEXT: movl %edi, 12(%esi)
+; WIN32-NEXT: movl %edx, 8(%esi)
+; WIN32-NEXT: movl %ecx, 4(%esi)
+; WIN32-NEXT: movl %eax, (%esi)
 ; WIN32-NEXT: movl %esi, %eax
-; WIN32-NEXT: addl $12, %esp
+; WIN32-NEXT: addl $28, %esp
 ; WIN32-NEXT: popl %esi
+; WIN32-NEXT: popl %edi
 ; WIN32-NEXT: retl
 %result = call { <4 x float>, <4 x i32> } @llvm.frexp.v4f32.v4i32(<4 x float> %a)
 %result.1 = extractvalue { <4 x float>, <4 x i32> } %result, 1