2 changes: 2 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -2585,6 +2585,8 @@ SDValue DAGTypeLegalizer::PromoteIntOp_ExpOp(SDNode *N) {
                        : RTLIB::getLDEXP(N->getValueType(0));

  if (LC == RTLIB::UNKNOWN_LIBCALL || !TLI.getLibcallName(LC)) {
    if (N->getValueType(0).isVector())
      return DAG.UnrollVectorOp(N);
Contributor:
This whole function is actually the wrong place to split the vector (see how there are no other UnrollVectorOps uses in DAGTypeLegalizer). The description also says your problem is when the libcall is used, so you'd want to change the other path? Does the libcall emission below need to directly handle the vector case?

Contributor Author:
Sorry, I'm not entirely sure I understand what you mean. Could you please clarify?
Are you referring to the approach used in my first commit 7f5a128? If so, in that commit you mentioned not to check for the SIGN_EXTEND_INREG node, but that node is generated by this if statement here (SExtPromotedInteger will generate a SIGN_EXTEND_INREG node):

  if (LC == RTLIB::UNKNOWN_LIBCALL || !TLI.getLibcallName(LC)) {
    SmallVector<SDValue, 3> NewOps(N->ops());
    NewOps[1 + OpOffset] = SExtPromotedInteger(N->getOperand(1 + OpOffset));
    return SDValue(DAG.UpdateNodeOperands(N, NewOps), 0);
  }

Other info:

  1. RISCV64 and LoongArch64 will enter the function PromoteIntOp_ExpOp and generate a SIGN_EXTEND_INREG node, so ExponentHasSizeOfInt = false (32 != 64) and an error is reported:
  bool ExponentHasSizeOfInt =
      DAG.getLibInfo().getIntSize() ==
      Node->getOperand(1 + Offset).getValueType().getSizeInBits();

Additionally, AArch64 and X86 get ExponentHasSizeOfInt = true, because i32 is legal on those backends, so the exponent was not previously promoted to a SIGN_EXTEND_INREG of i64 and they generate correct code sequences.

  2. In the entire LLVM codebase, only fpowi and fldexp call the PromoteIntOp_ExpOp function (flow: llvm::DAGTypeLegalizer::PromoteIntegerOperand -> llvm::DAGTypeLegalizer::PromoteIntOp_ExpOp):
  case ISD::FPOWI:
  case ISD::STRICT_FPOWI:
  case ISD::FLDEXP:
  case ISD::STRICT_FLDEXP: Res = PromoteIntOp_ExpOp(N); break;

If not, could you provide more specific modification suggestions?

Contributor Author:
> This whole function is actually the wrong place to split the vector (see how there are no other UnrollVectorOps uses in DAGTypeLegalizer). The description also says your problem is when the libcall is used, so you'd want to change the other path? Does the libcall emission below need to directly handle the vector case?

@topperc what are your thoughts on this?

Collaborator:
> see how there are no other UnrollVectorOps uses in DAGTypeLegalizer

There are calls to UnrollVectorOps in several places in LegalizeVectorTypes.cpp

This comment, at the place where the libcall below is created, seems relevant:

  // We can't just promote the exponent type in FPOWI, since we want to lower
  // the node to a libcall and we if we promote to a type larger than
  // sizeof(int) the libcall might not be according to the targets ABI.

My suggestion to unroll here was so that we wouldn't promote past sizeof(int). If we wait until LegalizeDAG.cpp to unroll the operation, the damage to the integer type has already been done. For RISC-V it's harmless, because signed int is supposed to be passed sign extended to 64 bits according to the ABI.

Hypothetically, if powi took an unsigned int as an argument, then type legalization would use zero extend, but the RISC-V ABI wants unsigned int to be passed sign extended. So LegalizeDAG would need to insert a SIGN_EXTEND_INREG to fix it up. I guess it would need to use getIntSize() and shouldSignExtendTypeInLibCall to know what to do in that case.

If we don't unroll here, I guess the best fix in LegalizeDAG would also be to use getIntSize() and shouldSignExtendTypeInLibCall, plus computeNumSignBits, to know whether a SIGN_EXTEND_INREG needs to be inserted?
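
A rough sketch of the kind of LegalizeDAG-side fix-up that suggestion implies (hypothetical, not part of this PR; it assumes shouldSignExtendTypeInLibCall takes the argument's EVT, and that Node, Offset, DAG, and TLI are in scope as they are around ExpandFPLibCall):

  // Hypothetical fix-up before emitting the FPOWI/FLDEXP libcall: the
  // exponent may have been promoted past sizeof(int), so re-narrow its
  // value range to match the libcall's 'int' parameter.
  SDValue Exp = Node->getOperand(1 + Offset);
  EVT ExpVT = Exp.getValueType();
  unsigned IntBits = DAG.getLibInfo().getIntSize();
  if (ExpVT.getSizeInBits() > IntBits &&
      TLI.shouldSignExtendTypeInLibCall(ExpVT, /*IsSigned=*/true)) {
    EVT IntVT = EVT::getIntegerVT(*DAG.getContext(), IntBits);
    // Only materialize a sext_inreg when the value is not already known to
    // be sign extended from bit IntBits.
    if (DAG.ComputeNumSignBits(Exp) <= ExpVT.getSizeInBits() - IntBits)
      Exp = DAG.getNode(ISD::SIGN_EXTEND_INREG, SDLoc(Node), ExpVT, Exp,
                        DAG.getValueType(IntVT));
  }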

Contributor:
It makes more sense to me to handle this when emitting the call, where ABI constraints would naturally be handled. This can be sign extended here, and the call emission can truncate or sext_inreg as required.

Collaborator:
> It makes more sense to me to handle this when emitting the call, where ABI constraints would naturally be handled. This can be sign extended here, and the call emission can truncate or sext_inreg as required.

The POWI code in LegalizeDAG calls SelectionDAGLegalize::ExpandFPLibCall, which will call TargetLowering::makeLibCall using the promoted type. makeLibCall calls getTypeForEVT, which will return i64 due to the promotion. That's what will be used by call lowering, but it's the wrong type for call lowering to do the right thing. We need to get an i32 Type* into call lowering, but we no longer have it. We'd need to call getIntSize() to get the size and pass it along somehow. That requires refactoring several interfaces or adding new ones. I'm not sure if call lowering would also expect the SDValue for the argument to have i32 type.

My unrolling proposal avoided that by scalarizing it and letting the newly created scalar powi calls get converted to libcalls while we're still in type legalization. That's how we currently handle the scalar powi case. Are you also suggesting we should move the scalar handling to LegalizeDAG as well?
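
For reference, a simplified sketch of what the unrolling amounts to (roughly what DAG.UnrollVectorOp produces for the non-strict case; the real helper also handles chains and other operand arrangements):

  // Simplified illustration of scalarizing a vector FPOWI/FLDEXP during
  // type legalization: each lane becomes a scalar node that can still be
  // lowered to a libcall with a correctly sized exponent.
  EVT VT = N->getValueType(0);
  EVT EltVT = VT.getVectorElementType();
  SDLoc DL(N);
  SmallVector<SDValue, 8> Scalars;
  for (unsigned I = 0, E = VT.getVectorNumElements(); I != E; ++I) {
    SDValue Elt = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, EltVT,
                              N->getOperand(0),
                              DAG.getVectorIdxConstant(I, DL));
    Scalars.push_back(
        DAG.getNode(N->getOpcode(), DL, EltVT, Elt, N->getOperand(1)));
  }
  SDValue Res = DAG.getNode(ISD::BUILD_VECTOR, DL, VT, Scalars);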

Contributor:
makeLibCall should really be looking at the signature of the underlying call, but RuntimeLibcalls currently does not record this information. TargetLibraryInfo does, which is separate for some reason. This keeps coming up as a problem; these really need to be merged in some way (cc @jhuber6).

Taking the type from the DAG node isn't strictly correct; it's just where we've ended up. This came up recently for the special case in ExpandFPLibCall to sign extend the integer argument for FLDEXP.

Practically speaking, I don't think any targets will have a vector powi implementation (at least I don't see any in RuntimeLibcalls), so unrolling works out. I guess this could get a FIXME and go ahead for now, but it's still a hack.

    SmallVector<SDValue, 3> NewOps(N->ops());
    NewOps[1 + OpOffset] = SExtPromotedInteger(N->getOperand(1 + OpOffset));
    return SDValue(DAG.UpdateNodeOperands(N, NewOps), 0);
142 changes: 142 additions & 0 deletions llvm/test/CodeGen/LoongArch/lasx/fldexp.ll
@@ -0,0 +1,142 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
; RUN: llc --mtriple=loongarch64 --mattr=+lasx < %s | FileCheck %s

declare <8 x float> @llvm.ldexp.v8f32.i32(<8 x float>, i32)

define <8 x float> @ldexp_v8f32(<8 x float> %va, i32 %b) nounwind {
; CHECK-LABEL: ldexp_v8f32:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: addi.d $sp, $sp, -80
; CHECK-NEXT: st.d $ra, $sp, 72 # 8-byte Folded Spill
; CHECK-NEXT: st.d $fp, $sp, 64 # 8-byte Folded Spill
; CHECK-NEXT: xvst $xr0, $sp, 0 # 32-byte Folded Spill
; CHECK-NEXT: addi.w $fp, $a0, 0
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 0
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 0
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 1
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 1
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 2
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 2
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 3
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 3
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 4
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 4
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 5
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 5
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 6
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 6
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 7
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 7
; CHECK-NEXT: ld.d $fp, $sp, 64 # 8-byte Folded Reload
; CHECK-NEXT: ld.d $ra, $sp, 72 # 8-byte Folded Reload
; CHECK-NEXT: addi.d $sp, $sp, 80
; CHECK-NEXT: ret
entry:
%res = call <8 x float> @llvm.ldexp.v8f32.i32(<8 x float> %va, i32 %b)
ret <8 x float> %res
}

declare <4 x double> @llvm.ldexp.v4f64.i32(<4 x double>, i32)

define <4 x double> @ldexp_v4f64(<4 x double> %va, i32 %b) nounwind {
; CHECK-LABEL: ldexp_v4f64:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: addi.d $sp, $sp, -80
; CHECK-NEXT: st.d $ra, $sp, 72 # 8-byte Folded Spill
; CHECK-NEXT: st.d $fp, $sp, 64 # 8-byte Folded Spill
; CHECK-NEXT: xvst $xr0, $sp, 0 # 32-byte Folded Spill
; CHECK-NEXT: addi.w $fp, $a0, 0
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 0
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexp)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 0
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 1
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexp)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 1
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 2
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexp)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 2
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 3
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexp)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 3
; CHECK-NEXT: ld.d $fp, $sp, 64 # 8-byte Folded Reload
; CHECK-NEXT: ld.d $ra, $sp, 72 # 8-byte Folded Reload
; CHECK-NEXT: addi.d $sp, $sp, 80
; CHECK-NEXT: ret
entry:
%res = call <4 x double> @llvm.ldexp.v4f64.i32(<4 x double> %va, i32 %b)
ret <4 x double> %res
}
142 changes: 142 additions & 0 deletions llvm/test/CodeGen/LoongArch/lasx/fpowi.ll
@@ -0,0 +1,142 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
; RUN: llc --mtriple=loongarch64 --mattr=+lasx < %s | FileCheck %s

declare <8 x float> @llvm.powi.v8f32.i32(<8 x float>, i32)

define <8 x float> @powi_v8f32(<8 x float> %va, i32 %b) nounwind {
; CHECK-LABEL: powi_v8f32:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: addi.d $sp, $sp, -80
; CHECK-NEXT: st.d $ra, $sp, 72 # 8-byte Folded Spill
; CHECK-NEXT: st.d $fp, $sp, 64 # 8-byte Folded Spill
; CHECK-NEXT: xvst $xr0, $sp, 0 # 32-byte Folded Spill
; CHECK-NEXT: addi.w $fp, $a0, 0
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 0
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 0
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 1
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 1
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 2
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 2
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 3
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 3
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 4
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 4
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 5
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 5
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 6
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 6
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.w $a0, $xr0, 7
; CHECK-NEXT: movgr2fr.w $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powisf2)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.w $xr0, $a0, 7
; CHECK-NEXT: ld.d $fp, $sp, 64 # 8-byte Folded Reload
; CHECK-NEXT: ld.d $ra, $sp, 72 # 8-byte Folded Reload
; CHECK-NEXT: addi.d $sp, $sp, 80
; CHECK-NEXT: ret
entry:
%res = call <8 x float> @llvm.powi.v8f32.i32(<8 x float> %va, i32 %b)
ret <8 x float> %res
}

declare <4 x double> @llvm.powi.v4f64.i32(<4 x double>, i32)

define <4 x double> @powi_v4f64(<4 x double> %va, i32 %b) nounwind {
; CHECK-LABEL: powi_v4f64:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: addi.d $sp, $sp, -80
; CHECK-NEXT: st.d $ra, $sp, 72 # 8-byte Folded Spill
; CHECK-NEXT: st.d $fp, $sp, 64 # 8-byte Folded Spill
; CHECK-NEXT: xvst $xr0, $sp, 0 # 32-byte Folded Spill
; CHECK-NEXT: addi.w $fp, $a0, 0
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 0
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powidf2)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 0
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 1
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powidf2)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 1
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 2
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powidf2)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 2
; CHECK-NEXT: xvst $xr0, $sp, 32 # 32-byte Folded Spill
; CHECK-NEXT: xvld $xr0, $sp, 0 # 32-byte Folded Reload
; CHECK-NEXT: xvpickve2gr.d $a0, $xr0, 3
; CHECK-NEXT: movgr2fr.d $fa0, $a0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(__powidf2)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: xvld $xr0, $sp, 32 # 32-byte Folded Reload
; CHECK-NEXT: xvinsgr2vr.d $xr0, $a0, 3
; CHECK-NEXT: ld.d $fp, $sp, 64 # 8-byte Folded Reload
; CHECK-NEXT: ld.d $ra, $sp, 72 # 8-byte Folded Reload
; CHECK-NEXT: addi.d $sp, $sp, 80
; CHECK-NEXT: ret
entry:
%res = call <4 x double> @llvm.powi.v4f64.i32(<4 x double> %va, i32 %b)
ret <4 x double> %res
}
88 changes: 88 additions & 0 deletions llvm/test/CodeGen/LoongArch/lsx/fldexp.ll
@@ -0,0 +1,88 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
; RUN: llc --mtriple=loongarch64 --mattr=+lsx < %s | FileCheck %s

declare <4 x float> @llvm.ldexp.v4f32.i32(<4 x float>, i32)

define <4 x float> @ldexp_v4f32(<4 x float> %va, i32 %b) nounwind {
; CHECK-LABEL: ldexp_v4f32:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: addi.d $sp, $sp, -48
; CHECK-NEXT: st.d $ra, $sp, 40 # 8-byte Folded Spill
; CHECK-NEXT: st.d $fp, $sp, 32 # 8-byte Folded Spill
; CHECK-NEXT: vst $vr0, $sp, 0 # 16-byte Folded Spill
; CHECK-NEXT: addi.w $fp, $a0, 0
; CHECK-NEXT: vreplvei.w $vr0, $vr0, 0
; CHECK-NEXT: # kill: def $f0 killed $f0 killed $vr0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: vinsgr2vr.w $vr0, $a0, 0
; CHECK-NEXT: vst $vr0, $sp, 16 # 16-byte Folded Spill
; CHECK-NEXT: vld $vr0, $sp, 0 # 16-byte Folded Reload
; CHECK-NEXT: vreplvei.w $vr0, $vr0, 1
; CHECK-NEXT: # kill: def $f0 killed $f0 killed $vr0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: vld $vr0, $sp, 16 # 16-byte Folded Reload
; CHECK-NEXT: vinsgr2vr.w $vr0, $a0, 1
; CHECK-NEXT: vst $vr0, $sp, 16 # 16-byte Folded Spill
; CHECK-NEXT: vld $vr0, $sp, 0 # 16-byte Folded Reload
; CHECK-NEXT: vreplvei.w $vr0, $vr0, 2
; CHECK-NEXT: # kill: def $f0 killed $f0 killed $vr0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: vld $vr0, $sp, 16 # 16-byte Folded Reload
; CHECK-NEXT: vinsgr2vr.w $vr0, $a0, 2
; CHECK-NEXT: vst $vr0, $sp, 16 # 16-byte Folded Spill
; CHECK-NEXT: vld $vr0, $sp, 0 # 16-byte Folded Reload
; CHECK-NEXT: vreplvei.w $vr0, $vr0, 3
; CHECK-NEXT: # kill: def $f0 killed $f0 killed $vr0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexpf)
; CHECK-NEXT: movfr2gr.s $a0, $fa0
; CHECK-NEXT: vld $vr0, $sp, 16 # 16-byte Folded Reload
; CHECK-NEXT: vinsgr2vr.w $vr0, $a0, 3
; CHECK-NEXT: ld.d $fp, $sp, 32 # 8-byte Folded Reload
; CHECK-NEXT: ld.d $ra, $sp, 40 # 8-byte Folded Reload
; CHECK-NEXT: addi.d $sp, $sp, 48
; CHECK-NEXT: ret
entry:
%res = call <4 x float> @llvm.ldexp.v4f32.i32(<4 x float> %va, i32 %b)
ret <4 x float> %res
}

declare <2 x double> @llvm.ldexp.v2f64.i32(<2 x double>, i32)

define <2 x double> @ldexp_v2f64(<2 x double> %va, i32 %b) nounwind {
; CHECK-LABEL: ldexp_v2f64:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: addi.d $sp, $sp, -48
; CHECK-NEXT: st.d $ra, $sp, 40 # 8-byte Folded Spill
; CHECK-NEXT: st.d $fp, $sp, 32 # 8-byte Folded Spill
; CHECK-NEXT: vst $vr0, $sp, 0 # 16-byte Folded Spill
; CHECK-NEXT: addi.w $fp, $a0, 0
; CHECK-NEXT: vreplvei.d $vr0, $vr0, 0
; CHECK-NEXT: # kill: def $f0_64 killed $f0_64 killed $vr0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexp)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: vinsgr2vr.d $vr0, $a0, 0
; CHECK-NEXT: vst $vr0, $sp, 16 # 16-byte Folded Spill
; CHECK-NEXT: vld $vr0, $sp, 0 # 16-byte Folded Reload
; CHECK-NEXT: vreplvei.d $vr0, $vr0, 1
; CHECK-NEXT: # kill: def $f0_64 killed $f0_64 killed $vr0
; CHECK-NEXT: move $a0, $fp
; CHECK-NEXT: bl %plt(ldexp)
; CHECK-NEXT: movfr2gr.d $a0, $fa0
; CHECK-NEXT: vld $vr0, $sp, 16 # 16-byte Folded Reload
; CHECK-NEXT: vinsgr2vr.d $vr0, $a0, 1
; CHECK-NEXT: ld.d $fp, $sp, 32 # 8-byte Folded Reload
; CHECK-NEXT: ld.d $ra, $sp, 40 # 8-byte Folded Reload
; CHECK-NEXT: addi.d $sp, $sp, 48
; CHECK-NEXT: ret
entry:
%res = call <2 x double> @llvm.ldexp.v2f64.i32(<2 x double> %va, i32 %b)
ret <2 x double> %res
}