[VPlan] Use VPInstructionWithType for uniform casts. #140623
base: main
Changes from all commits
1384252
5cf434b
712ef4b
3cf0a98
b070f34
ce8b8c8
5c4fae7
6aa08c0
f184afc
7289bd8
@@ -199,7 +199,7 @@ class VPRecipeBuilder {
   /// Build a VPReplicationRecipe for \p I using \p Operands. If it is
Collaborator
Suggested change
   /// predicated, add the mask as last operand. Range.End may be decreased to
   /// ensure same recipe behavior from \p Range.Start to \p Range.End.
-  VPReplicateRecipe *handleReplication(Instruction *I,
+  VPSingleDefRecipe *handleReplication(Instruction *I,
                                        ArrayRef<VPValue *> Operands,
                                        VFRange &Range);
 
@@ -919,6 +919,9 @@ class VPInstruction : public VPRecipeWithIRFlags,
                       public VPUnrollPartAccessor<1> {
   friend class VPlanSlp;
 
+  /// True if the VPInstruction produces a single scalar value.
Collaborator
And false if it doesn't, i.e., if the VPInstruction produces more than one scalar value, or none: it produces a vector instead, or produces no value at all, as in branches (for stores, do we consider the stored value as being "produced"?).
+  bool IsSingleScalar;
+
Comment on lines +922 to +924
Collaborator
For the purpose of this patch, which models uniform cast operations via VPInstructionWithType instead of VPReplicateRecipe, would it suffice to introduce IsSingleScalar in VPInstructionWithType rather than in VPInstruction? Adding IsSingleScalar to VPInstruction potentially lays the ground for future handling beyond VPInstructionWithType-for-scalar-casts. This raises several questions going forward: (1) what information to record: an IsSingleScalar bit or more? (2) where to record it: everywhere or only where needed? (3) how does this relate to Type information?

Regarding (1): every VPValue may represent (a) a single scalar, (b) multiple scalars (VFxUF, VF after unrolling), or (c) a single vector (of VFxUF elements, VF after unrolling). Note that "VF" may differ from the one common across a vector loop, when considering interleaved groups and/or SLP.

Regarding (2): single-scalar information may be inferred from uniformity/divergence analysis coupled with demanded-elements analysis (only first lane used), analogous to min-bitwidth information inferred from range propagation coupled with demanded bits. The latter encodes its results, i.e., the narrower element Type of VPValues, only where needed: in live-ins and type-changing casts, relying on VPTypeAnalysis to infer the types of other VPValues. Would it suffice to record single-scalar-ness also only where needed, i.e., in recipes that create or change this behavior, relying on vputils::isSingleScalar() for other VPValues (possibly caching its results as in VPTypeAnalysis)? Casts, on the other hand, typically modify only the element type of their operand, retaining its single-scalar-ness, as in PreservesUniformity.

Regarding (3): single-scalar-ness could potentially be recorded alongside the element Type, as in LLVM-IR: one can use an Array type for (b), a scalable vector type for (c), and neither, i.e., the plain element type, for (a). Alternatively, one can introduce a VPType which holds information (1) alongside but separate from the element Type. In both cases it would seem natural for VPInstructionWithType to represent information (1).
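To make the "record only where needed" idea from (2) concrete, here is a minimal standalone sketch. It is a hypothetical model and not LLVM's API: `ValueShape`, `Node`, and `isSingleScalar` are invented names, and the inference rule (an unannotated value is a single scalar only if all of its operands are) is a deliberate simplification that ignores demanded-lanes analysis.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical: the three shapes a value may take, per point (1).
enum class ValueShape { SingleScalar, MultipleScalars, SingleVector };

// Hypothetical node: a shape is recorded only on nodes that create or
// change it (e.g. live-ins); all other nodes leave it unset.
struct Node {
  std::optional<ValueShape> Recorded;
  std::vector<const Node *> Operands;
};

// A recorded shape wins. Otherwise assume the node preserves the shape of
// its operands: it is a single scalar only if every operand is one.
// Nodes with no record and no operands are conservatively not single scalar.
bool isSingleScalar(const Node &N) {
  if (N.Recorded)
    return *N.Recorded == ValueShape::SingleScalar;
  if (N.Operands.empty())
    return false;
  return std::all_of(N.Operands.begin(), N.Operands.end(),
                     [](const Node *Op) { return isSingleScalar(*Op); });
}
```

In this model a cast of a single-scalar live-in needs no annotation of its own, which mirrors how casts retain the single-scalar-ness of their operand. A production version would presumably cache results, as suggested above for VPTypeAnalysis.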
 public:
   /// VPlan opcodes, extending LLVM IR with idiomatics instructions.
   enum {
@@ -1009,7 +1012,7 @@ class VPInstruction : public VPRecipeWithIRFlags,
 
   VPInstruction(unsigned Opcode, ArrayRef<VPValue *> Operands,
                 const VPIRFlags &Flags, DebugLoc DL = {},
-                const Twine &Name = "");
+                const Twine &Name = "", bool IsSingleScalar = false);
 
   VP_CLASSOF_IMPL(VPDef::VPInstructionSC)
 
@@ -1096,8 +1099,9 @@ class VPInstructionWithType : public VPInstruction {
 public:
   VPInstructionWithType(unsigned Opcode, ArrayRef<VPValue *> Operands,
                         Type *ResultTy, const VPIRFlags &Flags, DebugLoc DL,
-                        const Twine &Name = "")
-      : VPInstruction(Opcode, Operands, Flags, DL, Name), ResultTy(ResultTy) {}
+                        bool IsSingleScalar = false, const Twine &Name = "")
+      : VPInstruction(Opcode, Operands, Flags, DL, Name, IsSingleScalar),
+        ResultTy(ResultTy) {}
 
   static inline bool classof(const VPRecipeBase *R) {
     // VPInstructionWithType are VPInstructions with specific opcodes requiring
@@ -1124,7 +1128,7 @@ class VPInstructionWithType : public VPInstruction {
     SmallVector<VPValue *, 2> Operands(operands());
     auto *New =
         new VPInstructionWithType(getOpcode(), Operands, getResultType(), *this,
-                                  getDebugLoc(), getName());
+                                  getDebugLoc(), isSingleScalar(), getName());
     New->setUnderlyingValue(getUnderlyingValue());
     return New;
   }
@@ -1133,10 +1137,7 @@ class VPInstructionWithType : public VPInstruction {
 
   /// Return the cost of this VPInstruction.
   InstructionCost computeCost(ElementCount VF,
-                              VPCostContext &Ctx) const override {
-    // TODO: Compute accurate cost after retiring the legacy cost model.
-    return 0;
-  }
+                              VPCostContext &Ctx) const override;
 
   Type *getResultType() const { return ResultTy; }
 
@@ -410,9 +410,10 @@ template class VPUnrollPartAccessor<3>;
 
 VPInstruction::VPInstruction(unsigned Opcode, ArrayRef<VPValue *> Operands,
                              const VPIRFlags &Flags, DebugLoc DL,
-                             const Twine &Name)
+                             const Twine &Name, bool IsSingleScalar)
     : VPRecipeWithIRFlags(VPDef::VPInstructionSC, Operands, Flags, DL),
-      VPIRMetadata(), Opcode(Opcode), Name(Name.str()) {
+      VPIRMetadata(), IsSingleScalar(IsSingleScalar), Opcode(Opcode),
+      Name(Name.str()) {
   assert(flagsValidForOpcode(getOpcode()) &&
          "Set flags not supported for the provided opcode");
 }
@@ -866,7 +867,8 @@ bool VPInstruction::isVectorToScalar() const {
 }
 
 bool VPInstruction::isSingleScalar() const {
-  return getOpcode() == Instruction::PHI || isScalarCast();
Collaborator
Should
+  // TODO: Set IsSingleScalar for PHI.
+  return IsSingleScalar || getOpcode() == Instruction::PHI;
 }
 
 void VPInstruction::execute(VPTransformState &State) {
@@ -1079,13 +1081,16 @@ void VPInstruction::print(raw_ostream &O, const Twine &Indent,
 
 void VPInstructionWithType::execute(VPTransformState &State) {
   State.setDebugLocFrom(getDebugLoc());
-  if (isScalarCast()) {
+  if (Instruction::isCast(getOpcode())) {
Collaborator
Suggested change
before working with VPLane(0) only below?
     Value *Op = State.get(getOperand(0), VPLane(0));
     Value *Cast = State.Builder.CreateCast(Instruction::CastOps(getOpcode()),
                                            Op, ResultTy);
+    if (auto *I = dyn_cast<Instruction>(Cast))
+      applyFlags(*I);
     State.set(this, Cast, VPLane(0));
     return;
   }
+
   switch (getOpcode()) {
   case VPInstruction::StepVector: {
     Value *StepVector =
@@ -1098,6 +1103,15 @@ void VPInstructionWithType::execute(VPTransformState &State) {
   }
 }
 
+InstructionCost VPInstructionWithType::computeCost(ElementCount VF,
+                                                   VPCostContext &Ctx) const {
+  // TODO: Compute cost for VPInstructions without underlying values once
+  // the legacy cost model has been retired.
+  if (!getUnderlyingValue())
+    return 0;
+  return Ctx.getLegacyCost(cast<Instruction>(getUnderlyingValue()), VF);
+}
+
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 void VPInstructionWithType::print(raw_ostream &O, const Twine &Indent,
                                   VPSlotTracker &SlotTracker) const {
@@ -1643,12 +1657,13 @@ bool VPIRFlags::flagsValidForOpcode(unsigned Opcode) const {
     return Opcode == Instruction::FAdd || Opcode == Instruction::FMul ||
            Opcode == Instruction::FSub || Opcode == Instruction::FNeg ||
            Opcode == Instruction::FDiv || Opcode == Instruction::FRem ||
+           Opcode == Instruction::FPTrunc || Opcode == Instruction::FPExt ||
            Opcode == Instruction::FCmp || Opcode == Instruction::Select ||
            Opcode == VPInstruction::WideIVStep ||
            Opcode == VPInstruction::ReductionStartVector ||
            Opcode == VPInstruction::ComputeReductionResult;
   case OperationType::NonNegOp:
-    return Opcode == Instruction::ZExt;
+    return Opcode == Instruction::UIToFP || Opcode == Instruction::ZExt;
     break;
   case OperationType::Cmp:
     return Opcode == Instruction::ICmp;
@@ -1031,8 +1031,15 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
         unsigned ExtOpcode = match(R.getOperand(0), m_SExt(m_VPValue()))
                                  ? Instruction::SExt
                                  : Instruction::ZExt;
-        auto *VPC =
-            new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A, TruncTy);
+        VPSingleDefRecipe *VPC;
+        if (vputils::isSingleScalar(Def))
Collaborator
Note that this asks if a VPSingleDefRecipe Def is single scalar, where Def (not being a VPInstruction) does not record its own IsSingleScalar, defaulting to the vputils function to figure it out.
+          VPC = new VPInstructionWithType(Instruction::CastOps(ExtOpcode), {A},
+                                          TruncTy, {}, Def->getDebugLoc(),
+                                          /*IsSingleScalar=*/true);
+        else
+          VPC = new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A,
+                                      TruncTy, {}, Def->getDebugLoc());
+
         if (auto *UnderlyingExt = R.getOperand(0)->getUnderlyingValue()) {
           // UnderlyingExt has distinct return type, used to retain legacy cost.
           VPC->setUnderlyingValue(UnderlyingExt);
Is this a default behavior? I.e., if one isn't sure or doesn't care whether a VPInstruction processes a single scalar or not, then IsSingleScalar is set to false. If so, perhaps MustBeSingleScalar or IsKnownSingleScalar would be more accurate, and/or an Unset alternative. In general, additional alternative values may be relevant, see below.
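One way to picture the "Unset" alternative raised here is a tri-state in place of the bool. The sketch below is purely illustrative and not part of the patch: `SingleScalarState` and `resolveSingleScalar` are hypothetical names, and the `Inferred` argument stands in for whatever an analysis such as vputils::isSingleScalar() would report.

```cpp
#include <cassert>

// Hypothetical tri-state: distinguishes "known not single scalar" from
// "nobody recorded anything", which a plain bool defaulting to false cannot.
enum class SingleScalarState { Unset, KnownSingleScalar, KnownNotSingleScalar };

// A recorded answer wins; Unset defers to the inferred answer.
bool resolveSingleScalar(SingleScalarState Recorded, bool Inferred) {
  switch (Recorded) {
  case SingleScalarState::KnownSingleScalar:
    return true;
  case SingleScalarState::KnownNotSingleScalar:
    return false;
  case SingleScalarState::Unset:
    return Inferred;
  }
  return Inferred; // Not reached for valid enum values.
}
```

With a plain bool, `false` has to cover both "known not single scalar" and "not recorded"; the tri-state keeps these apart at the cost of one extra case at every query.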