Merged
28 changes: 28 additions & 0 deletions llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1980,6 +1980,34 @@ InstructionCost RISCVTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
SlideCost = 1; // With a constant index, we do not need to use addi.
}

  // When the vector needs to be split into multiple register groups and the
  // index exceeds a single vector register group, we need to insert/extract
  // the element via the stack.
  if (LT.first > 1 &&
      ((Index == -1U) || (Index >= LT.second.getVectorMinNumElements() &&
                          LT.second.isScalableVector()))) {
    Type *ScalarType = Val->getScalarType();
    Align VecAlign = DL.getPrefTypeAlign(Val);
    Align SclAlign = DL.getPrefTypeAlign(ScalarType);
    // Extra addi for an unknown index.
    InstructionCost IdxCost = Index == -1U ? 1 : 0;

    // Store all split vectors to the stack and load the target element.
    if (Opcode == Instruction::ExtractElement)
      return getMemoryOpCost(Instruction::Store, Val, VecAlign, 0, CostKind) +
             getMemoryOpCost(Instruction::Load, ScalarType, SclAlign, 0,
Collaborator

You're missing the addressing cost in both cases here. For the vector, that should be handled inside getMemoryOpCost, but you need to include the ADDI for the non-constant index case on the scalar load or store.

Contributor Author

Added, thanks!

                             CostKind) +
             IdxCost;

    // Store all split vectors to the stack, store the target element, and load
    // the vectors back.
    return getMemoryOpCost(Instruction::Store, Val, VecAlign, 0, CostKind) +
           getMemoryOpCost(Instruction::Load, Val, VecAlign, 0, CostKind) +
           getMemoryOpCost(Instruction::Store, ScalarType, SclAlign, 0,
                           CostKind) +
           IdxCost;
  }

  // Extracting an i64 on a target with XLEN=32 needs more instructions.
  if (Val->getScalarType()->isIntegerTy() &&
      ST->getXLen() < Val->getScalarSizeInBits()) {
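
For readers who want to check the arithmetic of the stack path in this hunk, here is a minimal standalone sketch. It is not the LLVM implementation: `MemCosts`, `ElemOp`, `UnknownIndex`, `needsStackPath`, and `stackPathCost` are hypothetical names introduced only for illustration, and the memory-op costs are plain integers supplied by the caller rather than values returned by `getMemoryOpCost`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the four memory-op costs the patch queries.
// The caller supplies the numbers; nothing here talks to LLVM.
struct MemCosts {
  int64_t VecStore;    // spill the whole (split) vector to the stack
  int64_t VecLoad;     // reload the whole vector from the stack
  int64_t ScalarLoad;  // load a single element
  int64_t ScalarStore; // store a single element
};

enum class ElemOp { Extract, Insert };

// Sentinel for "index not known at compile time", mirroring -1U in the patch.
constexpr unsigned UnknownIndex = ~0u;

// Mirrors the guard in the hunk: the type is split into more than one part,
// and the index is either unknown or at least the minimum element count of a
// scalable legalized type.
bool needsStackPath(unsigned NumParts, unsigned Index, unsigned MinNumElts,
                    bool IsScalable) {
  return NumParts > 1 &&
         (Index == UnknownIndex || (Index >= MinNumElts && IsScalable));
}

// Sums the costs the same way the hunk does for each opcode.
int64_t stackPathCost(ElemOp Op, unsigned Index, const MemCosts &C) {
  assert(C.VecStore >= 0 && C.VecLoad >= 0 && C.ScalarLoad >= 0 &&
         C.ScalarStore >= 0 && "costs are expected to be non-negative");
  // One extra addi to form the element address when the index is unknown.
  const int64_t IdxCost = Index == UnknownIndex ? 1 : 0;

  // Extract: store the vector, then load the target element.
  if (Op == ElemOp::Extract)
    return C.VecStore + C.ScalarLoad + IdxCost;

  // Insert: store the vector, store the element, then load the vector back.
  return C.VecStore + C.VecLoad + C.ScalarStore + IdxCost;
}
```

With illustrative costs of 4 for each vector access and 1 for each scalar access, an extract with an unknown index sums to 4 + 1 + 1 = 6, and an insert with an unknown index sums to 4 + 4 + 1 + 1 = 10. The trailing 1 in both sums is the ADDI that the review comment asked to be included.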