Commit 869a0e4

[LSV] Merge contiguous chains across scalar types
This change enables the Load/Store Vectorizer to merge and vectorize contiguous chains even when their scalar element types differ, as long as the total bitwidth matches. To do so, we rebase offsets between chains, normalize value types to a common integer type, and insert the necessary casts around loads and stores. This uncovers more vectorization opportunities and explains the expected codegen updates across AMDGPU tests.

Key changes:
- Chain merging
  - Build contiguous subchains and then merge adjacent ones when:
    - They refer to the same underlying pointer object and address space.
    - They are either all loads or all stores.
    - A constant leader-to-leader delta exists.
    - Rebasing one chain into the other's coordinate space does not overlap.
    - All elements have equal total bit width.
  - Rebase the second chain by the computed delta and append it to the first.
- Type normalization and casting
  - Normalize merged chains to a common integer type sized to the total bits.
  - For loads: create a new load of the normalized type, copy metadata, and cast back to the original type for uses if needed.
  - For stores: bitcast the value to the normalized type and store that.
  - Insert zext/trunc for integer size changes; use bit-or-pointer casts when sizes match.
- Cleanups
  - Erase replaced instructions and DCE pointer operands when safe.
- New helpers: computeLeaderDelta, chainsOverlapAfterRebase, rebaseChain, normalizeChainToType, and allElemsMatchTotalBits.

Impact:
- Increases vectorization opportunities across mixed-typed but size-compatible access chains.
- Large set of expected AMDGPU codegen diffs due to more/changed vectorization.
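The merge conditions listed under "Chain merging" can be sketched in plain C++ (C++17, no LLVM dependency). This is a simplified, hypothetical model rather than the code in this commit: the `Elem` and `Chain` structs, the `tryMergeChains` driver, and every signature are assumptions made for illustration. Only the helper names come from the commit message. It also assumes "equal total bit width" means that every element in both chains has the same size in bits.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified model of an access chain; not LLVM's own types.
struct Elem {
  int64_t OffsetBytes; // offset from the chain's leader
  unsigned Bits;       // total bit width of the accessed value
};

struct Chain {
  int ObjectId;                      // stand-in for the underlying pointer object
  unsigned AddrSpace;
  bool IsLoad;                       // true: all loads, false: all stores
  std::optional<int64_t> LeaderAddr; // leader's constant offset, if known
  std::vector<Elem> Elems;
};

// Constant distance from A's leader to B's leader, if both are known.
std::optional<int64_t> computeLeaderDelta(const Chain &A, const Chain &B) {
  if (!A.LeaderAddr || !B.LeaderAddr)
    return std::nullopt;
  return *B.LeaderAddr - *A.LeaderAddr;
}

// Assumed reading: every element in both chains has the same bit width.
bool allElemsMatchTotalBits(const Chain &A, const Chain &B) {
  if (A.Elems.empty() || B.Elems.empty())
    return false;
  const unsigned Bits = A.Elems.front().Bits;
  for (const Elem &E : A.Elems)
    if (E.Bits != Bits)
      return false;
  for (const Elem &E : B.Elems)
    if (E.Bits != Bits)
      return false;
  return true;
}

// True if any element of B, shifted by Delta, overlaps an element of A.
bool chainsOverlapAfterRebase(const Chain &A, const Chain &B, int64_t Delta) {
  for (const Elem &EA : A.Elems)
    for (const Elem &EB : B.Elems) {
      assert(EA.Bits % 8 == 0 && EB.Bits % 8 == 0 && "byte-sized elements only");
      const int64_t BeginA = EA.OffsetBytes, EndA = BeginA + EA.Bits / 8;
      const int64_t BeginB = EB.OffsetBytes + Delta, EndB = BeginB + EB.Bits / 8;
      if (BeginA < EndB && BeginB < EndA)
        return true;
    }
  return false;
}

// Express B's offsets relative to A's leader.
void rebaseChain(Chain &B, int64_t Delta) {
  for (Elem &E : B.Elems)
    E.OffsetBytes += Delta;
}

// Appends B to A when every listed condition holds; returns whether it merged.
bool tryMergeChains(Chain &A, Chain B) {
  if (A.ObjectId != B.ObjectId || A.AddrSpace != B.AddrSpace)
    return false;
  if (A.IsLoad != B.IsLoad)
    return false;
  const std::optional<int64_t> Delta = computeLeaderDelta(A, B);
  if (!Delta)
    return false;
  if (!allElemsMatchTotalBits(A, B))
    return false;
  if (chainsOverlapAfterRebase(A, B, *Delta))
    return false;
  rebaseChain(B, *Delta);
  A.Elems.insert(A.Elems.end(), B.Elems.begin(), B.Elems.end());
  return true;
}
```

In this model, a chain of two 32-bit loads at bytes 0 and 4 merges with a second chain of two 32-bit loads whose leader sits 8 bytes later, whatever the scalar type of each element. A second chain whose leader sits only 4 bytes later is rejected because it would overlap after rebasing.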
1 parent 73651ba commit 869a0e4

66 files changed, +5691 -5419 lines changed

llvm/include/llvm/Transforms/Utils/Local.h

Lines changed: 4 additions & 0 deletions
@@ -433,6 +433,10 @@ LLVM_ABI void combineAAMetadata(Instruction *K, const Instruction *J);
 /// replacement for the source instruction).
 LLVM_ABI void copyMetadataForLoad(LoadInst &Dest, const LoadInst &Source);
 
+/// Copy the metadata from the source instruction to the destination (the
+/// replacement for the source instruction).
+LLVM_ABI void copyMetadataForStore(StoreInst &Dest, const StoreInst &Source);
+
 /// Patch the replacement so that it is not more restrictive than the value
 /// being replaced. It assumes that the replacement does not get moved from
 /// its original position.

llvm/lib/Transforms/Utils/Local.cpp

Lines changed: 45 additions & 0 deletions
@@ -3146,6 +3146,51 @@ void llvm::copyMetadataForLoad(LoadInst &Dest, const LoadInst &Source) {
   }
 }
 
+void llvm::copyMetadataForStore(StoreInst &Dest, const StoreInst &Source) {
+  SmallVector<std::pair<unsigned, MDNode *>, 8> MD;
+  Source.getAllMetadata(MD);
+  MDBuilder MDB(Dest.getContext());
+  Type *NewType = Dest.getType();
+  for (const auto &MDPair : MD) {
+    unsigned ID = MDPair.first;
+    MDNode *N = MDPair.second;
+    switch (ID) {
+    case LLVMContext::MD_dbg:
+    case LLVMContext::MD_prof:
+    case LLVMContext::MD_tbaa_struct:
+    case LLVMContext::MD_alias_scope:
+    case LLVMContext::MD_noalias:
+    case LLVMContext::MD_nontemporal:
+    case LLVMContext::MD_access_group:
+    case LLVMContext::MD_noundef:
+    case LLVMContext::MD_noalias_addrspace:
+    case LLVMContext::MD_mem_parallel_loop_access:
+      Dest.setMetadata(ID, N);
+      break;
+
+    case LLVMContext::MD_tbaa: {
+      MDNode *NewTyNode =
+          MDB.createTBAAScalarTypeNode(NewType->getStructName(), N);
+      Dest.setMetadata(LLVMContext::MD_tbaa, NewTyNode);
+      break;
+    }
+    case LLVMContext::MD_nonnull:
+      break;
+
+    case LLVMContext::MD_align:
+    case LLVMContext::MD_dereferenceable:
+    case LLVMContext::MD_dereferenceable_or_null:
+      // These only directly apply if the new type is also a pointer.
+      if (NewType->isPointerTy())
+        Dest.setMetadata(ID, N);
+      break;
+
+    case LLVMContext::MD_range:
+      break;
+    }
+  }
+}
+
 void llvm::patchReplacementInstruction(Instruction *I, Value *Repl) {
   auto *ReplInst = dyn_cast<Instruction>(Repl);
   if (!ReplInst)
