Skip to content

Commit b690ec5

Browse files
fodinabortstellar
authored andcommitted
[LV] Parallel annotated loop does not imply all loads can be hoisted.
As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety if a loop is annotated parallel (`!llvm.loop.parallel_accesses`), is not expectable, the documentation for this behavior was since removed from the LangRef again, and can lead to invalid reads. This was observed in POCL (pocl/pocl#757) and would require similar workarounds in current work at hipSYCL. The question remains why this was initially added and what the implications of removing this optimization would be. Do we need an alternative mechanism to propagate the information about legality of if-conversion? Or is the idea that conditional loads in `#pragma clang loop vectorize(assume_safety)` can be executed unmasked without additional checks flawed in general? I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous. Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103907 (cherry picked from commit 4f01122)
1 parent 385a6f3 commit b690ec5

File tree

4 files changed

+10
-65
lines changed

4 files changed

+10
-65
lines changed

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h

Lines changed: 5 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -396,22 +396,17 @@ class LoopVectorizationLegality {
396396
bool canVectorizeOuterLoop();
397397

398398
/// Return true if all of the instructions in the block can be speculatively
399-
/// executed, and record the loads/stores that require masking. If's that
400-
/// guard loads can be ignored under "assume safety" unless \p PreserveGuards
401-
/// is true. This can happen when we introduces guards for which the original
402-
/// "unguarded-loads are safe" assumption does not hold. For example, the
403-
/// vectorizer's fold-tail transformation changes the loop to execute beyond
404-
/// its original trip-count, under a proper guard, which should be preserved.
399+
/// executed, and record the loads/stores that require masking.
405400
/// \p SafePtrs is a list of addresses that are known to be legal and we know
406401
/// that we can read from them without segfault.
407402
/// \p MaskedOp is a list of instructions that have to be transformed into
408403
/// calls to the appropriate masked intrinsic when the loop is vectorized.
409404
/// \p ConditionalAssumes is a list of assume instructions in predicated
410405
/// blocks that must be dropped if the CFG gets flattened.
411-
bool blockCanBePredicated(BasicBlock *BB, SmallPtrSetImpl<Value *> &SafePtrs,
412-
SmallPtrSetImpl<const Instruction *> &MaskedOp,
413-
SmallPtrSetImpl<Instruction *> &ConditionalAssumes,
414-
bool PreserveGuards = false) const;
406+
bool blockCanBePredicated(
407+
BasicBlock *BB, SmallPtrSetImpl<Value *> &SafePtrs,
408+
SmallPtrSetImpl<const Instruction *> &MaskedOp,
409+
SmallPtrSetImpl<Instruction *> &ConditionalAssumes) const;
415410

416411
/// Updates the vectorization state by adding \p Phi to the inductions list.
417412
/// This can set \p Phi as the main induction of the loop if \p Phi is a

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

Lines changed: 3 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -925,10 +925,7 @@ bool LoopVectorizationLegality::blockNeedsPredication(BasicBlock *BB) {
925925
bool LoopVectorizationLegality::blockCanBePredicated(
926926
BasicBlock *BB, SmallPtrSetImpl<Value *> &SafePtrs,
927927
SmallPtrSetImpl<const Instruction *> &MaskedOp,
928-
SmallPtrSetImpl<Instruction *> &ConditionalAssumes,
929-
bool PreserveGuards) const {
930-
const bool IsAnnotatedParallel = TheLoop->isAnnotatedParallel();
931-
928+
SmallPtrSetImpl<Instruction *> &ConditionalAssumes) const {
932929
for (Instruction &I : *BB) {
933930
// Check that we don't have a constant expression that can trap as operand.
934931
for (Value *Operand : I.operands()) {
@@ -956,11 +953,7 @@ bool LoopVectorizationLegality::blockCanBePredicated(
956953
if (!LI)
957954
return false;
958955
if (!SafePtrs.count(LI->getPointerOperand())) {
959-
// !llvm.mem.parallel_loop_access implies if-conversion safety.
960-
// Otherwise, record that the load needs (real or emulated) masking
961-
// and let the cost model decide.
962-
if (!IsAnnotatedParallel || PreserveGuards)
963-
MaskedOp.insert(LI);
956+
MaskedOp.insert(LI);
964957
continue;
965958
}
966959
}
@@ -1276,8 +1269,7 @@ bool LoopVectorizationLegality::prepareToFoldTailByMasking() {
12761269
// do not need predication such as the header block.
12771270
for (BasicBlock *BB : TheLoop->blocks()) {
12781271
if (!blockCanBePredicated(BB, SafePointers, TmpMaskedOp,
1279-
TmpConditionalAssumes,
1280-
/* MaskAllLoads= */ true)) {
1272+
TmpConditionalAssumes)) {
12811273
LLVM_DEBUG(dbgs() << "LV: Cannot fold tail by masking as requested.\n");
12821274
return false;
12831275
}

llvm/test/Transforms/LoopVectorize/X86/force-ifcvt.ll

Lines changed: 0 additions & 42 deletions
This file was deleted.

llvm/test/Transforms/LoopVectorize/X86/tail_folding_and_assume_safety.ll

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,7 @@ for.inc:
5151
br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !8
5252
}
5353

54-
; Case2: With pragma assume_safety only the store is masked.
54+
; Case2: With pragma assume_safety both, load and store are masked.
5555
; void assume_safety(int * p, int * q1, int * q2, int guard) {
5656
; #pragma clang loop vectorize(assume_safety)
5757
; for(int ix=0; ix < 1021; ++ix) {
@@ -63,7 +63,7 @@ for.inc:
6363

6464
;CHECK-LABEL: @assume_safety
6565
;CHECK: vector.body:
66-
;CHECK-NOT: @llvm.masked.load
66+
;CHECK: call <8 x i32> @llvm.masked.load
6767
;CHECK: call void @llvm.masked.store
6868

6969
; Function Attrs: norecurse nounwind uwtable

0 commit comments

Comments
 (0)