@fhahn
Contributor

@fhahn fhahn commented Nov 23, 2025

VPVector(End)PointerRecipes are single-scalar if all their operands are. This should be effectively NFC currently, but it should re-enable cost checking for some more VPWidenMemoryRecipe after #157387 as discovered by @john-brawn-arm.
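
The rule in the description can be illustrated with a toy model. The sketch below is not the LLVM implementation: `Kind`, `Value`, and this standalone `isSingleScalar` are hypothetical stand-ins for the VPlan classes, and it assumes that live-ins are always single-scalar and that widened loads never are.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for VPlan recipe kinds; these are not LLVM classes.
enum class Kind { LiveIn, WidenLoad, WidenGEP, VectorPointer, VectorEndPointer };

struct Value {
  Kind K;
  std::vector<const Value *> Operands;
};

// Toy version of the rule: a value is single-scalar if it is a live-in, or
// if it is one of the "transparent" kinds and every operand is single-scalar.
// Everything else is conservatively treated as producing a value per lane.
bool isSingleScalar(const Value *V) {
  assert(V && "expected a non-null value");
  switch (V->K) {
  case Kind::LiveIn:
    return true; // Assumption of this sketch: live-ins are loop-invariant.
  case Kind::WidenGEP:
  case Kind::VectorPointer:
  case Kind::VectorEndPointer:
    return std::all_of(V->Operands.begin(), V->Operands.end(), isSingleScalar);
  case Kind::WidenLoad:
    return false; // Assumption of this sketch: widened loads vary per lane.
  }
  return false;
}
```

In this model, a `VectorPointer` whose only operand is a live-in is single-scalar, while a `VectorEndPointer` with a widened-load operand is not.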

@llvmbot
Member

llvmbot commented Nov 23, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

VPVector(End)PointerRecipes are single-scalar if all their operands are. This should be effectively NFC currently, but it should re-enable cost checking for some more VPWidenMemoryRecipe after #157387 as discovered by @john-brawn-arm.


Full diff: https://github.com/llvm/llvm-project/pull/169249.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanUtils.cpp (+2-1)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp b/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
index 939216fe162a4..334ad973c5428 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
@@ -185,7 +185,8 @@ bool vputils::isSingleScalar(const VPValue *VPV) {
                                      all_of(Rep->operands(), isSingleScalar));
   }
   if (isa<VPWidenGEPRecipe, VPDerivedIVRecipe, VPBlendRecipe,
-          VPWidenSelectRecipe>(VPV))
+          VPWidenSelectRecipe, VPVectorPointerRecipe, VPVectorEndPointerRecipe>(
+          VPV))
     return all_of(VPV->getDefiningRecipe()->operands(), isSingleScalar);
   if (auto *WidenR = dyn_cast<VPWidenRecipe>(VPV)) {
     return preservesUniformity(WidenR->getOpcode()) &&

Contributor

@Mel-Chen Mel-Chen left a comment

Could we directly return true, like VPReductionRecipe?

@artagnon
Contributor

after #157387 as discovered by @john-brawn-arm.

Wrong person mentioned? Should be @ElvisWang123?

Contributor Author

@fhahn fhahn left a comment

Could we directly return true, like VPReductionRecipe?

Yep, that works; updated, thanks.

after #157387 as discovered by @john-brawn-arm.

Wrong person mentioned? Should be @ElvisWang123?

The mention should be correct: @ElvisWang123 landed #157387, but @john-brawn-arm discovered the issue. I'll strip the username mentions from the commit message to avoid excessive GitHub notifications.

@fhahn fhahn merged commit a51e2ef into llvm:main Nov 25, 2025
10 checks passed
@fhahn fhahn deleted the vplan-vectorpointer-single-scalar branch November 25, 2025 14:46
@alexfh
Contributor

alexfh commented Dec 1, 2025

Hi @fhahn, this seems to trigger assertion failures in Clang: https://gcc.godbolt.org/z/87jE85Tdq

clang++: /root/llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7205: llvm::VectorizationFactor llvm::LoopVectorizationPlanner::computeBestVF(): Assertion `(BestFactor.Width == LegacyVF.Width || BestPlan.hasEarlyExit() || !Legal->getLAI()->getSymbolicStrides().empty() || UsesEVLGatherScatter || planContainsAdditionalSimplifications( getPlanFor(BestFactor.Width), CostCtx, OrigLoop, BestFactor.Width) || planContainsAdditionalSimplifications( getPlanFor(LegacyVF.Width), CostCtx, OrigLoop, LegacyVF.Width)) && " VPlan cost model and legacy cost model disagreed"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /opt/compiler-explorer/clang-assertions-trunk/bin/clang++ -g -o /app/output.s -mllvm --x86-asm-syntax=intel -fverbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics -xir -O3 <source>
1.	Optimizer
2.	Running pass "function<eager-inv>(drop-unnecessary-assumes,float2int,lower-constant-intrinsics,chr,loop(loop-rotate<header-duplication;no-prepare-for-lto>,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>,drop-unnecessary-assumes,infer-alignment,loop-load-elim,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-arithmetic;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,slp-vectorizer,vector-combine,instcombine<max-iterations=1;no-verify-fixpoint>,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;switch-to-arithmetic;no-switch-to-lookup;keep-loops;no-hoist-common-insts;hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>)" on module "<source>"
3.	Running pass "loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>" on function "eggs"
 #0 0x0000000004279728 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4279728)
 #1 0x0000000004276b54 llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4276b54)
 #2 0x00000000041ba888 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007d8f25c42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007d8f25c969fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
 #5 0x00007d8f25c42476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #6 0x00007d8f25c287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #7 0x00007d8f25c2871b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
 #8 0x00007d8f25c39e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
 #9 0x0000000005dff188 llvm::LoopVectorizationPlanner::computeBestVF() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x5dff188)
#10 0x0000000005e0134c llvm::LoopVectorizePass::processLoop(llvm::Loop*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x5e0134c)
#11 0x0000000005e03f10 llvm::LoopVectorizePass::runImpl(llvm::Function&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x5e03f10)
#12 0x0000000005e0463b llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x5e0463b)
#13 0x0000000005447e9e llvm::detail::PassModel<llvm::Function, llvm::LoopVectorizePass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x5447e9e)
#14 0x0000000003bcba31 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x3bcba31)
#15 0x000000000126188e llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x126188e)
#16 0x0000000003bca0fa llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x3bca0fa)
#17 0x0000000001261b7e llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x1261b7e)
#18 0x0000000003bc9ab1 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x3bc9ab1)
#19 0x0000000004534203 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&, clang::BackendConsumer*) BackendUtil.cpp:0:0
#20 0x00000000045378d9 clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x45378d9)
#21 0x0000000004bc947f clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4bc947f)
#22 0x0000000004eb4455 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4eb4455)
#23 0x0000000004e3460e clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4e3460e)
#24 0x0000000004fad48d clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4fad48d)
#25 0x0000000000dc7780 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xdc7780)
#26 0x0000000000dbe1fa ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) driver.cpp:0:0
#27 0x0000000000dbe37d int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::'lambda'(llvm::SmallVectorImpl<char const*>&)>(long, llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#28 0x0000000004c313d9 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#29 0x00000000041bad24 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x41bad24)
#30 0x0000000004c319ef clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#31 0x0000000004bf21e2 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4bf21e2)
#32 0x0000000004bf318e clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4bf318e)
#33 0x0000000004bfa5c5 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4bfa5c5)
#34 0x0000000000dc3ba1 clang_main(int, char**, llvm::ToolContext const&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xdc3ba1)
#35 0x0000000000c72574 main (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xc72574)
#36 0x00007d8f25c29d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#37 0x00007d8f25c29e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#38 0x0000000000dbdc95 _start (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xdbdc95)
clang++: error: clang frontend command failed with exit code 134 (use -v to see invocation)
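
For readers skimming the log, the condition that failed can be restated as a plain predicate. This is only a sketch of the asserted expression as printed above: `VFCheckInputs` and `costModelsConsistent` are hypothetical names, widths are simplified to `unsigned`, and each boolean stands for one term of the assertion.

```cpp
#include <cassert>

// Hypothetical, flattened inputs; each field mirrors one term of the
// assertion text in the log, not an actual LLVM data structure.
struct VFCheckInputs {
  unsigned BestWidth;   // Width chosen by the VPlan-based cost model.
  unsigned LegacyWidth; // Width chosen by the legacy cost model.
  bool HasEarlyExit;
  bool HasSymbolicStrides;
  bool UsesEVLGatherScatter;
  bool BestPlanHasExtraSimplifications;
  bool LegacyPlanHasExtraSimplifications;
};

// Returns true when the two cost models either agree on the width or one of
// the listed exceptions explains the disagreement.
bool costModelsConsistent(const VFCheckInputs &In) {
  // A width of zero is not meaningful in this sketch.
  assert(In.BestWidth != 0 && In.LegacyWidth != 0 && "widths must be non-zero");
  return In.BestWidth == In.LegacyWidth || In.HasEarlyExit ||
         In.HasSymbolicStrides || In.UsesEVLGatherScatter ||
         In.BestPlanHasExtraSimplifications ||
         In.LegacyPlanHasExtraSimplifications;
}
```

The crash corresponds to the case where the widths differ and none of the exceptions apply, so the predicate evaluates to false.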

A reduced test case:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.foo.7 = type <{ %struct.eggs, i32, [4 x i8] }>
%struct.eggs = type { ptr, i64 }
%struct.wibble = type { ptr, i64 }

define ptr @foo(ptr %arg, i64 %arg1) {
bb:
  %alloca = alloca i64, align 8
  store i64 %arg1, ptr %alloca, align 8
  %load = load ptr, ptr %arg, align 8
  %load2 = load i64, ptr %alloca, align 8
  %getelementptr = getelementptr i8, ptr %load, i64 %load2
  ret ptr %getelementptr
}

define ptr @ham(ptr %arg) {
bb:
  store ptr @wombat, ptr %arg, align 8
  ret ptr null
}

declare ptr @wombat()

define i64 @eggs(ptr %arg) {
bb:
  %alloca = alloca i32, align 4
  store i32 0, ptr %alloca, align 4
  br label %bb1

bb1:                                              ; preds = %bb3, %bb
  %load = load i32, ptr %alloca, align 4
  %icmp = icmp slt i32 %load, 32
  br i1 %icmp, label %bb3, label %bb2

bb2:                                              ; preds = %bb1
  ret i64 0

bb3:                                              ; preds = %bb1
  %load4 = load i32, ptr %alloca, align 4
  store i32 %load4, ptr %arg, align 4
  %load5 = load i32, ptr %arg, align 4
  %call = call ptr @barney(ptr %arg, i32 %load5)
  %load6 = load i32, ptr %alloca, align 4
  %add = add i32 %load6, 1
  store i32 %add, ptr %alloca, align 4
  br label %bb1
}

define ptr @barney(ptr %arg, i32 %arg1) {
bb:
  %alloca = alloca %struct.foo.7, align 8
  %mul = mul i32 %arg1, 128
  store ptr %alloca, ptr %arg, align 8
  %load = load ptr, ptr %arg, align 8
  %getelementptr = getelementptr %struct.foo.7, ptr %load, i32 0, i32 1
  store i32 %mul, ptr %getelementptr, align 8
  %call = call ptr @bar(ptr %arg, ptr %alloca)
  ret ptr %call

; uselistorder directives
  uselistorder ptr %arg, { 2, 1, 0 }
}

define ptr @bar(ptr noalias %arg, ptr %arg1) {
bb:
  call void @wombat.1(ptr %arg, ptr %arg1)
  ret ptr null
}

define void @wombat.1(ptr %arg, ptr %arg1) {
bb:
  %alloca = alloca i32, align 4
  store i32 0, ptr %alloca, align 4
  br label %bb2

bb2:                                              ; preds = %bb4, %bb
  %load = load i32, ptr %alloca, align 4
  %icmp = icmp slt i32 %load, 64
  br i1 %icmp, label %bb4, label %bb3

bb3:                                              ; preds = %bb2
  ret void

bb4:                                              ; preds = %bb2
  %load5 = load i32, ptr %alloca, align 4
  %call = call i1 @barney.2(ptr %arg1, i32 %load5)
  %load6 = load i32, ptr %alloca, align 4
  %sext = sext i32 %load6 to i64
  %call7 = call ptr @snork(ptr %arg, i64 %sext)
  %zext = zext i1 %call to i8
  store i8 %zext, ptr %call7, align 1
  %load8 = load i32, ptr %alloca, align 4
  %add = add i32 %load8, 1
  store i32 %add, ptr %alloca, align 4
  br label %bb2
}

define i1 @barney.2(ptr %arg, i32 %arg1) {
bb:
  %alloca = alloca ptr, align 8
  %alloca2 = alloca %struct.wibble, align 8
  store ptr %arg, ptr %arg, align 8
  %load = load ptr, ptr %arg, align 8
  %getelementptr = getelementptr %struct.foo.7, ptr %load, i32 0, i32 1
  %load3 = load i32, ptr %getelementptr, align 8
  %add = add i32 %load3, %arg1
  store i32 %add, ptr %arg, align 4
  %call = call ptr @ham(ptr %alloca2)
  call void @pluto(ptr %alloca2, ptr %arg)
  %load4 = load i8, ptr %arg, align 1
  %trunc = trunc i8 %load4 to i1
  ret i1 %trunc
}

define ptr @snork(ptr %arg, i64 %arg1) {
bb:
  %alloca = alloca ptr, align 8
  %alloca2 = alloca i64, align 8
  store ptr %arg, ptr %alloca, align 8
  store i64 %arg1, ptr %alloca2, align 8
  %load = load ptr, ptr %alloca, align 8
  %load3 = load i64, ptr %alloca2, align 8
  %getelementptr = getelementptr [64 x i8], ptr %load, i64 0, i64 %load3
  ret ptr %getelementptr
}

define void @pluto(ptr %arg, ptr %arg1) {
bb:
  %alloca = alloca i32, align 4
  %load = load i32, ptr %arg1, align 4
  %sdiv = sdiv i32 %load, 8
  store i32 %sdiv, ptr %alloca, align 4
  %load2 = load i32, ptr %alloca, align 4
  %sext = sext i32 %load2 to i64
  %call = call ptr @foo(ptr %arg, i64 %sext)
  %load3 = load i8, ptr %call, align 1
  store i8 %load3, ptr %arg, align 1
  %load4 = load i8, ptr %arg, align 1
  store i8 %load4, ptr %arg1, align 1
  ret void
}

@fhahn
Contributor Author

fhahn commented Dec 2, 2025

@alexfh thanks for the report. I think the underlying issue is similar to #168709 and got exposed by the patch. Now that the fix for #168709 has landed, I'll work on extending it to also cover the case from the reproducer.

@alexfh
Contributor

alexfh commented Dec 5, 2025

@alexfh thanks for the report. I think the underlying issue is similar to #168709 and got exposed by the patch. Now that the fix for #168709 has landed, I'll work on extending it to also cover the case from the reproducer.

Thanks for the update! When do you expect to have a fix?

@alexfh
Contributor

alexfh commented Dec 8, 2025

@fhahn a few more changes have landed on top of this one and it's become quite difficult for us to revert locally. At this point reverting this upstream is similarly non-trivial. Do you have an ETA for a fix?

Contributor Author

@fhahn fhahn left a comment

This should have been fixed as of last week by 1054a6e. The reproducer is not crashing any longer on current main: https://gcc.godbolt.org/z/fxcdP379G

@alexfh
Contributor

alexfh commented Dec 8, 2025

This should have been fixed as of last week by 1054a6e. The reproducer is not crashing any longer on current main: https://gcc.godbolt.org/z/fxcdP379G

Thank you for the update! It would be nice if the description of #170474, or at least the commit message for 1054a6e, mentioned this issue. It's hard to keep track of all commits and guess which one fixes which issue.

Thank you for the fix!
