Skip to content

Conversation

@badumbatish
Copy link
Contributor

@badumbatish badumbatish commented Jul 7, 2025

Change isBuildVectorAll* -> isConstantSplatVectorAll* in VSelect in case the fold happens after BuildVector has been canonically transformed to Splat or if the Splat is initially in vselect already

@llvmbot
Copy link
Member

llvmbot commented Jul 7, 2025

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-webassembly

Author: jjasmine (badumbatish)

Changes

[WASM] Fold bitselect with argument <0>

Fixes #73454
Fold bitselect with the splat <0> in either first or second
argument.

  • For first argument <0>: vselect <0>, X, Y -> Y
  • For second argument <0>: vselect Y, <0>, X -> AND(<X>, !<Y>)
  • For third argument <0>: vselect X, Y, <0> -> AND(<Y>, <X>)

Detailed explanation in the implementation.


Full diff: https://github.com/llvm/llvm-project/pull/147305.diff

2 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+63)
  • (added) llvm/test/CodeGen/WebAssembly/simd-bitselect.ll (+48)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index bf2e04caa0a61..8009bfcc861c3 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -191,6 +191,9 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
     // Combine vector mask reductions into alltrue/anytrue
     setTargetDAGCombine(ISD::SETCC);
 
+    // Convert vselect of various zero arguments to AND
+    setTargetDAGCombine(ISD::VSELECT);
+
     // Convert vector to integer bitcasts to bitmask
     setTargetDAGCombine(ISD::BITCAST);
 
@@ -3210,6 +3213,64 @@ static SDValue performTruncateCombine(SDNode *N,
   return truncateVectorWithNARROW(OutVT, In, DL, DAG);
 }
 
+static SDValue performVSelectCombine(SDNode *N, SelectionDAG &DAG) {
+  // In the tablegen.td, vselect A B C -> bitselect B C A
+
+  // SCENARIO A
+  // vselect <0>, X, Y
+  // -> bitselect X, Y, <0>
+  // -> or (AND(X, <0>), AND(<Y>, !<0>))
+  // -> or (0, AND(<Y>, !<0>))
+  // -> AND(Y, !<0>)
+  // -> AND(Y, 1)
+  // -> Y
+
+  // SCENARIO B
+  // vselect Y, <0>, X
+  // -> bitselect <0>, X, Y
+  // -> or (AND(<0>, Y), AND(<X>, !<Y>))
+  // -> or (0, AND(<X>, !<Y>))
+  // -> AND(<X>, !<Y>)
+
+  // SCENARIO C
+  // vselect X, Y, <0>
+  // -> bitselect Y, <0>, X
+  // -> or (AND(Y, X), AND(<0>, !X))
+  // -> or (AND(Y, X), <0>)
+  // -> AND(Y, X)
+
+  using namespace llvm::SDPatternMatch;
+  assert(N->getOpcode() == ISD::VSELECT);
+
+  SDLoc DL(N);
+
+  SDValue Cond = N->getOperand(0), LHS = N->getOperand(1),
+          RHS = N->getOperand(2);
+  EVT NVT = N->getValueType(0);
+
+  APInt SplatValue;
+
+  // SCENARIO A
+  if (ISD::isConstantSplatVector(Cond.getNode(), SplatValue) &&
+      SplatValue.isZero())
+    return RHS;
+
+  // SCENARIO B
+  if (ISD::isConstantSplatVector(LHS.getNode(), SplatValue) &&
+      SplatValue.isZero())
+    return DAG.getNode(
+        ISD::AND, DL, NVT,
+        {RHS, DAG.getSExtOrTrunc(DAG.getNOT(DL, Cond, Cond.getValueType()), DL,
+                                 NVT)});
+
+  // SCENARIO C
+  if (ISD::isConstantSplatVector(RHS.getNode(), SplatValue) &&
+      SplatValue.isZero())
+    return DAG.getNode(ISD::AND, DL, NVT,
+                       {LHS, DAG.getSExtOrTrunc(Cond, DL, NVT)});
+  return SDValue();
+}
+
 static SDValue performBitcastCombine(SDNode *N,
                                      TargetLowering::DAGCombinerInfo &DCI) {
   using namespace llvm::SDPatternMatch;
@@ -3505,6 +3566,8 @@ WebAssemblyTargetLowering::PerformDAGCombine(SDNode *N,
   switch (N->getOpcode()) {
   default:
     return SDValue();
+  case ISD::VSELECT:
+    return performVSelectCombine(N, DCI.DAG);
   case ISD::BITCAST:
     return performBitcastCombine(N, DCI);
   case ISD::SETCC:
diff --git a/llvm/test/CodeGen/WebAssembly/simd-bitselect.ll b/llvm/test/CodeGen/WebAssembly/simd-bitselect.ll
new file mode 100644
index 0000000000000..7803709dc8b4a
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/simd-bitselect.ll
@@ -0,0 +1,48 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s  -O3 -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s
+target triple = "wasm32-unknown-unknown"
+
+define void @bitselect_first_zero(ptr %output, ptr  %input) {
+; CHECK-LABEL: bitselect_first_zero:
+; CHECK:         .functype bitselect_first_zero (i32, i32) -> ()
+; CHECK-NEXT:  # %bb.0: # %start
+; CHECK-NEXT:    v128.load $push6=, 0($1)
+; CHECK-NEXT:    local.tee $push5=, $2=, $pop6
+; CHECK-NEXT:    v128.const $push0=, 2139095040, 2139095040, 2139095040, 2139095040
+; CHECK-NEXT:    v128.and $push1=, $2, $pop0
+; CHECK-NEXT:    v128.const $push2=, 0, 0, 0, 0
+; CHECK-NEXT:    i32x4.ne $push3=, $pop1, $pop2
+; CHECK-NEXT:    v128.and $push4=, $pop5, $pop3
+; CHECK-NEXT:    v128.store 0($0), $pop4
+; CHECK-NEXT:    return
+start:
+  %input.val = load <4 x i32>, ptr %input, align 16
+  %0 = and <4 x i32> %input.val, splat (i32 2139095040)
+  %1 = icmp eq <4 x i32> %0, zeroinitializer
+  %2 = select <4 x i1> %1, <4 x i32> zeroinitializer, <4 x i32> %input.val
+  store <4 x i32> %2, ptr %output, align 16
+  ret void
+}
+
+
+define void @bitselect_second_zero(ptr %output, ptr %input) {
+; CHECK-LABEL: bitselect_second_zero:
+; CHECK:         .functype bitselect_second_zero (i32, i32) -> ()
+; CHECK-NEXT:  # %bb.0: # %start
+; CHECK-NEXT:    v128.load $push6=, 0($1)
+; CHECK-NEXT:    local.tee $push5=, $2=, $pop6
+; CHECK-NEXT:    v128.const $push0=, 2139095040, 2139095040, 2139095040, 2139095040
+; CHECK-NEXT:    v128.and $push1=, $2, $pop0
+; CHECK-NEXT:    v128.const $push2=, 0, 0, 0, 0
+; CHECK-NEXT:    i32x4.eq $push3=, $pop1, $pop2
+; CHECK-NEXT:    v128.and $push4=, $pop5, $pop3
+; CHECK-NEXT:    v128.store 0($0), $pop4
+; CHECK-NEXT:    return
+start:
+  %input.val = load <4 x i32>, ptr %input, align 16
+  %0 = and <4 x i32> %input.val, splat (i32 2139095040)
+  %1 = icmp eq <4 x i32> %0, zeroinitializer
+  %2 = select  <4 x i1> %1, <4 x i32> %input.val, <4 x i32> zeroinitializer
+  store <4 x i32> %2, ptr %output, align 16
+  ret void
+}

@badumbatish
Copy link
Contributor Author

waiting for CI to pass before tagging

@badumbatish
Copy link
Contributor Author

sorry about the conflict, I didn't catch that locally, will repush

@lukel97
Copy link
Contributor

lukel97 commented Jul 9, 2025

The title of the PR and description probably need to be changed since this isn't really a wasm PR anymore, it's more to do with DAGCombiner

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please can you rebase after #147472 - I think what you're after is just the undef element handling that isConstantSplatVector gives you by default

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think wasm is missing an hasAndNot override

@lukel97
Copy link
Contributor

lukel97 commented Jul 9, 2025

please can you rebase after #147472 - I think what you're after is just the undef element handling that isConstantSplatVector gives you by default

I think it's the fact that build_vectors seem to be canonicalzied to splat_vectors, and combineVSelectWithAllOnesOrZeroes only seems to pick up build_vectors because it calls these:

bool ISD::isBuildVectorAllOnes(const SDNode *N) {
  return isConstantSplatVectorAllOnes(N, /*BuildVectorOnly*/ true);
}

bool ISD::isBuildVectorAllZeros(const SDNode *N) {
  return isConstantSplatVectorAllZeros(N, /*BuildVectorOnly*/ true);
}

Also, I think wasm is missing an hasAndNot override

Yes, that's exactly what we're looking for the second part of this PR on lines R13161. @badumbatish although wasm doesn't seem to have an andnot instruction, it might be worthwhile exploring if returning true is profitable for vectors anyway.

@badumbatish
Copy link
Contributor Author

Ah this is unfortunate, I didn't realize the other PR was opened at the time as well. Will probably rebase edit before the mass change in tests

@badumbatish badumbatish force-pushed the bitselect_v128_zero_simpl branch from 3dc890b to b3e58ea Compare July 9, 2025 19:48
@badumbatish badumbatish changed the title [WASM] Fold bitselect with splat zero [DAGCombine] Fold vselect with splat zero Jul 9, 2025
@badumbatish
Copy link
Contributor Author

although wasm doesn't seem to have an andnot instruction, it might be worthwhile exploring if returning true is profitable for vectors anyway.

I think the current impl checks if the scalar bit size is equal before folding to avoid having extra sign extending. I'm not sure if this is enough.

Precommit test for isConstantSplatVectorAll in VSelect
- Use isConstantSplatVectorAll* in VSelect
- Update tests to reflect this
@badumbatish badumbatish force-pushed the bitselect_v128_zero_simpl branch from b3e58ea to 1ad59c0 Compare July 10, 2025 03:49
Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! Just some comments on the test

@lukel97 lukel97 requested review from RKSimon and rj-jesus July 10, 2025 06:40
@RKSimon
Copy link
Collaborator

RKSimon commented Jul 10, 2025

Please can you update the summary

Copy link
Contributor

@rj-jesus rj-jesus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have a look at others' comments, but otherwise LGTM, thanks!

@badumbatish
Copy link
Contributor Author

will push nit fix and update summary

@badumbatish badumbatish changed the title [DAGCombine] Fold vselect with splat zero [DAGCombine] Change isBuildVectorAll* -> isConstantSplatVectorAll* for Vselect Jul 10, 2025
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - cheers

@badumbatish
Copy link
Contributor Author

@RKSimon I've filed for commit access for now but it'd be great if you can help me merge it, tysm!

@RKSimon RKSimon merged commit 44481f5 into llvm:main Jul 11, 2025
9 checks passed
@badumbatish badumbatish deleted the bitselect_v128_zero_simpl branch July 11, 2025 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WASM: v128.bitselect When Argument is Zeroed Can Be Simplified

5 participants