Skip to content

Conversation

@michaelmaitland
Copy link
Contributor

Similar to what we do for copies. We may reduce one of the PHI operands and not the other, and thats perfectly okay.

@llvmbot
Copy link
Member

llvmbot commented Mar 20, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Michael Maitland (michaelmaitland)

Changes

Similar to what we do for copies. We may reduce one of the PHI operands and not the other, and thats perfectly okay.


Full diff: https://github.com/llvm/llvm-project/pull/132236.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vl-opt.mir (+84)
diff --git a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
index c36a1e9adccb0..61f43e2e72b13 100644
--- a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
@@ -1340,6 +1340,13 @@ RISCVVLOptimizer::checkUsers(const MachineInstr &MI) const {
       continue;
     }
 
+    if (UserMI.isPHI()) {
+      LLVM_DEBUG(dbgs() << "    Peeking through uses of PHI\n");
+        for (auto &PhiUse : MRI->use_operands(UserMI.getOperand(0).getReg()))
+          Worklist.insert(&PhiUse);
+      continue;
+    }
+
     auto VLOp = getMinimumVLForUser(UserOp);
     if (!VLOp)
       return std::nullopt;
diff --git a/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir b/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
index d42feeca9dbcc..dfc4dd84c72b2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
@@ -368,3 +368,87 @@ body: |
     %y:vr = COPY %x
     %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 4 /* e16 */, 0 /* tu, mu */
 ...
+---
+name: phi
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: phi
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   BNE $noreg, $noreg, %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   %y:vr = PHI %w, %bb.0, %x, %bb.1
+  ; CHECK-NEXT:   %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  bb.0:
+    %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+    BNE $noreg, $noreg, %bb.2
+  bb.1:
+    %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+  bb.2:
+    %y:vr = PHI %w, %bb.0, %x, %bb.1
+    %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+...
+---
+name: phi_user_invalid_sew
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: phi_user_invalid_sew
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   BNE $noreg, $noreg, %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   %y:vr = PHI %w, %bb.0, %x, %bb.1
+  ; CHECK-NEXT:   %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 4 /* e16 */, 0 /* tu, mu */
+  bb.0:
+    %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+    BNE $noreg, $noreg, %bb.2
+  bb.1:
+    %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+  bb.2:
+    %y:vr = PHI %w, %bb.0, %x, %bb.1
+    %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 4 /* e16 */, 0 /* tu, mu */
+...
+---
+name: phi_different_incoming_sew
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: phi_different_incoming_sew
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   BNE $noreg, $noreg, %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 4 /* e16 */, 0 /* tu, mu */
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   %y:vr = PHI %w, %bb.0, %x, %bb.1
+  ; CHECK-NEXT:   %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  bb.0:
+    %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+    BNE $noreg, $noreg, %bb.2
+  bb.1:
+    %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 4 /* e16 */, 0 /* tu, mu */
+  bb.2:
+    %y:vr = PHI %w, %bb.0, %x, %bb.1
+    %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+...

@github-actions
Copy link

github-actions bot commented Mar 20, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@preames
Copy link
Collaborator

preames commented Mar 20, 2025

Similar to what we do for copies. We may reduce one of the PHI operands and not the other, and thats perfectly okay.

Unlike COPY, PHI allows the existence of cycles in the def/use graph. Usually, when traversing phis we need some kind of visited set to bailout if we get trapped in a cycle. Is there some non-obvious invariant in the code here which side steps that?

Classic test case would be something like:

%V = some_vector_instruction
%phi = phi [%V, %loop_header], [%phi, %loop]

Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, is there a specific workload you're seeing that motivates this?

@michaelmaitland
Copy link
Contributor Author

Nice, is there a specific workload you're seeing that motivates this?

We were experimenting with not using vp intrinsics and saw it in x264.

@michaelmaitland
Copy link
Contributor Author

@preames @topperc @lukel97 thanks for pointing this out. I've updated the patch to avoid an infinite loop. I think it is conservative in that case since the user in the loop body is causing us not to optimize. Maybe we can handle that in a follow up.

@lukel97 thanks for sharing your patch, I''ll take a closer look now.

@michaelmaitland michaelmaitland requested a review from lukel97 March 24, 2025 13:10
Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@michaelmaitland michaelmaitland merged commit f8416fc into llvm:main Mar 24, 2025
6 of 10 checks passed
@michaelmaitland michaelmaitland deleted the vlopt-phi branch March 24, 2025 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants