-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[RISCV][VLOPT] Look through PHI instructions #132236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-backend-risc-v Author: Michael Maitland (michaelmaitland) ChangesSimilar to what we do for copies. We may reduce one of the PHI operands and not the other, and thats perfectly okay. Full diff: https://github.com/llvm/llvm-project/pull/132236.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
index c36a1e9adccb0..61f43e2e72b13 100644
--- a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
@@ -1340,6 +1340,13 @@ RISCVVLOptimizer::checkUsers(const MachineInstr &MI) const {
continue;
}
+ if (UserMI.isPHI()) {
+ LLVM_DEBUG(dbgs() << " Peeking through uses of PHI\n");
+ for (auto &PhiUse : MRI->use_operands(UserMI.getOperand(0).getReg()))
+ Worklist.insert(&PhiUse);
+ continue;
+ }
+
auto VLOp = getMinimumVLForUser(UserOp);
if (!VLOp)
return std::nullopt;
diff --git a/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir b/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
index d42feeca9dbcc..dfc4dd84c72b2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
@@ -368,3 +368,87 @@ body: |
%y:vr = COPY %x
%z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 4 /* e16 */, 0 /* tu, mu */
...
+---
+name: phi
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: phi
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+ ; CHECK-NEXT: BNE $noreg, $noreg, %bb.2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: %y:vr = PHI %w, %bb.0, %x, %bb.1
+ ; CHECK-NEXT: %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+ bb.0:
+ %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ BNE $noreg, $noreg, %bb.2
+ bb.1:
+ %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ bb.2:
+ %y:vr = PHI %w, %bb.0, %x, %bb.1
+ %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+...
+---
+name: phi_user_invalid_sew
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: phi_user_invalid_sew
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ ; CHECK-NEXT: BNE $noreg, $noreg, %bb.2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: %y:vr = PHI %w, %bb.0, %x, %bb.1
+ ; CHECK-NEXT: %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 4 /* e16 */, 0 /* tu, mu */
+ bb.0:
+ %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ BNE $noreg, $noreg, %bb.2
+ bb.1:
+ %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ bb.2:
+ %y:vr = PHI %w, %bb.0, %x, %bb.1
+ %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 4 /* e16 */, 0 /* tu, mu */
+...
+---
+name: phi_different_incoming_sew
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: phi_different_incoming_sew
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+ ; CHECK-NEXT: BNE $noreg, $noreg, %bb.2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 4 /* e16 */, 0 /* tu, mu */
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: %y:vr = PHI %w, %bb.0, %x, %bb.1
+ ; CHECK-NEXT: %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+ bb.0:
+ %w:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+ BNE $noreg, $noreg, %bb.2
+ bb.1:
+ %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 4 /* e16 */, 0 /* tu, mu */
+ bb.2:
+ %y:vr = PHI %w, %bb.0, %x, %bb.1
+ %z:vr = PseudoVADD_VV_M1 $noreg, %y, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+...
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Unlike COPY, PHI allows the existence of cycles in the def/use graph. Usually, when traversing phis we need some kind of visited set to bailout if we get trapped in a cycle. Is there some non-obvious invariant in the code here which side steps that? Classic test case would be something like: |
lukel97
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, is there a specific workload you're seeing that motivates this?
We were experimenting with not using vp intrinsics and saw it in x264. |
|
@preames @topperc @lukel97 thanks for pointing this out. I've updated the patch to avoid an infinite loop. I think it is conservative in that case since the user in the loop body is causing us not to optimize. Maybe we can handle that in a follow up. @lukel97 thanks for sharing your patch, I''ll take a closer look now. |
lukel97
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Similar to what we do for copies. We may reduce one of the PHI operands and not the other, and thats perfectly okay.