-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[StructurizeCFG] nested-if zerocost hoist bugfix #155408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When zero cost instructions are hoisted, the simplifyHoistedPhi function was setting incoming phi values which were not dominating the use causing runtime failure. This was set to poison by rebuildSSA function. This commit fixes the issue.
|
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-amdgpu Author: Vigneshwar Jayakumar (VigneshwarJ) ChangesWhen zero cost instructions are hoisted, the simplifyHoistedPhi function was setting incoming phi values which were not dominating the use causing runtime failure. This was set to poison by rebuildSSA function. This commit fixes the issue. Full diff: https://github.com/llvm/llvm-project/pull/155408.diff 3 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index bb7dbc2980f59..e05625344ee29 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -997,7 +997,8 @@ void StructurizeCFG::simplifyHoistedPhis() {
continue;
OtherPhi->setIncomingValue(PoisonValBBIdx, V);
- Phi->setIncomingValue(i, OtherV);
+ if (DT->dominates(OtherV, Phi))
+ Phi->setIncomingValue(i, OtherV);
}
}
}
diff --git a/llvm/test/CodeGen/AMDGPU/structurize-hoist.ll b/llvm/test/CodeGen/AMDGPU/structurize-hoist.ll
index 42436a1b4c279..9bfb0a6ed408f 100644
--- a/llvm/test/CodeGen/AMDGPU/structurize-hoist.ll
+++ b/llvm/test/CodeGen/AMDGPU/structurize-hoist.ll
@@ -178,3 +178,56 @@ latch:
end:
ret void
}
+
+define void @test_nested_if(ptr %ptr, i32 %val, i1 %cond) {
+; GFX900-LABEL: test_nested_if:
+; GFX900: ; %bb.0: ; %entry
+; GFX900-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX900-NEXT: flat_load_dword v2, v[0:1]
+; GFX900-NEXT: v_and_b32_e32 v3, 1, v3
+; GFX900-NEXT: v_cmp_eq_u32_e64 s[4:5], 1, v3
+; GFX900-NEXT: s_mov_b64 s[8:9], -1
+; GFX900-NEXT: s_xor_b64 s[10:11], s[4:5], -1
+; GFX900-NEXT: s_and_saveexec_b64 s[6:7], s[10:11]
+; GFX900-NEXT: s_cbranch_execz .LBB3_4
+; GFX900-NEXT: ; %bb.1: ; %then
+; GFX900-NEXT: s_and_saveexec_b64 s[12:13], s[10:11]
+; GFX900-NEXT: s_cbranch_execz .LBB3_3
+; GFX900-NEXT: ; %bb.2: ; %then_2
+; GFX900-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX900-NEXT: flat_load_dword v2, v[0:1]
+; GFX900-NEXT: s_xor_b64 s[8:9], exec, -1
+; GFX900-NEXT: .LBB3_3: ; %Flow1
+; GFX900-NEXT: s_or_b64 exec, exec, s[12:13]
+; GFX900-NEXT: s_andn2_b64 s[4:5], s[4:5], exec
+; GFX900-NEXT: s_and_b64 s[8:9], s[8:9], exec
+; GFX900-NEXT: s_or_b64 s[4:5], s[4:5], s[8:9]
+; GFX900-NEXT: .LBB3_4: ; %Flow
+; GFX900-NEXT: s_or_b64 exec, exec, s[6:7]
+; GFX900-NEXT: s_and_saveexec_b64 s[6:7], s[4:5]
+; GFX900-NEXT: s_or_b64 exec, exec, s[6:7]
+; GFX900-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX900-NEXT: flat_store_dword v[0:1], v2
+; GFX900-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX900-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %load_then = load %pair, ptr %ptr
+ br i1 %cond, label %else, label %then
+
+then:
+ %a16 = icmp slt i32 %val, 255
+ br i1 %cond, label %else, label %then_2
+
+then_2:
+ %loaded = load i32, ptr %ptr
+ br label %merge
+
+else:
+ %a_else = extractvalue %pair %load_then, 0
+ br label %merge
+
+merge:
+ %phi = phi i32 [ %loaded, %then_2 ], [ %a_else, %else ]
+ store i32 %phi, ptr %ptr
+ ret void
+}
diff --git a/llvm/test/Transforms/StructurizeCFG/hoist-zerocost.ll b/llvm/test/Transforms/StructurizeCFG/hoist-zerocost.ll
index 10d4fa2be0a70..d084e199ceb89 100644
--- a/llvm/test/Transforms/StructurizeCFG/hoist-zerocost.ll
+++ b/llvm/test/Transforms/StructurizeCFG/hoist-zerocost.ll
@@ -159,3 +159,53 @@ latch:
end:
ret void
}
+
+define void @test_nested_if(ptr %ptr, i32 %val, i1 %cond) {
+; CHECK-LABEL: define void @test_nested_if(
+; CHECK-SAME: ptr [[PTR:%.*]], i32 [[VAL:%.*]], i1 [[COND:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[COND_INV:%.*]] = xor i1 [[COND]], true
+; CHECK-NEXT: [[LOAD_THEN:%.*]] = load [[PAIR:%.*]], ptr [[PTR]], align 4
+; CHECK-NEXT: [[A_ELSE:%.*]] = extractvalue [[PAIR]] [[LOAD_THEN]], 0
+; CHECK-NEXT: br i1 [[COND_INV]], label %[[THEN:.*]], label %[[FLOW:.*]]
+; CHECK: [[THEN]]:
+; CHECK-NEXT: [[A16:%.*]] = icmp slt i32 [[VAL]], 255
+; CHECK-NEXT: br i1 [[COND_INV]], label %[[THEN_2:.*]], label %[[FLOW1:.*]]
+; CHECK: [[FLOW]]:
+; CHECK-NEXT: [[TMP0:%.*]] = phi i32 [ [[TMP2:%.*]], %[[FLOW1]] ], [ [[A_ELSE]], %[[ENTRY]] ]
+; CHECK-NEXT: [[TMP1:%.*]] = phi i1 [ [[TMP3:%.*]], %[[FLOW1]] ], [ [[COND]], %[[ENTRY]] ]
+; CHECK-NEXT: br i1 [[TMP1]], label %[[ELSE:.*]], label %[[MERGE:.*]]
+; CHECK: [[THEN_2]]:
+; CHECK-NEXT: [[LOADED:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT: br label %[[FLOW1]]
+; CHECK: [[FLOW1]]:
+; CHECK-NEXT: [[TMP2]] = phi i32 [ [[LOADED]], %[[THEN_2]] ], [ [[A_ELSE]], %[[THEN]] ]
+; CHECK-NEXT: [[TMP3]] = phi i1 [ false, %[[THEN_2]] ], [ true, %[[THEN]] ]
+; CHECK-NEXT: br label %[[FLOW]]
+; CHECK: [[ELSE]]:
+; CHECK-NEXT: br label %[[MERGE]]
+; CHECK: [[MERGE]]:
+; CHECK-NEXT: store i32 [[TMP0]], ptr [[PTR]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+ %load_then = load %pair, ptr %ptr
+ br i1 %cond, label %else, label %then
+
+then:
+ %a16 = icmp slt i32 %val, 255
+ br i1 %cond, label %else, label %then_2
+
+then_2:
+ %loaded = load i32, ptr %ptr
+ br label %merge
+
+else:
+ %a_else = extractvalue %pair %load_then, 0
+ br label %merge
+
+merge:
+ %phi = phi i32 [ %loaded, %then_2 ], [ %a_else, %else ]
+ store i32 %phi, ptr %ptr
+ ret void
+}
|
|
Can you run a full CQE cycle before landing this PR? StructurizeCFG is very sensitive to changes and can easily cause regressions in downstream testing. I suggest being proactive when touching it. |
|
@shiltian , This PR is to fix multiple bugs that appeared in staging. Do you still want me to do CQE on it as it may take weeks? |
|
Ideally, I think it’d be better to revert your previous change, combine it with this one, run a CQE cycle, and if everything comes back green, reland it. But I'll leave that up to you. |
arsenm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you precommit the test cases
Added nested-if tests for hoisitng zero cost instructions in StructurizeCFG pass. This is a patch with precommited tests for #155408
Added nested-if tests for hoisitng zero cost instructions in StructurizeCFG pass. This is a patch with precommited tests for llvm/llvm-project#155408
precommited the test cases |
When zero cost instructions are hoisted, the simplifyHoistedPhi function was setting incoming phi values which were not dominating the use causing runtime failure. This was set to poison by rebuildSSA function. This commit fixes the issue.