[IR] Mark convergence intrins as has-side-effect #134844
When a callee is marked as `convergent`, some targets such as HLSL/SPIR-V add a convergence token to the call. This is valid if both functions are marked as `convergent`.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their tokens were unused. This meant a leaf function could lose all its convergence intrinsics, which would allow further optimizations to remove the `convergent` attribute from the callee. The issue was that the caller was not updated, so we ended up with a convergence token attached to a call to a non-convergent function.

This can be solved in different ways:
- modify the DCE passes to keep calls marked as convergent.
- add `hasSideEffects` to the intrinsics definition, preventing DCE from removing the intrinsic.
- fix the optimization passes that remove the attribute so that they also remove the tokens from the call.

I picked the second option: mark those as `hasSideEffects`, because the presence of convergence intrinsics impacts the divergence/reconvergence locations in the IR, hence their mere presence has an implicit meaning. This is particularly important for SPIR-V, as the control-flow shape depends on the presence of those intrinsics.

An alternative I could agree to is modifying each pass to keep the convergence intrinsics. After all, they are important, but widening the meaning of `hasSideEffects` could be discussed. I would, however, be against the third solution: SPIR-V benefits greatly from the presence of convergence intrinsics to correctly structurize loops.
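To make the effect of the second option concrete, here is a minimal toy model in C++. It is only a sketch and does not use LLVM's API: `ToyInst`, `isUsed`, `runToyDCE` and `makeLeafBody` are hypothetical names introduced for illustration, and the "DCE" here is a naive loop that drops any instruction that is unused, is not a terminator and has no side effects. It assumes a leaf function whose only instructions are an unused convergence entry token and a return.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction. This is NOT an LLVM class.
struct ToyInst {
  std::string name;                  // name of the value the instruction defines
  bool hasSideEffects;               // models the property discussed above
  bool isTerminator;                 // terminators are never removed
  std::vector<std::string> operands; // names of the values this instruction uses
};

// Returns true if any instruction in `body` uses `name` as an operand.
bool isUsed(const std::vector<ToyInst> &body, const std::string &name) {
  for (const ToyInst &inst : body)
    for (const std::string &op : inst.operands)
      if (op == name)
        return true;
  return false;
}

// Naive DCE: repeatedly erase instructions that are unused, are not
// terminators, and have no side effects, until nothing changes.
std::vector<ToyInst> runToyDCE(std::vector<ToyInst> body) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < body.size(); ++i) {
      const ToyInst &inst = body[i];
      if (inst.isTerminator || inst.hasSideEffects || isUsed(body, inst.name))
        continue;
      body.erase(body.begin() + static_cast<std::ptrdiff_t>(i));
      changed = true;
      break; // restart the scan, since indices shifted
    }
  }
  return body;
}

// Builds a hypothetical leaf function body: a convergence entry token that
// nothing uses, followed by a return.
std::vector<ToyInst> makeLeafBody(bool entryHasSideEffects) {
  return {
      {"convergence.entry", entryHasSideEffects, false, {}},
      {"ret", false, true, {}},
  };
}
```

In this model, `runToyDCE(makeLeafBody(false))` leaves only the return, which mirrors the leaf function losing its convergence intrinsic, while `runToyDCE(makeLeafBody(true))` keeps both instructions.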
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-ir
Author: Nathan Gauër (Keenuts)
Changes: identical to the pull request description.
Patch is 43.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134844.diff
16 Files Affected:
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
index 56c523f6bc8cf..25062b8537aca 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
@@ -1,5 +1,5 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s --check-prefixes=CHECK,SPIRV
// All referenced to an unused resource should be removed by optimizations.
RWBuffer<float> Buf : register(u5, space3);
@@ -10,6 +10,7 @@ void main() {
// CHECK-NOT: resource.handlefrombinding
// CHECK: define void @main()
// CHECK-NEXT: entry:
+// SPIRV-NEXT: %0 = tail call token @llvm.experimental.convergence.entry()
// CHECK-NEXT: ret void
// CHECK-NOT: resource.handlefrombinding
}
diff --git a/clang/test/CodeGenHLSL/builtins/distance.hlsl b/clang/test/CodeGenHLSL/builtins/distance.hlsl
index e830903261c8c..1d8f986bd12eb 100644
--- a/clang/test/CodeGenHLSL/builtins/distance.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/distance.hlsl
@@ -16,6 +16,7 @@
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z18test_distance_halfDhDh(
// SPVCHECK-SAME: half noundef nofpclass(nan inf) [[X:%.*]], half noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn half [[X]], [[Y]]
// SPVCHECK-NEXT: [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[SUB_I]])
// SPVCHECK-NEXT: ret half [[ELT_ABS_I]]
@@ -33,6 +34,7 @@ half test_distance_half(half X, half Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half2Dv2_DhS_(
// SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[X:%.*]], <2 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x half> [[X]], [[Y]]
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[SUB_I]])
// SPVCHECK-NEXT: ret half [[SPV_LENGTH_I]]
@@ -50,6 +52,7 @@ half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half3Dv3_DhS_(
// SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[X:%.*]], <3 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x half> [[X]], [[Y]]
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[SUB_I]])
// SPVCHECK-NEXT: ret half [[SPV_LENGTH_I]]
@@ -67,6 +70,7 @@ half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half4Dv4_DhS_(
// SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[X:%.*]], <4 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x half> [[X]], [[Y]]
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[SUB_I]])
// SPVCHECK-NEXT: ret half [[SPV_LENGTH_I]]
@@ -83,6 +87,7 @@ half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z19test_distance_floatff(
// SPVCHECK-SAME: float noundef nofpclass(nan inf) [[X:%.*]], float noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[X]], [[Y]]
// SPVCHECK-NEXT: [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[SUB_I]])
// SPVCHECK-NEXT: ret float [[ELT_ABS_I]]
@@ -100,6 +105,7 @@ float test_distance_float(float X, float Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float2Dv2_fS_(
// SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[X:%.*]], <2 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x float> [[X]], [[Y]]
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[SUB_I]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
@@ -117,6 +123,7 @@ float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float3Dv3_fS_(
// SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[X:%.*]], <3 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x float> [[X]], [[Y]]
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[SUB_I]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
@@ -134,6 +141,7 @@ float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float4Dv4_fS_(
// SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[X:%.*]], <4 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x float> [[X]], [[Y]]
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[SUB_I]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
diff --git a/clang/test/CodeGenHLSL/builtins/length.hlsl b/clang/test/CodeGenHLSL/builtins/length.hlsl
index 2d4bbd995298f..9597d33f70d62 100644
--- a/clang/test/CodeGenHLSL/builtins/length.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/length.hlsl
@@ -20,6 +20,7 @@
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z16test_length_halfDh(
// SPVCHECK-SAME: half noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[P0]])
// SPVCHECK-NEXT: ret half [[ELT_ABS_I]]
//
@@ -42,6 +43,7 @@ half test_length_half(half p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half2Dv2_Dh(
// SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[P0]])
// SPVCHECK-NEXT: ret half [[SPV_LENGTH_I]]
//
@@ -61,6 +63,7 @@ half test_length_half2(half2 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half3Dv3_Dh(
// SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[P0]])
// SPVCHECK-NEXT: ret half [[SPV_LENGTH_I]]
//
@@ -80,6 +83,7 @@ half test_length_half3(half3 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half4Dv4_Dh(
// SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[P0]])
// SPVCHECK-NEXT: ret half [[SPV_LENGTH_I]]
//
@@ -98,6 +102,7 @@ half test_length_half4(half4 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z17test_length_floatf(
// SPVCHECK-SAME: float noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[P0]])
// SPVCHECK-NEXT: ret float [[ELT_ABS_I]]
//
@@ -117,6 +122,7 @@ float test_length_float(float p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float2Dv2_f(
// SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[P0]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
//
@@ -136,6 +142,7 @@ float test_length_float2(float2 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float3Dv3_f(
// SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[P0]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
//
@@ -155,6 +162,7 @@ float test_length_float3(float3 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float4Dv4_f(
// SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[P0]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
//
diff --git a/clang/test/CodeGenHLSL/builtins/reflect.hlsl b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
index 35ee059697c4b..3f1f653e0f0f9 100644
--- a/clang/test/CodeGenHLSL/builtins/reflect.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
@@ -18,6 +18,7 @@
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_reflect_halfDhDh(
// SPVCHECK-SAME: half noundef nofpclass(nan inf) [[I:%.*]], half noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[I]], 0xH4000
// SPVCHECK-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[N]], [[N]]
// SPVCHECK-NEXT: [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[TMP0]], [[MUL_I]]
@@ -42,6 +43,7 @@ half test_reflect_half(half I, half N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z18test_reflect_half2Dv2_DhS_(
// SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[I:%.*]], <2 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.reflect.v2f16(<2 x half> [[I]], <2 x half> [[N]])
// SPVCHECK-NEXT: ret <2 x half> [[SPV_REFLECT_I]]
//
@@ -63,6 +65,7 @@ half2 test_reflect_half2(half2 I, half2 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z18test_reflect_half3Dv3_DhS_(
// SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[I:%.*]], <3 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.reflect.v3f16(<3 x half> [[I]], <3 x half> [[N]])
// SPVCHECK-NEXT: ret <3 x half> [[SPV_REFLECT_I]]
//
@@ -84,6 +87,7 @@ half3 test_reflect_half3(half3 I, half3 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z18test_reflect_half4Dv4_DhS_(
// SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[I:%.*]], <4 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x half> @llvm.spv.reflect.v4f16(<4 x half> [[I]], <4 x half> [[N]])
// SPVCHECK-NEXT: ret <4 x half> [[SPV_REFLECT_I]]
//
@@ -103,6 +107,7 @@ half4 test_reflect_half4(half4 I, half4 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_reflect_floatff(
// SPVCHECK-SAME: float noundef nofpclass(nan inf) [[I:%.*]], float noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[I]], 2.000000e+00
// SPVCHECK-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[N]], [[N]]
// SPVCHECK-NEXT: [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP0]], [[MUL_I]]
@@ -127,6 +132,7 @@ float test_reflect_float(float I, float N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x float> @_Z19test_reflect_float2Dv2_fS_(
// SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[I:%.*]], <2 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x float> @llvm.spv.reflect.v2f32(<2 x float> [[I]], <2 x float> [[N]])
// SPVCHECK-NEXT: ret <2 x float> [[SPV_REFLECT_I]]
//
@@ -148,6 +154,7 @@ float2 test_reflect_float2(float2 I, float2 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x float> @_Z19test_reflect_float3Dv3_fS_(
// SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[I:%.*]], <3 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x float> @llvm.spv.reflect.v3f32(<3 x float> [[I]], <3 x float> [[N]])
// SPVCHECK-NEXT: ret <3 x float> [[SPV_REFLECT_I]]
//
@@ -169,6 +176,7 @@ float3 test_reflect_float3(float3 I, float3 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x float> @_Z19test_reflect_float4Dv4_fS_(
// SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[I:%.*]], <4 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x float> @llvm.spv.reflect.v4f32(<4 x float> [[I]], <4 x float> [[N]])
// SPVCHECK-NEXT: ret <4 x float> [[SPV_REFLECT_I]]
//
diff --git a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
index f2328c7330e6c..de95c11a138e7 100644
--- a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
@@ -22,6 +22,7 @@
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z20test_smoothstep_halfDhDhDh(
// SPVCHECK-SAME: half noundef nofpclass(nan inf) [[MIN:%.*]], half noundef nofpclass(nan inf) [[MAX:%.*]], half noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.smoothstep.f16(half [[MIN]], half [[MAX]], half [[X]])
// SPVCHECK-NEXT: ret half [[SPV_SMOOTHSTEP_I]]
//
@@ -43,6 +44,7 @@ half test_smoothstep_half(half Min, half Max, half X) { return smoothstep(Min, M
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z21test_smoothstep_half2Dv2_DhS_S_(
// SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <2 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <2 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.smoothstep.v2f16(<2 x half> [[MIN]], <2 x half> [[MAX]], <2 x half> [[X]])
// SPVCHECK-NEXT: ret <2 x half> [[SPV_SMOOTHSTEP_I]]
//
@@ -64,6 +66,7 @@ half2 test_smoothstep_half2(half2 Min, half2 Max, half2 X) { return smoothstep(M
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z21test_smoothstep_half3Dv3_DhS_S_(
// SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <3 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <3 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.smoothstep.v3f16(<3 x half> [[MIN]], <3 x half> [[MAX]], <3 x half> [[X]])
// SPVCHECK-NEXT: ret <3 x half> [[SPV_SMOOTHSTEP_I]]
//
@@ -85,6 +88,7 @@ half3 test_smoothstep_half3(half3 Min, half3 Max, half3 X) { return smoothstep(M
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z21test_smoothstep_half4Dv4_DhS_S_(
// SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <4 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <4 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[E...
[truncated]
//
@@ -136,6 +142,7 @@ float test_length_float2(float2 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float3Dv3_f(
// SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[P0]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
//
@@ -155,6 +162,7 @@ float test_length_float3(float3 p0)
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float4Dv4_f(
// SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[P0]])
// SPVCHECK-NEXT: ret float [[SPV_LENGTH_I]]
//
diff --git a/clang/test/CodeGenHLSL/builtins/reflect.hlsl b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
index 35ee059697c4b..3f1f653e0f0f9 100644
--- a/clang/test/CodeGenHLSL/builtins/reflect.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
@@ -18,6 +18,7 @@
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_reflect_halfDhDh(
// SPVCHECK-SAME: half noundef nofpclass(nan inf) [[I:%.*]], half noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[I]], 0xH4000
// SPVCHECK-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[N]], [[N]]
// SPVCHECK-NEXT: [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[TMP0]], [[MUL_I]]
@@ -42,6 +43,7 @@ half test_reflect_half(half I, half N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z18test_reflect_half2Dv2_DhS_(
// SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[I:%.*]], <2 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.reflect.v2f16(<2 x half> [[I]], <2 x half> [[N]])
// SPVCHECK-NEXT: ret <2 x half> [[SPV_REFLECT_I]]
//
@@ -63,6 +65,7 @@ half2 test_reflect_half2(half2 I, half2 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z18test_reflect_half3Dv3_DhS_(
// SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[I:%.*]], <3 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.reflect.v3f16(<3 x half> [[I]], <3 x half> [[N]])
// SPVCHECK-NEXT: ret <3 x half> [[SPV_REFLECT_I]]
//
@@ -84,6 +87,7 @@ half3 test_reflect_half3(half3 I, half3 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z18test_reflect_half4Dv4_DhS_(
// SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[I:%.*]], <4 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x half> @llvm.spv.reflect.v4f16(<4 x half> [[I]], <4 x half> [[N]])
// SPVCHECK-NEXT: ret <4 x half> [[SPV_REFLECT_I]]
//
@@ -103,6 +107,7 @@ half4 test_reflect_half4(half4 I, half4 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_reflect_floatff(
// SPVCHECK-SAME: float noundef nofpclass(nan inf) [[I:%.*]], float noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[I]], 2.000000e+00
// SPVCHECK-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[N]], [[N]]
// SPVCHECK-NEXT: [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP0]], [[MUL_I]]
@@ -127,6 +132,7 @@ float test_reflect_float(float I, float N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x float> @_Z19test_reflect_float2Dv2_fS_(
// SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[I:%.*]], <2 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x float> @llvm.spv.reflect.v2f32(<2 x float> [[I]], <2 x float> [[N]])
// SPVCHECK-NEXT: ret <2 x float> [[SPV_REFLECT_I]]
//
@@ -148,6 +154,7 @@ float2 test_reflect_float2(float2 I, float2 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x float> @_Z19test_reflect_float3Dv3_fS_(
// SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[I:%.*]], <3 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x float> @llvm.spv.reflect.v3f32(<3 x float> [[I]], <3 x float> [[N]])
// SPVCHECK-NEXT: ret <3 x float> [[SPV_REFLECT_I]]
//
@@ -169,6 +176,7 @@ float3 test_reflect_float3(float3 I, float3 N) {
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x float> @_Z19test_reflect_float4Dv4_fS_(
// SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[I:%.*]], <4 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x float> @llvm.spv.reflect.v4f32(<4 x float> [[I]], <4 x float> [[N]])
// SPVCHECK-NEXT: ret <4 x float> [[SPV_REFLECT_I]]
//
diff --git a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
index f2328c7330e6c..de95c11a138e7 100644
--- a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
@@ -22,6 +22,7 @@
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z20test_smoothstep_halfDhDhDh(
// SPVCHECK-SAME: half noundef nofpclass(nan inf) [[MIN:%.*]], half noundef nofpclass(nan inf) [[MAX:%.*]], half noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.smoothstep.f16(half [[MIN]], half [[MAX]], half [[X]])
// SPVCHECK-NEXT: ret half [[SPV_SMOOTHSTEP_I]]
//
@@ -43,6 +44,7 @@ half test_smoothstep_half(half Min, half Max, half X) { return smoothstep(Min, M
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z21test_smoothstep_half2Dv2_DhS_S_(
// SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <2 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <2 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.smoothstep.v2f16(<2 x half> [[MIN]], <2 x half> [[MAX]], <2 x half> [[X]])
// SPVCHECK-NEXT: ret <2 x half> [[SPV_SMOOTHSTEP_I]]
//
@@ -64,6 +66,7 @@ half2 test_smoothstep_half2(half2 Min, half2 Max, half2 X) { return smoothstep(M
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z21test_smoothstep_half3Dv3_DhS_S_(
// SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <3 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <3 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[ENTRY:.*:]]
+// SPVCHECK-NEXT: {{.*}} = tail call token @llvm.experimental.convergence.entry()
// SPVCHECK-NEXT: [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.smoothstep.v3f16(<3 x half> [[MIN]], <3 x half> [[MAX]], <3 x half> [[X]])
// SPVCHECK-NEXT: ret <3 x half> [[SPV_SMOOTHSTEP_I]]
//
@@ -85,6 +88,7 @@ half3 test_smoothstep_half3(half3 Min, half3 Max, half3 X) { return smoothstep(M
// SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z21test_smoothstep_half4Dv4_DhS_S_(
// SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <4 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <4 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SPVCHECK-NEXT: [[E...
[truncated]
|
arsenm
left a comment
These should not have side effects. The problem you are experiencing is because the design of the convergent attribute is broken. It imposes a structural requirement on the IR, rather than relaxing an assumed restriction. Whatever is removing the convergent needs to consider the uses (or more likely, just leave it alone)
We should make convergent the IR default, and add noconvergent as an assertion that no convergent operations are used. It then becomes UB if a convergent operation ends up used in noconvergent code. I was working on https://reviews.llvm.org/D69498 and would like to get back to it.
I agree I'm stretching the "hasSideEffect" definition here.
I suspect the long-term change of making the IR assume convergent by default will take some time, as it will impact many subprojects. Would you be OK with me patching the several DCE functions so that they do not drop convergence intrinsics instead? It would solve this issue in practice, allowing us to move forward with addrspace cast legalization.
Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement
This is still trying to fix this in a roundabout way. You should be stopping the stripping of the convergent attribute, not the removal of the intrinsic uses that happen to be in the function.
Looking at the PR you linked, it seems there was still no clear consensus on the default change, no? (And I'd assume consumers like llvm-translator won't be too happy about this breaking change, no?)
Fair enough, I sent another PR which handles this by fixing the FunctionAttrs pass, preventing removal of the convergent attribute in those cases.
Yes, there were people who are wrong and need to find the time to push this forward.
It's not a breaking change, it's a theoretical loss of optimization by default in unanalyzable call situations. Even if we leave the wart of being wrong by default unfixed, we still need a noconvergent attribute.
I didn't understand the validity part. Why is the caller required to be convergent in order to add a token to a callsite?
Seems right to me.
Seems correct again.
Why is DCE dropping convergence intrinsics? Is it because it cannot see the operand bundles as legit uses? |
|
From the spec for convergence control tokens:
|
|
FYI: there is this PR which I think will replace this one: #134863
Given this example:

declare i32 @foo(i32)
define i32 @bar(i32 %a) convergent {
%tk = call token @llvm.experimental.convergence.entry()
%rs = call i32 @foo(i32 %a)
ret i32 %rs
}
define i32 @baz(i32 %a) convergent {
%tk = call token @llvm.experimental.convergence.entry()
%rs = call i32 @bar(i32 %a) [ "convergencectrl"(token %tk) ]
ret i32 %rs
}
Now, |
|
Mistyped Enter and sent the comment above too early. There are multiple ways to solve this issue:
|
|
I'm thinking we should remove the IR verification requirement. It's more of a lint-type check, and violations would be UB.
|
Thanks for the example! I think @nhaehnle does this look okay to you? |
|
To take this to its logical conclusion, when convergence tokens are in use, the |
I suppose the PR number you mention is the wrong one? |
Thanks for the details! |
Before this commit, having a convergence token on a non-convergent call was considered to be an error. This commit relaxes this requirement and allows convergence tokens to be present on non-convergent calls. When such a token is present, it has no effect, as the underlying call is non-convergent. This allows passes like DCE to strip the `convergent` attribute from functions for which all convergent operations have been stripped. When this happens, a convergence token can still exist at the call site, which previously caused the verifier to complain. Alternatives have been considered in llvm#134863 and llvm#134844.
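As a rough illustration of the sequence this commit message describes, here is a self-contained toy model in C++. It is only a sketch: the `Call`, `Func`, and `Module` types and every function name below are hypothetical and invented for this illustration. They are not LLVM's API, and the real DCE, attribute-inference, and verifier code is considerably more involved. The module built in `buildExample` mirrors the `@foo`/`@bar`/`@baz` example from the discussion.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not LLVM's real data structures.
struct Call {
  std::string callee;  // name of the called function or intrinsic
  bool definesToken;   // call to a convergence intrinsic producing a token
  bool tokenIsUsed;    // the produced token is consumed by some bundle
  bool carriesToken;   // this call has a "convergencectrl" operand bundle
};

struct Func {
  bool convergent;         // models the `convergent` function attribute
  std::vector<Call> body;  // empty for declarations
};

using Module = std::map<std::string, Func>;

bool isConvergentCallee(const Module &m, const std::string &name) {
  auto it = m.find(name);
  return it != m.end() && it->second.convergent;
}

// Models a DCE pass: drop convergence intrinsics whose token has no user.
void removeUnusedTokens(Func &f) {
  auto dead = [](const Call &c) { return c.definesToken && !c.tokenIsUsed; };
  f.body.erase(std::remove_if(f.body.begin(), f.body.end(), dead),
               f.body.end());
}

// Models attribute inference: strip `convergent` from a function that no
// longer contains any call to a convergent callee.
void stripConvergentIfUnneeded(Module &m, const std::string &name) {
  Func &f = m.at(name);
  bool anyConvergent =
      std::any_of(f.body.begin(), f.body.end(), [&](const Call &c) {
        return isConvergentCallee(m, c.callee);
      });
  if (!anyConvergent)
    f.convergent = false;
}

// Models the verifier. With strict == true, a token on a call to a
// non-convergent callee is an error (old rule). With strict == false,
// such a token is accepted and simply has no effect (relaxed rule).
bool verify(const Module &m, bool strict) {
  for (const auto &entry : m)
    for (const Call &c : entry.second.body)
      if (strict && c.carriesToken && !isConvergentCallee(m, c.callee))
        return false;
  return true;
}

// Mirrors the @foo/@bar/@baz example: @bar's token is unused, @baz passes
// its token to the call to @bar.
Module buildExample() {
  const std::string entryIntrinsic = "llvm.experimental.convergence.entry";
  Module m;
  m[entryIntrinsic] = Func{true, {}};
  m["foo"] = Func{false, {}};
  m["bar"] = Func{true, {{entryIntrinsic, true, false, false},
                         {"foo", false, false, false}}};
  m["baz"] = Func{true, {{entryIntrinsic, true, true, false},
                         {"bar", false, false, true}}};
  return m;
}

// Returns true if the module is valid up front, is rejected by the strict
// rule after both passes ran on @bar, and is accepted by the relaxed rule.
bool strictRuleBreaksAfterDce() {
  Module m = buildExample();
  bool validBefore = verify(m, /*strict=*/true);
  removeUnusedTokens(m.at("bar"));
  stripConvergentIfUnneeded(m, "bar");
  return validBefore && !verify(m, /*strict=*/true) &&
         verify(m, /*strict=*/false);
}
```

In this model the caller `baz` is never touched by either pass, which is why its token outlives the `convergent` attribute on `bar`. Relaxing the verifier, rather than changing the passes, is the design choice the commit message above describes.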
|
Closed in favor of #135794 |