[IR] Mark convergence intrins as has-side-effect #134844

Keenuts · 2025-04-08T11:46:17Z

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

This can be solved in different ways:

modify DCE passes to keep calls marked as convergent.
add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic.
fix optimization passed removing the attribute by also removing the tokens from the call.

I picked the second option: mark those as hasSideEffects, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.

When a callee is marked as `convergent`, some targets like HLSL/SPIR-V add a convergent token to the call. This is valid if both functions are marked as `convergent`. ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused. This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the `convergent` attribute from the callee. Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function. This can be solved in different ways: - modify DCE passes to keep calls marked as convergent. - add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic. - fix optimization passed removing the attribute by also removing the tokens from the call. I picked the second option: mark those as `hasSideEffects`, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics. An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed. I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.

llvmbot · 2025-04-08T11:46:51Z

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-llvm-ir

Author: Nathan Gauër (Keenuts)

Changes

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

This can be solved in different ways:

modify DCE passes to keep calls marked as convergent.
add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic.
fix optimization passed removing the attribute by also removing the tokens from the call.

I picked the second option: mark those as hasSideEffects, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.

Patch is 43.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134844.diff

16 Files Affected:

(modified) clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl (+2-1)
(modified) clang/test/CodeGenHLSL/builtins/distance.hlsl (+8)
(modified) clang/test/CodeGenHLSL/builtins/length.hlsl (+8)
(modified) clang/test/CodeGenHLSL/builtins/reflect.hlsl (+8)
(modified) clang/test/CodeGenHLSL/builtins/smoothstep.hlsl (+8)
(modified) clang/test/CodeGenSPIRV/Builtins/distance.c (+3)
(modified) clang/test/CodeGenSPIRV/Builtins/length.c (+3)
(modified) clang/test/CodeGenSPIRV/Builtins/reflect.c (+3)
(modified) clang/test/CodeGenSPIRV/Builtins/smoothstep.c (+4)
(modified) llvm/include/llvm/IR/Intrinsics.td (+6-3)
(added) llvm/test/Transforms/ADCE/convergence.ll (+46)
(added) llvm/test/Transforms/BDCE/convergence.ll (+47)
(modified) llvm/test/Transforms/DCE/op_bundles.ll (+1)
(modified) llvm/test/Transforms/FunctionAttrs/convergent.ll (+15-1)
(modified) llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll (+4-4)
(modified) llvm/test/Transforms/LoopUnroll/convergent.controlled.ll (+8)

diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
index 56c523f6bc8cf..25062b8537aca 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s --check-prefixes=CHECK,SPIRV
 
 // All referenced to an unused resource should be removed by optimizations.
 RWBuffer<float> Buf : register(u5, space3);
@@ -10,6 +10,7 @@ void main() {
 // CHECK-NOT: resource.handlefrombinding
 // CHECK: define void @main()
 // CHECK-NEXT: entry:
+// SPIRV-NEXT: %0 = tail call token @llvm.experimental.convergence.entry()
 // CHECK-NEXT: ret void
 // CHECK-NOT: resource.handlefrombinding
 }
diff --git a/clang/test/CodeGenHLSL/builtins/distance.hlsl b/clang/test/CodeGenHLSL/builtins/distance.hlsl
index e830903261c8c..1d8f986bd12eb 100644
--- a/clang/test/CodeGenHLSL/builtins/distance.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/distance.hlsl
@@ -16,6 +16,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z18test_distance_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[X:%.*]], half noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn half [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
@@ -33,6 +34,7 @@ half test_distance_half(half X, half Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[X:%.*]], <2 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -50,6 +52,7 @@ half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[X:%.*]], <3 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -67,6 +70,7 @@ half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[X:%.*]], <4 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -83,6 +87,7 @@ half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z19test_distance_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[X:%.*]], float noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
@@ -100,6 +105,7 @@ float test_distance_float(float X, float Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[X:%.*]], <2 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -117,6 +123,7 @@ float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[X:%.*]], <3 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -134,6 +141,7 @@ float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[X:%.*]], <4 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
diff --git a/clang/test/CodeGenHLSL/builtins/length.hlsl b/clang/test/CodeGenHLSL/builtins/length.hlsl
index 2d4bbd995298f..9597d33f70d62 100644
--- a/clang/test/CodeGenHLSL/builtins/length.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/length.hlsl
@@ -20,6 +20,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z16test_length_halfDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[P0]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
 //
@@ -42,6 +43,7 @@ half test_length_half(half p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half2Dv2_Dh(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -61,6 +63,7 @@ half test_length_half2(half2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half3Dv3_Dh(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -80,6 +83,7 @@ half test_length_half3(half3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half4Dv4_Dh(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -98,6 +102,7 @@ half test_length_half4(half4 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z17test_length_floatf(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[P0]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
 //
@@ -117,6 +122,7 @@ float test_length_float(float p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float2Dv2_f(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -136,6 +142,7 @@ float test_length_float2(float2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float3Dv3_f(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -155,6 +162,7 @@ float test_length_float3(float3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float4Dv4_f(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/reflect.hlsl b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
index 35ee059697c4b..3f1f653e0f0f9 100644
--- a/clang/test/CodeGenHLSL/builtins/reflect.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
@@ -18,6 +18,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_reflect_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[I:%.*]], half noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[I]], 0xH4000
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[TMP0]], [[MUL_I]]
@@ -42,6 +43,7 @@ half test_reflect_half(half I, half N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z18test_reflect_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[I:%.*]], <2 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.reflect.v2f16(<2 x half> [[I]], <2 x half> [[N]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_REFLECT_I]]
 //
@@ -63,6 +65,7 @@ half2 test_reflect_half2(half2 I, half2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z18test_reflect_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[I:%.*]], <3 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.reflect.v3f16(<3 x half> [[I]], <3 x half> [[N]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_REFLECT_I]]
 //
@@ -84,6 +87,7 @@ half3 test_reflect_half3(half3 I, half3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z18test_reflect_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[I:%.*]], <4 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x half> @llvm.spv.reflect.v4f16(<4 x half> [[I]], <4 x half> [[N]])
 // SPVCHECK-NEXT:    ret <4 x half> [[SPV_REFLECT_I]]
 //
@@ -103,6 +107,7 @@ half4 test_reflect_half4(half4 I, half4 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_reflect_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[I:%.*]], float noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[I]], 2.000000e+00
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP0]], [[MUL_I]]
@@ -127,6 +132,7 @@ float test_reflect_float(float I, float N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x float> @_Z19test_reflect_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[I:%.*]], <2 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x float> @llvm.spv.reflect.v2f32(<2 x float> [[I]], <2 x float> [[N]])
 // SPVCHECK-NEXT:    ret <2 x float> [[SPV_REFLECT_I]]
 //
@@ -148,6 +154,7 @@ float2 test_reflect_float2(float2 I, float2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x float> @_Z19test_reflect_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[I:%.*]], <3 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x float> @llvm.spv.reflect.v3f32(<3 x float> [[I]], <3 x float> [[N]])
 // SPVCHECK-NEXT:    ret <3 x float> [[SPV_REFLECT_I]]
 //
@@ -169,6 +176,7 @@ float3 test_reflect_float3(float3 I, float3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x float> @_Z19test_reflect_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[I:%.*]], <4 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x float> @llvm.spv.reflect.v4f32(<4 x float> [[I]], <4 x float> [[N]])
 // SPVCHECK-NEXT:    ret <4 x float> [[SPV_REFLECT_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
index f2328c7330e6c..de95c11a138e7 100644
--- a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
@@ -22,6 +22,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z20test_smoothstep_halfDhDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[MIN:%.*]], half noundef nofpclass(nan inf) [[MAX:%.*]], half noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.smoothstep.f16(half [[MIN]], half [[MAX]], half [[X]])
 // SPVCHECK-NEXT:    ret half [[SPV_SMOOTHSTEP_I]]
 //
@@ -43,6 +44,7 @@ half test_smoothstep_half(half Min, half Max, half X) { return smoothstep(Min, M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z21test_smoothstep_half2Dv2_DhS_S_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <2 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <2 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.smoothstep.v2f16(<2 x half> [[MIN]], <2 x half> [[MAX]], <2 x half> [[X]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -64,6 +66,7 @@ half2 test_smoothstep_half2(half2 Min, half2 Max, half2 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z21test_smoothstep_half3Dv3_DhS_S_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <3 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <3 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.smoothstep.v3f16(<3 x half> [[MIN]], <3 x half> [[MAX]], <3 x half> [[X]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -85,6 +88,7 @@ half3 test_smoothstep_half3(half3 Min, half3 Max, half3 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z21test_smoothstep_half4Dv4_DhS_S_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <4 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <4 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[E...
[truncated]

llvmbot · 2025-04-08T11:46:51Z

@llvm/pr-subscribers-backend-spir-v

Author: Nathan Gauër (Keenuts)

Changes

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

This can be solved in different ways:

modify DCE passes to keep calls marked as convergent.
add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic.
fix optimization passed removing the attribute by also removing the tokens from the call.

I picked the second option: mark those as hasSideEffects, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.

Patch is 43.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134844.diff

16 Files Affected:

(modified) clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl (+2-1)
(modified) clang/test/CodeGenHLSL/builtins/distance.hlsl (+8)
(modified) clang/test/CodeGenHLSL/builtins/length.hlsl (+8)
(modified) clang/test/CodeGenHLSL/builtins/reflect.hlsl (+8)
(modified) clang/test/CodeGenHLSL/builtins/smoothstep.hlsl (+8)
(modified) clang/test/CodeGenSPIRV/Builtins/distance.c (+3)
(modified) clang/test/CodeGenSPIRV/Builtins/length.c (+3)
(modified) clang/test/CodeGenSPIRV/Builtins/reflect.c (+3)
(modified) clang/test/CodeGenSPIRV/Builtins/smoothstep.c (+4)
(modified) llvm/include/llvm/IR/Intrinsics.td (+6-3)
(added) llvm/test/Transforms/ADCE/convergence.ll (+46)
(added) llvm/test/Transforms/BDCE/convergence.ll (+47)
(modified) llvm/test/Transforms/DCE/op_bundles.ll (+1)
(modified) llvm/test/Transforms/FunctionAttrs/convergent.ll (+15-1)
(modified) llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll (+4-4)
(modified) llvm/test/Transforms/LoopUnroll/convergent.controlled.ll (+8)

diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
index 56c523f6bc8cf..25062b8537aca 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s --check-prefixes=CHECK,SPIRV
 
 // All referenced to an unused resource should be removed by optimizations.
 RWBuffer<float> Buf : register(u5, space3);
@@ -10,6 +10,7 @@ void main() {
 // CHECK-NOT: resource.handlefrombinding
 // CHECK: define void @main()
 // CHECK-NEXT: entry:
+// SPIRV-NEXT: %0 = tail call token @llvm.experimental.convergence.entry()
 // CHECK-NEXT: ret void
 // CHECK-NOT: resource.handlefrombinding
 }
diff --git a/clang/test/CodeGenHLSL/builtins/distance.hlsl b/clang/test/CodeGenHLSL/builtins/distance.hlsl
index e830903261c8c..1d8f986bd12eb 100644
--- a/clang/test/CodeGenHLSL/builtins/distance.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/distance.hlsl
@@ -16,6 +16,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z18test_distance_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[X:%.*]], half noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn half [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
@@ -33,6 +34,7 @@ half test_distance_half(half X, half Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[X:%.*]], <2 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -50,6 +52,7 @@ half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[X:%.*]], <3 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -67,6 +70,7 @@ half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[X:%.*]], <4 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -83,6 +87,7 @@ half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z19test_distance_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[X:%.*]], float noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
@@ -100,6 +105,7 @@ float test_distance_float(float X, float Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[X:%.*]], <2 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -117,6 +123,7 @@ float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[X:%.*]], <3 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -134,6 +141,7 @@ float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[X:%.*]], <4 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
diff --git a/clang/test/CodeGenHLSL/builtins/length.hlsl b/clang/test/CodeGenHLSL/builtins/length.hlsl
index 2d4bbd995298f..9597d33f70d62 100644
--- a/clang/test/CodeGenHLSL/builtins/length.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/length.hlsl
@@ -20,6 +20,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z16test_length_halfDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[P0]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
 //
@@ -42,6 +43,7 @@ half test_length_half(half p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half2Dv2_Dh(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -61,6 +63,7 @@ half test_length_half2(half2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half3Dv3_Dh(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -80,6 +83,7 @@ half test_length_half3(half3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half4Dv4_Dh(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -98,6 +102,7 @@ half test_length_half4(half4 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z17test_length_floatf(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[P0]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
 //
@@ -117,6 +122,7 @@ float test_length_float(float p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float2Dv2_f(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -136,6 +142,7 @@ float test_length_float2(float2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float3Dv3_f(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -155,6 +162,7 @@ float test_length_float3(float3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float4Dv4_f(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/reflect.hlsl b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
index 35ee059697c4b..3f1f653e0f0f9 100644
--- a/clang/test/CodeGenHLSL/builtins/reflect.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
@@ -18,6 +18,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_reflect_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[I:%.*]], half noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[I]], 0xH4000
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[TMP0]], [[MUL_I]]
@@ -42,6 +43,7 @@ half test_reflect_half(half I, half N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z18test_reflect_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[I:%.*]], <2 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.reflect.v2f16(<2 x half> [[I]], <2 x half> [[N]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_REFLECT_I]]
 //
@@ -63,6 +65,7 @@ half2 test_reflect_half2(half2 I, half2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z18test_reflect_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[I:%.*]], <3 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.reflect.v3f16(<3 x half> [[I]], <3 x half> [[N]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_REFLECT_I]]
 //
@@ -84,6 +87,7 @@ half3 test_reflect_half3(half3 I, half3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z18test_reflect_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[I:%.*]], <4 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x half> @llvm.spv.reflect.v4f16(<4 x half> [[I]], <4 x half> [[N]])
 // SPVCHECK-NEXT:    ret <4 x half> [[SPV_REFLECT_I]]
 //
@@ -103,6 +107,7 @@ half4 test_reflect_half4(half4 I, half4 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_reflect_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[I:%.*]], float noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[I]], 2.000000e+00
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP0]], [[MUL_I]]
@@ -127,6 +132,7 @@ float test_reflect_float(float I, float N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x float> @_Z19test_reflect_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[I:%.*]], <2 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x float> @llvm.spv.reflect.v2f32(<2 x float> [[I]], <2 x float> [[N]])
 // SPVCHECK-NEXT:    ret <2 x float> [[SPV_REFLECT_I]]
 //
@@ -148,6 +154,7 @@ float2 test_reflect_float2(float2 I, float2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x float> @_Z19test_reflect_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[I:%.*]], <3 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x float> @llvm.spv.reflect.v3f32(<3 x float> [[I]], <3 x float> [[N]])
 // SPVCHECK-NEXT:    ret <3 x float> [[SPV_REFLECT_I]]
 //
@@ -169,6 +176,7 @@ float3 test_reflect_float3(float3 I, float3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x float> @_Z19test_reflect_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[I:%.*]], <4 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x float> @llvm.spv.reflect.v4f32(<4 x float> [[I]], <4 x float> [[N]])
 // SPVCHECK-NEXT:    ret <4 x float> [[SPV_REFLECT_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
index f2328c7330e6c..de95c11a138e7 100644
--- a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
@@ -22,6 +22,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z20test_smoothstep_halfDhDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[MIN:%.*]], half noundef nofpclass(nan inf) [[MAX:%.*]], half noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.smoothstep.f16(half [[MIN]], half [[MAX]], half [[X]])
 // SPVCHECK-NEXT:    ret half [[SPV_SMOOTHSTEP_I]]
 //
@@ -43,6 +44,7 @@ half test_smoothstep_half(half Min, half Max, half X) { return smoothstep(Min, M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z21test_smoothstep_half2Dv2_DhS_S_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <2 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <2 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.smoothstep.v2f16(<2 x half> [[MIN]], <2 x half> [[MAX]], <2 x half> [[X]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -64,6 +66,7 @@ half2 test_smoothstep_half2(half2 Min, half2 Max, half2 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z21test_smoothstep_half3Dv3_DhS_S_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <3 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <3 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.smoothstep.v3f16(<3 x half> [[MIN]], <3 x half> [[MAX]], <3 x half> [[X]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -85,6 +88,7 @@ half3 test_smoothstep_half3(half3 Min, half3 Max, half3 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z21test_smoothstep_half4Dv4_DhS_S_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <4 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <4 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[E...
[truncated]

arsenm

These should not have side effects. The problem you are experiencing is because the design of the convergent attribute is broken. It imposes a structural requirement on the IR, rather than relaxing an assumed restriction. Whatever is removing the convergent needs to consider the uses (or more likely, just leave it alone)

We should make convergent the IR default, and add noconvergent as an assertion that no convergent operations are used. And it becomes UB if a convergent operation ends up used in convergent code. I was working on https://reviews.llvm.org/D69498 and would like to get back to it

Keenuts · 2025-04-08T12:11:46Z

These should not have side effects.

I agree I'm stretching the "hasSideEffect" definition here.

Whatever is removing the convergent needs to consider the uses (or more likely, just leave it alone)

I suspect the long-term change to change the default IR to assume convergent will take some time as it will impact many subprojects. Would you be OK with me patching the several DCE functions to not drop convergence intrinsics instead? It would solve this issue in practice, allowing us to move forward with addrspace cast legalization.

arsenm · 2025-04-08T13:13:31Z

I suspect the long-term change to change the default IR to assume convergent will take some time as it will impact many subprojects.

Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement

Would you be OK with me patching the several DCE functions to not drop convergence intrinsics instead? It would solve this issue in practice, allowing us to move forward with addrspace cast legalization.

This is still trying to fix this in a roundabout way. You should be stopping the strip of the convergent attribute, not the intrinsic uses that happen to be in the function

Keenuts · 2025-04-08T15:34:12Z

Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement

Looking at the PR you linked, seems like there was still not a clear consensus on the default change no? (And I'd assume consumers like llvm-translator won't be too happy about this breaking change no?)

This is still trying to fix this in a roundabout way. You should be stopping the strip of the convergent attribute, not the intrinsic uses that happen to be in the function

Fair enough, sent another PR which handles this by fixing the FunctionAttr pass, preventing removal of the convergent attr in those cases.

arsenm · 2025-04-08T15:47:36Z

Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement

Looking at the PR you linked, seems like there was still not a clear consensus on the default change no?

Yes, there were people who are wrong and need to find the time to push this forward.

(And I'd assume consumers like llvm-translator won't be too happy about this breaking change no?)

It's not a breaking change, it's a theoretical loss of optimization by default in unanalyzable call situations.

Even if we leave the wart of not fixing wrong by default, we still need a noconvergent attribute

ssahasra · 2025-04-09T10:31:26Z

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

I didn't understand the validity part. Why is the caller required to be convergent in order to add a token to a callsite?

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused. This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.

Seems right to me.

Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

Seems correct again.

Would you be OK with me patching the several DCE functions to not drop convergence intrinsics instead?

Why is DCE dropping convergence intrinsics? Is it because it cannot see the operand bundles as legit uses?

ssahasra · 2025-04-09T10:35:27Z

From the spec for convergence control tokens:
https://llvm.org/docs/ConvergentOperations.html#inferring-non-convergence

An optimizer may remove the convergent attribute on a function if it can prove that the function does not contain a call to llvm.experimental.convergence.entry, or any uncontrolled convergent operations.

Keenuts · 2025-04-09T11:10:45Z

FYI: there is this PR which I think will replace this one: #134863

I didn't understand the validity part. Why is the caller required to be convergent in order to add a token to a callsite?

Given this example:

declare i32 @foo()

define i32 @bar(i32 %a) convergent {
  %tk = call token @llvm.experimental.convergence.entry()
  %rs = call i32 @foo(i32 %a)
  ret i32 %rs
}

define i32 @baz(i32 %a) convergent {
  %tk = call token @llvm.experimental.convergence.entry()
  %rs = call i32 @bar(i32 %a) [ "convergencectrl"(token %tk) ]
  ret i32 %rs
}

foo is not marked as convergent, but only declared. Nothing to do.
DCE runs on bar, which has a call to a convergence intrinsic. But its value is not used, hence considers this instruction as dead, and removes it.
Now, FunctionAttr runs, and only sees 2 instructions: call + ret. The call is calling the non-convergent foo function, hence the pass removes the convergent attribute from the function.

Now, baz has a call with a convergence token, to a non-convergent function. THis triggers a validation error: Convergence control token can only be used in a convergent call.

Keenuts · 2025-04-09T11:14:11Z

miss-typed enter, sent the comment above too early.

There are multiple ways to solve this issue:

fix the FunctionAttr to run DCE again once the convergent attribute is removed to make sure those invalid cases don't happen.
fix the FunctionAttr to not remove convergent if the function is used in a convergent call (the linked PR)
fix DCE to now remove convergence intrinsics, thus preventing FunctionAttr to wrongly remove the convergence attribute because the presence on such intrinsic is a sign the function needs it.

arsenm · 2025-04-09T13:27:17Z

Thinking we should remove the IR verification requirement. It's more of a lint type check and violations would be UB

ssahasra · 2025-04-10T03:56:39Z

Thanks for the example! I think ~~133684~~ #134863 is a correct incremental step forward, but maybe just going the whole way bottom-up is better. It is also correct to remove the verifier check. It's not a well-formedness error or even a semantic error to pass a convergence control token to a non-convergent call. It is also not UB. The only outcome is that the user will not get the convergence they expected, and this is not even observable since there are no convergent operations that will be affected by this unused token.

@nhaehnle does this look okay to you?

ssahasra · 2025-04-10T04:04:42Z

To take this to its logical conclusion, when convergence tokens are in use, the convergent attribute is redundant. All we need is a noconvergent attribute for function declarations. A function definition is convergent iff the body contains a call to the entry intrinsic, and a function declaration is assumed to be convergent unless it has a noconvergent attribute on it. So the immediate impact on the specification is that everywhere including the verifier, it is always okay to ignore the convergent attribute on a function definition if convergence tokens are in use.

Keenuts · 2025-04-10T07:30:37Z

Thanks for the example! I think ~~133684~~ #134863 is a correct incremental step forward

I suppose the PR number you mention is the wrong one?

Keenuts · 2025-04-10T07:33:58Z

It is also correct to remove the verifier check. It's not a well-formedness error or even a semantic error to pass a convergence control token to a non-convergent call. It is also not UB. The only outcome is that the user will not get the convergence they expected, and this is not even observable since there are no convergent operations that will be affected by this unused token.

Thanks for the details!

Before this commit, having a convergence token on a non-convergent call was considered to be an error. This commit relaxes this requirement and allows convergence tokens to be present on non-convergent calls. When such token is present, they have no effect as the underlying call is non-convergent. This allows passes like DCE to strip `convergent` attribute from functions for which all convergent operations have been stripped. When this happens, a convergence token can still exist in the call-site, causing the verifier to complain. Alternatives have been considered in llvm#134863 and llvm#134844.

Before this commit, having a convergence token on a non-convergent call was considered to be an error. This commit relaxes this requirement and allows convergence tokens to be present on non-convergent calls. When such token is present, they have no effect as the underlying call is non-convergent. This allows passes like DCE to strip `convergent` attribute from functions for which all convergent operations have been stripped. When this happens, a convergence token can still exist in the call-site, causing the verifier to complain. Alternatives have been considered in #134863 and #134844.

Keenuts · 2025-05-02T08:18:23Z

Closed in favor of #135794

Before this commit, having a convergence token on a non-convergent call was considered to be an error. This commit relaxes this requirement and allows convergence tokens to be present on non-convergent calls. When such token is present, they have no effect as the underlying call is non-convergent. This allows passes like DCE to strip `convergent` attribute from functions for which all convergent operations have been stripped. When this happens, a convergence token can still exist in the call-site, causing the verifier to complain. Alternatives have been considered in llvm#134863 and llvm#134844.

Keenuts requested review from arsenm and s-perron April 8, 2025 11:46

llvmbot added clang Clang issues not falling into any other category HLSL HLSL Language Support backend:SPIR-V llvm:ir llvm:transforms labels Apr 8, 2025

github-project-automation bot added this to HLSL Support Apr 8, 2025

Keenuts requested a review from ssahasra April 8, 2025 11:48

arsenm requested changes Apr 8, 2025

View reviewed changes

ssahasra mentioned this pull request Apr 11, 2025

[IPO] Prevent removal of some convergent attr #134863

Closed

Keenuts mentioned this pull request Apr 15, 2025

[IR] Relax convergence requirements on call #135794

Merged

s-perron removed their request for review April 16, 2025 18:34

Keenuts closed this May 2, 2025

github-project-automation bot moved this to Closed in HLSL Support May 2, 2025

damyanp removed this from HLSL Support Jun 25, 2025

Keenuts deleted the convergent-side-effect branch September 9, 2025 15:11

[IR] Mark convergence intrins as has-side-effect #134844

[IR] Mark convergence intrins as has-side-effect #134844

Uh oh!

Conversation

Keenuts commented Apr 8, 2025

Uh oh!

llvmbot commented Apr 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Apr 8, 2025

Uh oh!

arsenm left a comment

Choose a reason for hiding this comment

Uh oh!

Keenuts commented Apr 8, 2025

Uh oh!

arsenm commented Apr 8, 2025

Uh oh!

Keenuts commented Apr 8, 2025

Uh oh!

arsenm commented Apr 8, 2025

Uh oh!

ssahasra commented Apr 9, 2025

Uh oh!

ssahasra commented Apr 9, 2025

Uh oh!

Keenuts commented Apr 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Keenuts commented Apr 9, 2025

Uh oh!

arsenm commented Apr 9, 2025

Uh oh!

ssahasra commented Apr 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ssahasra commented Apr 10, 2025

Uh oh!

Keenuts commented Apr 10, 2025 • edited by MaskRay Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Keenuts commented Apr 10, 2025

Uh oh!

Keenuts commented May 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

llvmbot commented Apr 8, 2025 •

edited

Loading

Keenuts commented Apr 9, 2025 •

edited

Loading

ssahasra commented Apr 10, 2025 •

edited

Loading

Keenuts commented Apr 10, 2025 •

edited by MaskRay

Loading