Skip to content

Conversation

@Keenuts
Copy link
Contributor

@Keenuts Keenuts commented Apr 8, 2025

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

This can be solved in different ways:

  • modify DCE passes to keep calls marked as convergent.
  • add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic.
  • fix optimization passed removing the attribute by also removing the tokens from the call.

I picked the second option: mark those as hasSideEffects, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.

When a callee is marked as `convergent`, some targets like HLSL/SPIR-V
add a convergent token to the call.
This is valid if both functions are marked as `convergent`.

ADCE/BDCE and other DCE passes were allowed to remove convergence
intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics.
This would allow further optimization to remove the `convergent`
attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token
attached to a call function calling a non-convergent function.

This can be solved in different ways:
 - modify DCE passes to keep calls marked as convergent.
 - add a 'hasSideEffect' to the intrinsics definition, preventing DCE to
   remove the intrinsic.
 - fix optimization passed removing the attribute by also removing the
   tokens from the call.

I picked the second option: mark those as `hasSideEffects`, because
convergence intrinsics presence impact the divergence/reconvergence
location of the IR, hence their simple presence has an implicit meaning.
This is particularly important for SPIR-V as the control flow shape will
depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the
convergence intrinsics. After all, they are important, but the widening
the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly
from the convergence intrinsics presence to correctly structurize loops.
@Keenuts Keenuts requested review from arsenm and s-perron April 8, 2025 11:46
@llvmbot llvmbot added clang Clang issues not falling into any other category HLSL HLSL Language Support backend:SPIR-V llvm:ir llvm:transforms labels Apr 8, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 8, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-llvm-ir

Author: Nathan Gauër (Keenuts)

Changes

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

This can be solved in different ways:

  • modify DCE passes to keep calls marked as convergent.
  • add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic.
  • fix optimization passed removing the attribute by also removing the tokens from the call.

I picked the second option: mark those as hasSideEffects, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.


Patch is 43.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134844.diff

16 Files Affected:

  • (modified) clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl (+2-1)
  • (modified) clang/test/CodeGenHLSL/builtins/distance.hlsl (+8)
  • (modified) clang/test/CodeGenHLSL/builtins/length.hlsl (+8)
  • (modified) clang/test/CodeGenHLSL/builtins/reflect.hlsl (+8)
  • (modified) clang/test/CodeGenHLSL/builtins/smoothstep.hlsl (+8)
  • (modified) clang/test/CodeGenSPIRV/Builtins/distance.c (+3)
  • (modified) clang/test/CodeGenSPIRV/Builtins/length.c (+3)
  • (modified) clang/test/CodeGenSPIRV/Builtins/reflect.c (+3)
  • (modified) clang/test/CodeGenSPIRV/Builtins/smoothstep.c (+4)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+6-3)
  • (added) llvm/test/Transforms/ADCE/convergence.ll (+46)
  • (added) llvm/test/Transforms/BDCE/convergence.ll (+47)
  • (modified) llvm/test/Transforms/DCE/op_bundles.ll (+1)
  • (modified) llvm/test/Transforms/FunctionAttrs/convergent.ll (+15-1)
  • (modified) llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopUnroll/convergent.controlled.ll (+8)
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
index 56c523f6bc8cf..25062b8537aca 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s --check-prefixes=CHECK,SPIRV
 
 // All referenced to an unused resource should be removed by optimizations.
 RWBuffer<float> Buf : register(u5, space3);
@@ -10,6 +10,7 @@ void main() {
 // CHECK-NOT: resource.handlefrombinding
 // CHECK: define void @main()
 // CHECK-NEXT: entry:
+// SPIRV-NEXT: %0 = tail call token @llvm.experimental.convergence.entry()
 // CHECK-NEXT: ret void
 // CHECK-NOT: resource.handlefrombinding
 }
diff --git a/clang/test/CodeGenHLSL/builtins/distance.hlsl b/clang/test/CodeGenHLSL/builtins/distance.hlsl
index e830903261c8c..1d8f986bd12eb 100644
--- a/clang/test/CodeGenHLSL/builtins/distance.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/distance.hlsl
@@ -16,6 +16,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z18test_distance_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[X:%.*]], half noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn half [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
@@ -33,6 +34,7 @@ half test_distance_half(half X, half Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[X:%.*]], <2 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -50,6 +52,7 @@ half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[X:%.*]], <3 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -67,6 +70,7 @@ half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[X:%.*]], <4 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -83,6 +87,7 @@ half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z19test_distance_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[X:%.*]], float noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
@@ -100,6 +105,7 @@ float test_distance_float(float X, float Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[X:%.*]], <2 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -117,6 +123,7 @@ float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[X:%.*]], <3 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -134,6 +141,7 @@ float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[X:%.*]], <4 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
diff --git a/clang/test/CodeGenHLSL/builtins/length.hlsl b/clang/test/CodeGenHLSL/builtins/length.hlsl
index 2d4bbd995298f..9597d33f70d62 100644
--- a/clang/test/CodeGenHLSL/builtins/length.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/length.hlsl
@@ -20,6 +20,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z16test_length_halfDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[P0]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
 //
@@ -42,6 +43,7 @@ half test_length_half(half p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half2Dv2_Dh(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -61,6 +63,7 @@ half test_length_half2(half2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half3Dv3_Dh(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -80,6 +83,7 @@ half test_length_half3(half3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half4Dv4_Dh(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -98,6 +102,7 @@ half test_length_half4(half4 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z17test_length_floatf(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[P0]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
 //
@@ -117,6 +122,7 @@ float test_length_float(float p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float2Dv2_f(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -136,6 +142,7 @@ float test_length_float2(float2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float3Dv3_f(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -155,6 +162,7 @@ float test_length_float3(float3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float4Dv4_f(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/reflect.hlsl b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
index 35ee059697c4b..3f1f653e0f0f9 100644
--- a/clang/test/CodeGenHLSL/builtins/reflect.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
@@ -18,6 +18,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_reflect_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[I:%.*]], half noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[I]], 0xH4000
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[TMP0]], [[MUL_I]]
@@ -42,6 +43,7 @@ half test_reflect_half(half I, half N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z18test_reflect_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[I:%.*]], <2 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.reflect.v2f16(<2 x half> [[I]], <2 x half> [[N]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_REFLECT_I]]
 //
@@ -63,6 +65,7 @@ half2 test_reflect_half2(half2 I, half2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z18test_reflect_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[I:%.*]], <3 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.reflect.v3f16(<3 x half> [[I]], <3 x half> [[N]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_REFLECT_I]]
 //
@@ -84,6 +87,7 @@ half3 test_reflect_half3(half3 I, half3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z18test_reflect_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[I:%.*]], <4 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x half> @llvm.spv.reflect.v4f16(<4 x half> [[I]], <4 x half> [[N]])
 // SPVCHECK-NEXT:    ret <4 x half> [[SPV_REFLECT_I]]
 //
@@ -103,6 +107,7 @@ half4 test_reflect_half4(half4 I, half4 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_reflect_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[I:%.*]], float noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[I]], 2.000000e+00
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP0]], [[MUL_I]]
@@ -127,6 +132,7 @@ float test_reflect_float(float I, float N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x float> @_Z19test_reflect_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[I:%.*]], <2 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x float> @llvm.spv.reflect.v2f32(<2 x float> [[I]], <2 x float> [[N]])
 // SPVCHECK-NEXT:    ret <2 x float> [[SPV_REFLECT_I]]
 //
@@ -148,6 +154,7 @@ float2 test_reflect_float2(float2 I, float2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x float> @_Z19test_reflect_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[I:%.*]], <3 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x float> @llvm.spv.reflect.v3f32(<3 x float> [[I]], <3 x float> [[N]])
 // SPVCHECK-NEXT:    ret <3 x float> [[SPV_REFLECT_I]]
 //
@@ -169,6 +176,7 @@ float3 test_reflect_float3(float3 I, float3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x float> @_Z19test_reflect_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[I:%.*]], <4 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x float> @llvm.spv.reflect.v4f32(<4 x float> [[I]], <4 x float> [[N]])
 // SPVCHECK-NEXT:    ret <4 x float> [[SPV_REFLECT_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
index f2328c7330e6c..de95c11a138e7 100644
--- a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
@@ -22,6 +22,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z20test_smoothstep_halfDhDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[MIN:%.*]], half noundef nofpclass(nan inf) [[MAX:%.*]], half noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.smoothstep.f16(half [[MIN]], half [[MAX]], half [[X]])
 // SPVCHECK-NEXT:    ret half [[SPV_SMOOTHSTEP_I]]
 //
@@ -43,6 +44,7 @@ half test_smoothstep_half(half Min, half Max, half X) { return smoothstep(Min, M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z21test_smoothstep_half2Dv2_DhS_S_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <2 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <2 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.smoothstep.v2f16(<2 x half> [[MIN]], <2 x half> [[MAX]], <2 x half> [[X]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -64,6 +66,7 @@ half2 test_smoothstep_half2(half2 Min, half2 Max, half2 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z21test_smoothstep_half3Dv3_DhS_S_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <3 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <3 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.smoothstep.v3f16(<3 x half> [[MIN]], <3 x half> [[MAX]], <3 x half> [[X]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -85,6 +88,7 @@ half3 test_smoothstep_half3(half3 Min, half3 Max, half3 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z21test_smoothstep_half4Dv4_DhS_S_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <4 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <4 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[E...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Apr 8, 2025

@llvm/pr-subscribers-backend-spir-v

Author: Nathan Gauër (Keenuts)

Changes

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused.
This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.
Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

This can be solved in different ways:

  • modify DCE passes to keep calls marked as convergent.
  • add a 'hasSideEffect' to the intrinsics definition, preventing DCE to remove the intrinsic.
  • fix optimization passed removing the attribute by also removing the tokens from the call.

I picked the second option: mark those as hasSideEffects, because convergence intrinsics presence impact the divergence/reconvergence location of the IR, hence their simple presence has an implicit meaning. This is particularly important for SPIR-V as the control flow shape will depend on the presence of those intrinsics.

An alternative I could agree to modifying each pass to keep the convergence intrinsics. After all, they are important, but the widening the 'hadSideEffect' meaning could be discussed.

I however would be against the 3rd solution: SPIR-V benefits greatly from the convergence intrinsics presence to correctly structurize loops.


Patch is 43.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134844.diff

16 Files Affected:

  • (modified) clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl (+2-1)
  • (modified) clang/test/CodeGenHLSL/builtins/distance.hlsl (+8)
  • (modified) clang/test/CodeGenHLSL/builtins/length.hlsl (+8)
  • (modified) clang/test/CodeGenHLSL/builtins/reflect.hlsl (+8)
  • (modified) clang/test/CodeGenHLSL/builtins/smoothstep.hlsl (+8)
  • (modified) clang/test/CodeGenSPIRV/Builtins/distance.c (+3)
  • (modified) clang/test/CodeGenSPIRV/Builtins/length.c (+3)
  • (modified) clang/test/CodeGenSPIRV/Builtins/reflect.c (+3)
  • (modified) clang/test/CodeGenSPIRV/Builtins/smoothstep.c (+4)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+6-3)
  • (added) llvm/test/Transforms/ADCE/convergence.ll (+46)
  • (added) llvm/test/Transforms/BDCE/convergence.ll (+47)
  • (modified) llvm/test/Transforms/DCE/op_bundles.ll (+1)
  • (modified) llvm/test/Transforms/FunctionAttrs/convergent.ll (+15-1)
  • (modified) llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopUnroll/convergent.controlled.ll (+8)
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
index 56c523f6bc8cf..25062b8537aca 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-constructor-opt.hlsl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple spirv-vulkan-compute -x hlsl -emit-llvm -O3 -o - %s | FileCheck %s --check-prefixes=CHECK,SPIRV
 
 // All referenced to an unused resource should be removed by optimizations.
 RWBuffer<float> Buf : register(u5, space3);
@@ -10,6 +10,7 @@ void main() {
 // CHECK-NOT: resource.handlefrombinding
 // CHECK: define void @main()
 // CHECK-NEXT: entry:
+// SPIRV-NEXT: %0 = tail call token @llvm.experimental.convergence.entry()
 // CHECK-NEXT: ret void
 // CHECK-NOT: resource.handlefrombinding
 }
diff --git a/clang/test/CodeGenHLSL/builtins/distance.hlsl b/clang/test/CodeGenHLSL/builtins/distance.hlsl
index e830903261c8c..1d8f986bd12eb 100644
--- a/clang/test/CodeGenHLSL/builtins/distance.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/distance.hlsl
@@ -16,6 +16,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z18test_distance_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[X:%.*]], half noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn half [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
@@ -33,6 +34,7 @@ half test_distance_half(half X, half Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[X:%.*]], <2 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -50,6 +52,7 @@ half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[X:%.*]], <3 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -67,6 +70,7 @@ half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z19test_distance_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[X:%.*]], <4 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x half> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
@@ -83,6 +87,7 @@ half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z19test_distance_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[X:%.*]], float noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
@@ -100,6 +105,7 @@ float test_distance_float(float X, float Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[X:%.*]], <2 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <2 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -117,6 +123,7 @@ float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[X:%.*]], <3 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <3 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
@@ -134,6 +141,7 @@ float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z20test_distance_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[X:%.*]], <4 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SUB_I:%.*]] = fsub reassoc nnan ninf nsz arcp afn <4 x float> [[X]], [[Y]]
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
diff --git a/clang/test/CodeGenHLSL/builtins/length.hlsl b/clang/test/CodeGenHLSL/builtins/length.hlsl
index 2d4bbd995298f..9597d33f70d62 100644
--- a/clang/test/CodeGenHLSL/builtins/length.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/length.hlsl
@@ -20,6 +20,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z16test_length_halfDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[P0]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
 //
@@ -42,6 +43,7 @@ half test_length_half(half p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half2Dv2_Dh(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -61,6 +63,7 @@ half test_length_half2(half2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half3Dv3_Dh(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -80,6 +83,7 @@ half test_length_half3(half3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_length_half4Dv4_Dh(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[P0]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
@@ -98,6 +102,7 @@ half test_length_half4(half4 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z17test_length_floatf(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[P0]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
 //
@@ -117,6 +122,7 @@ float test_length_float(float p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float2Dv2_f(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -136,6 +142,7 @@ float test_length_float2(float2 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float3Dv3_f(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
@@ -155,6 +162,7 @@ float test_length_float3(float3 p0)
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_length_float4Dv4_f(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P0:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[P0]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/reflect.hlsl b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
index 35ee059697c4b..3f1f653e0f0f9 100644
--- a/clang/test/CodeGenHLSL/builtins/reflect.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/reflect.hlsl
@@ -18,6 +18,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z17test_reflect_halfDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[I:%.*]], half noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[I]], 0xH4000
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[TMP0]], [[MUL_I]]
@@ -42,6 +43,7 @@ half test_reflect_half(half I, half N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z18test_reflect_half2Dv2_DhS_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[I:%.*]], <2 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.reflect.v2f16(<2 x half> [[I]], <2 x half> [[N]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_REFLECT_I]]
 //
@@ -63,6 +65,7 @@ half2 test_reflect_half2(half2 I, half2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z18test_reflect_half3Dv3_DhS_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[I:%.*]], <3 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.reflect.v3f16(<3 x half> [[I]], <3 x half> [[N]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_REFLECT_I]]
 //
@@ -84,6 +87,7 @@ half3 test_reflect_half3(half3 I, half3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z18test_reflect_half4Dv4_DhS_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[I:%.*]], <4 x half> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x half> @llvm.spv.reflect.v4f16(<4 x half> [[I]], <4 x half> [[N]])
 // SPVCHECK-NEXT:    ret <4 x half> [[SPV_REFLECT_I]]
 //
@@ -103,6 +107,7 @@ half4 test_reflect_half4(half4 I, half4 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) float @_Z18test_reflect_floatff(
 // SPVCHECK-SAME: float noundef nofpclass(nan inf) [[I:%.*]], float noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[MUL_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[I]], 2.000000e+00
 // SPVCHECK-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[N]], [[N]]
 // SPVCHECK-NEXT:    [[MUL2_I:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP0]], [[MUL_I]]
@@ -127,6 +132,7 @@ float test_reflect_float(float I, float N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x float> @_Z19test_reflect_float2Dv2_fS_(
 // SPVCHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[I:%.*]], <2 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x float> @llvm.spv.reflect.v2f32(<2 x float> [[I]], <2 x float> [[N]])
 // SPVCHECK-NEXT:    ret <2 x float> [[SPV_REFLECT_I]]
 //
@@ -148,6 +154,7 @@ float2 test_reflect_float2(float2 I, float2 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x float> @_Z19test_reflect_float3Dv3_fS_(
 // SPVCHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[I:%.*]], <3 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x float> @llvm.spv.reflect.v3f32(<3 x float> [[I]], <3 x float> [[N]])
 // SPVCHECK-NEXT:    ret <3 x float> [[SPV_REFLECT_I]]
 //
@@ -169,6 +176,7 @@ float3 test_reflect_float3(float3 I, float3 N) {
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x float> @_Z19test_reflect_float4Dv4_fS_(
 // SPVCHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[I:%.*]], <4 x float> noundef nofpclass(nan inf) [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_REFLECT_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <4 x float> @llvm.spv.reflect.v4f32(<4 x float> [[I]], <4 x float> [[N]])
 // SPVCHECK-NEXT:    ret <4 x float> [[SPV_REFLECT_I]]
 //
diff --git a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
index f2328c7330e6c..de95c11a138e7 100644
--- a/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/smoothstep.hlsl
@@ -22,6 +22,7 @@
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) half @_Z20test_smoothstep_halfDhDhDh(
 // SPVCHECK-SAME: half noundef nofpclass(nan inf) [[MIN:%.*]], half noundef nofpclass(nan inf) [[MAX:%.*]], half noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.smoothstep.f16(half [[MIN]], half [[MAX]], half [[X]])
 // SPVCHECK-NEXT:    ret half [[SPV_SMOOTHSTEP_I]]
 //
@@ -43,6 +44,7 @@ half test_smoothstep_half(half Min, half Max, half X) { return smoothstep(Min, M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <2 x half> @_Z21test_smoothstep_half2Dv2_DhS_S_(
 // SPVCHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <2 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <2 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <2 x half> @llvm.spv.smoothstep.v2f16(<2 x half> [[MIN]], <2 x half> [[MAX]], <2 x half> [[X]])
 // SPVCHECK-NEXT:    ret <2 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -64,6 +66,7 @@ half2 test_smoothstep_half2(half2 Min, half2 Max, half2 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <3 x half> @_Z21test_smoothstep_half3Dv3_DhS_S_(
 // SPVCHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <3 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <3 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[ENTRY:.*:]]
+// SPVCHECK-NEXT:    {{.*}} = tail call token @llvm.experimental.convergence.entry()
 // SPVCHECK-NEXT:    [[SPV_SMOOTHSTEP_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef <3 x half> @llvm.spv.smoothstep.v3f16(<3 x half> [[MIN]], <3 x half> [[MAX]], <3 x half> [[X]])
 // SPVCHECK-NEXT:    ret <3 x half> [[SPV_SMOOTHSTEP_I]]
 //
@@ -85,6 +88,7 @@ half3 test_smoothstep_half3(half3 Min, half3 Max, half3 X) { return smoothstep(M
 // SPVCHECK-LABEL: define spir_func noundef nofpclass(nan inf) <4 x half> @_Z21test_smoothstep_half4Dv4_DhS_S_(
 // SPVCHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[MIN:%.*]], <4 x half> noundef nofpclass(nan inf) [[MAX:%.*]], <4 x half> noundef nofpclass(nan inf) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SPVCHECK-NEXT:  [[E...
[truncated]

@Keenuts Keenuts requested a review from ssahasra April 8, 2025 11:48
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should not have side effects. The problem you are experiencing is because the design of the convergent attribute is broken. It imposes a structural requirement on the IR, rather than relaxing an assumed restriction. Whatever is removing the convergent needs to consider the uses (or more likely, just leave it alone)

We should make convergent the IR default, and add noconvergent as an assertion that no convergent operations are used. And it becomes UB if a convergent operation ends up used in convergent code. I was working on https://reviews.llvm.org/D69498 and would like to get back to it

@Keenuts
Copy link
Contributor Author

Keenuts commented Apr 8, 2025

These should not have side effects.

I agree I'm stretching the "hasSideEffect" definition here.

Whatever is removing the convergent needs to consider the uses (or more likely, just leave it alone)

I suspect the long-term change to change the default IR to assume convergent will take some time as it will impact many subprojects. Would you be OK with me patching the several DCE functions to not drop convergence intrinsics instead? It would solve this issue in practice, allowing us to move forward with addrspace cast legalization.

@arsenm
Copy link
Contributor

arsenm commented Apr 8, 2025

I suspect the long-term change to change the default IR to assume convergent will take some time as it will impact many subprojects.

Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement

Would you be OK with me patching the several DCE functions to not drop convergence intrinsics instead? It would solve this issue in practice, allowing us to move forward with addrspace cast legalization.

This is still trying to fix this in a roundabout way. You should be stopping the strip of the convergent attribute, not the intrinsic uses that happen to be in the function

@Keenuts
Copy link
Contributor Author

Keenuts commented Apr 8, 2025

Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement

Looking at the PR you linked, seems like there was still not a clear consensus on the default change no? (And I'd assume consumers like llvm-translator won't be too happy about this breaking change no?)

This is still trying to fix this in a roundabout way. You should be stopping the strip of the convergent attribute, not the intrinsic uses that happen to be in the function

Fair enough, sent another PR which handles this by fixing the FunctionAttr pass, preventing removal of the convergent attr in those cases.

@arsenm
Copy link
Contributor

arsenm commented Apr 8, 2025

Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement

Looking at the PR you linked, seems like there was still not a clear consensus on the default change no?

Yes, there were people who are wrong and need to find the time to push this forward.

(And I'd assume consumers like llvm-translator won't be too happy about this breaking change no?)

It's not a breaking change, it's a theoretical loss of optimization by default in unanalyzable call situations.

Even if we leave the wart of not fixing wrong by default, we still need a noconvergent attribute

@ssahasra
Copy link
Collaborator

ssahasra commented Apr 9, 2025

When a callee is marked as convergent, some targets like HLSL/SPIR-V add a convergent token to the call.
This is valid if both functions are marked as convergent.

I didn't understand the validity part. Why is the caller required to be convergent in order to add a token to a callsite?

ADCE/BDCE and other DCE passes were allowed to remove convergence intrinsics when their token were unused. This meant a leaf function could lose all its convergence intrinsics. This would allow further optimization to remove the convergent attribute from the callee.

Seems right to me.

Issue was the caller was not updated, and we now had a convergence token attached to a call function calling a non-convergent function.

Seems correct again.

Would you be OK with me patching the several DCE functions to not drop convergence intrinsics instead?

Why is DCE dropping convergence intrinsics? Is it because it cannot see the operand bundles as legit uses?

@ssahasra
Copy link
Collaborator

ssahasra commented Apr 9, 2025

From the spec for convergence control tokens:
https://llvm.org/docs/ConvergentOperations.html#inferring-non-convergence

An optimizer may remove the convergent attribute on a function if it can prove that the function does not contain a call to llvm.experimental.convergence.entry, or any uncontrolled convergent operations.

@Keenuts
Copy link
Contributor Author

Keenuts commented Apr 9, 2025

FYI: there is this PR which I think will replace this one: #134863

I didn't understand the validity part. Why is the caller required to be convergent in order to add a token to a callsite?

Given this example:

declare i32 @foo()

define i32 @bar(i32 %a) convergent {
  %tk = call token @llvm.experimental.convergence.entry()
  %rs = call i32 @foo(i32 %a)
  ret i32 %rs
}

define i32 @baz(i32 %a) convergent {
  %tk = call token @llvm.experimental.convergence.entry()
  %rs = call i32 @bar(i32 %a) [ "convergencectrl"(token %tk) ]
  ret i32 %rs
}

foo is not marked as convergent, but only declared. Nothing to do.
DCE runs on bar, which has a call to a convergence intrinsic. But its value is not used, hence considers this instruction as dead, and removes it.
Now, FunctionAttr runs, and only sees 2 instructions: call + ret. The call is calling the non-convergent foo function, hence the pass removes the convergent attribute from the function.

Now, baz has a call with a convergence token, to a non-convergent function. THis triggers a validation error: Convergence control token can only be used in a convergent call.

@Keenuts
Copy link
Contributor Author

Keenuts commented Apr 9, 2025

miss-typed enter, sent the comment above too early.

There are multiple ways to solve this issue:

  • fix the FunctionAttr to run DCE again once the convergent attribute is removed to make sure those invalid cases don't happen.
  • fix the FunctionAttr to not remove convergent if the function is used in a convergent call (the linked PR)
  • fix DCE to now remove convergence intrinsics, thus preventing FunctionAttr to wrongly remove the convergence attribute because the presence on such intrinsic is a sign the function needs it.

@arsenm
Copy link
Contributor

arsenm commented Apr 9, 2025

Thinking we should remove the IR verification requirement. It's more of a lint type check and violations would be UB

@ssahasra
Copy link
Collaborator

ssahasra commented Apr 10, 2025

Thanks for the example! I think 133684 #134863 is a correct incremental step forward, but maybe just going the whole way bottom-up is better. It is also correct to remove the verifier check. It's not a well-formedness error or even a semantic error to pass a convergence control token to a non-convergent call. It is also not UB. The only outcome is that the user will not get the convergence they expected, and this is not even observable since there are no convergent operations that will be affected by this unused token.

@nhaehnle does this look okay to you?

@ssahasra
Copy link
Collaborator

To take this to its logical conclusion, when convergence tokens are in use, the convergent attribute is redundant. All we need is a noconvergent attribute for function declarations. A function definition is convergent iff the body contains a call to the entry intrinsic, and a function declaration is assumed to be convergent unless it has a noconvergent attribute on it. So the immediate impact on the specification is that everywhere including the verifier, it is always okay to ignore the convergent attribute on a function definition if convergence tokens are in use.

@Keenuts
Copy link
Contributor Author

Keenuts commented Apr 10, 2025

Thanks for the example! I think 133684 #134863 is a correct incremental step forward

I suppose the PR number you mention is the wrong one?

@Keenuts
Copy link
Contributor Author

Keenuts commented Apr 10, 2025

It is also correct to remove the verifier check. It's not a well-formedness error or even a semantic error to pass a convergence control token to a non-convergent call. It is also not UB. The only outcome is that the user will not get the convergence they expected, and this is not even observable since there are no convergent operations that will be affected by this unused token.

Thanks for the details!

Keenuts added a commit to Keenuts/llvm-project that referenced this pull request Apr 15, 2025
Before this commit, having a convergence token on a non-convergent
call was considered to be an error.
This commit relaxes this requirement and allows convergence tokens
to be present on non-convergent calls.

When such token is present, they have no effect as the underlying
call is non-convergent.
This allows passes like DCE to strip `convergent` attribute from
functions for which all convergent operations have been stripped.
When this happens, a convergence token can still exist in the
call-site, causing the verifier to complain.

Alternatives have been considered in llvm#134863 and llvm#134844.
@s-perron s-perron removed their request for review April 16, 2025 18:34
Keenuts added a commit that referenced this pull request May 2, 2025
Before this commit, having a convergence token on a non-convergent call
was considered to be an error.
This commit relaxes this requirement and allows convergence tokens to be
present on non-convergent calls.

When such token is present, they have no effect as the underlying call
is non-convergent.
This allows passes like DCE to strip `convergent` attribute from
functions for which all convergent operations have been stripped. When
this happens, a convergence token can still exist in the call-site,
causing the verifier to complain.

Alternatives have been considered in #134863 and #134844.
@Keenuts
Copy link
Contributor Author

Keenuts commented May 2, 2025

Closed in favor of #135794

@Keenuts Keenuts closed this May 2, 2025
@github-project-automation github-project-automation bot moved this to Closed in HLSL Support May 2, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Before this commit, having a convergence token on a non-convergent call
was considered to be an error.
This commit relaxes this requirement and allows convergence tokens to be
present on non-convergent calls.

When such token is present, they have no effect as the underlying call
is non-convergent.
This allows passes like DCE to strip `convergent` attribute from
functions for which all convergent operations have been stripped. When
this happens, a convergence token can still exist in the call-site,
causing the verifier to complain.

Alternatives have been considered in llvm#134863 and llvm#134844.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Before this commit, having a convergence token on a non-convergent call
was considered to be an error.
This commit relaxes this requirement and allows convergence tokens to be
present on non-convergent calls.

When such token is present, they have no effect as the underlying call
is non-convergent.
This allows passes like DCE to strip `convergent` attribute from
functions for which all convergent operations have been stripped. When
this happens, a convergence token can still exist in the call-site,
causing the verifier to complain.

Alternatives have been considered in llvm#134863 and llvm#134844.
@damyanp damyanp removed this from HLSL Support Jun 25, 2025
@Keenuts Keenuts deleted the convergent-side-effect branch September 9, 2025 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:SPIR-V clang Clang issues not falling into any other category HLSL HLSL Language Support llvm:ir llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants