Skip to content

Conversation

@s-perron
Copy link
Contributor

Inlining currently assumes that either all function use controled
convergence or none of them do. This is why we need to have the entry
point wrapper use controled convergence.

// FIXME: The check below is redundant and incomplete. According to spec, if a
// convergent call is missing a token, then the caller is using uncontrolled
// convergence. If the callee has an entry intrinsic, then the callee is using
// controlled convergence, and the call cannot be inlined. A proper
// implemenation of this check requires a whole new analysis that identifies
// convergence in every function. For now, we skip that and just do this one
// cursory check. The underlying assumption is that in a compiler flow that
// fully implements convergence control tokens, there is no mixing of
// controlled and uncontrolled convergent operations in the whole program.

@s-perron s-perron requested review from Keenuts and bogner October 17, 2024 18:13
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. HLSL HLSL Language Support labels Oct 17, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 17, 2024

@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-clang

Author: Steven Perron (s-perron)

Changes

Inlining currently assumes that either all function use controled
convergence or none of them do. This is why we need to have the entry
point wrapper use controled convergence.

// FIXME: The check below is redundant and incomplete. According to spec, if a
// convergent call is missing a token, then the caller is using uncontrolled
// convergence. If the callee has an entry intrinsic, then the callee is using
// controlled convergence, and the call cannot be inlined. A proper
// implemenation of this check requires a whole new analysis that identifies
// convergence in every function. For now, we skip that and just do this one
// cursory check. The underlying assumption is that in a compiler flow that
// fully implements convergence control tokens, there is no mixing of
// controlled and uncontrolled convergent operations in the whole program.


Full diff: https://github.com/llvm/llvm-project/pull/112757.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGHLSLRuntime.cpp (+37-4)
  • (modified) clang/lib/CodeGen/CGHLSLRuntime.h (+1)
  • (added) clang/test/CodeGenHLSL/convergence/entry.point.hlsl (+11)
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 3237d93ca31ceb..d006f5d8f3c1cd 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -399,6 +399,17 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
   BasicBlock *BB = BasicBlock::Create(Ctx, "entry", EntryFn);
   IRBuilder<> B(BB);
   llvm::SmallVector<Value *> Args;
+
+  SmallVector<OperandBundleDef, 1> OB;
+  if (CGM.shouldEmitConvergenceTokens()) {
+    assert(EntryFn->isConvergent());
+    llvm::Value *
+        I = B.CreateIntrinsic(llvm::Intrinsic::experimental_convergence_entry, {},
+                              {});
+    llvm::Value *bundleArgs[] = {I};
+    OB.emplace_back("convergencectrl", bundleArgs);
+  }
+
   // FIXME: support struct parameters where semantics are on members.
   // See: https://github.com/llvm/llvm-project/issues/57874
   unsigned SRetOffset = 0;
@@ -414,7 +425,7 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
     Args.push_back(emitInputSemantic(B, *PD, Param.getType()));
   }
 
-  CallInst *CI = B.CreateCall(FunctionCallee(Fn), Args);
+  CallInst *CI = B.CreateCall(FunctionCallee(Fn), Args, OB);
   CI->setCallingConv(Fn->getCallingConv());
   // FIXME: Handle codegen for return type semantics.
   // See: https://github.com/llvm/llvm-project/issues/57875
@@ -469,14 +480,21 @@ void CGHLSLRuntime::generateGlobalCtorDtorCalls() {
   for (auto &F : M.functions()) {
     if (!F.hasFnAttribute("hlsl.shader"))
       continue;
-    IRBuilder<> B(&F.getEntryBlock(), F.getEntryBlock().begin());
+    auto* Token = getConvergenceToken(F.getEntryBlock());
+    Instruction* IP = Token ? Token : &*F.getEntryBlock().begin();
+    IRBuilder<> B(IP);
+    std::vector<OperandBundleDef> OB;
+    if (Token) {
+      llvm::Value *bundleArgs[] = {Token};
+      OB.emplace_back("convergencectrl", bundleArgs);
+    }
     for (auto *Fn : CtorFns)
-      B.CreateCall(FunctionCallee(Fn));
+      B.CreateCall(FunctionCallee(Fn), {}, OB);
 
     // Insert global dtors before the terminator of the last instruction
     B.SetInsertPoint(F.back().getTerminator());
     for (auto *Fn : DtorFns)
-      B.CreateCall(FunctionCallee(Fn));
+      B.CreateCall(FunctionCallee(Fn), {}, OB);
   }
 
   // No need to keep global ctors/dtors for non-lib profile after call to
@@ -489,3 +507,18 @@ void CGHLSLRuntime::generateGlobalCtorDtorCalls() {
       GV->eraseFromParent();
   }
 }
+
+llvm::Instruction *CGHLSLRuntime::getConvergenceToken(BasicBlock &BB) {
+  if (!CGM.shouldEmitConvergenceTokens())
+    return nullptr;
+
+  auto E = BB.end();
+  for(auto I = BB.begin(); I != E; ++I) {
+    auto *II = dyn_cast<llvm::IntrinsicInst>(&*I);
+    if (II && llvm::isConvergenceControlIntrinsic(II->getIntrinsicID())) {
+      return II;
+    }
+  }
+  llvm_unreachable("Convergence token should have been emitted.");
+  return nullptr;
+}
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index f7621ee20b1243..3eb56cd5449704 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -137,6 +137,7 @@ class CGHLSLRuntime {
 
   void emitEntryFunction(const FunctionDecl *FD, llvm::Function *Fn);
   void setHLSLFunctionAttributes(const FunctionDecl *FD, llvm::Function *Fn);
+  llvm::Instruction *getConvergenceToken(llvm::BasicBlock &BB);
 
 private:
   void addBufferResourceAnnotation(llvm::GlobalVariable *GV,
diff --git a/clang/test/CodeGenHLSL/convergence/entry.point.hlsl b/clang/test/CodeGenHLSL/convergence/entry.point.hlsl
new file mode 100644
index 00000000000000..a848c834da3535
--- /dev/null
+++ b/clang/test/CodeGenHLSL/convergence/entry.point.hlsl
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 -triple spirv-pc-vulkan-compute -finclude-default-header -fnative-half-type -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// CHECK-LABEL: define void @main()
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[token:%[0-9]+]] = call token @llvm.experimental.convergence.entry()
+// CHECK-NEXT: call spir_func void @_Z4mainv() [ "convergencectrl"(token [[token]]) ]
+
+[numthreads(1,1,1)]
+void main() {
+}
+

@llvmbot
Copy link
Member

llvmbot commented Oct 17, 2024

@llvm/pr-subscribers-clang-codegen

Author: Steven Perron (s-perron)

Changes

Inlining currently assumes that either all function use controled
convergence or none of them do. This is why we need to have the entry
point wrapper use controled convergence.

// FIXME: The check below is redundant and incomplete. According to spec, if a
// convergent call is missing a token, then the caller is using uncontrolled
// convergence. If the callee has an entry intrinsic, then the callee is using
// controlled convergence, and the call cannot be inlined. A proper
// implemenation of this check requires a whole new analysis that identifies
// convergence in every function. For now, we skip that and just do this one
// cursory check. The underlying assumption is that in a compiler flow that
// fully implements convergence control tokens, there is no mixing of
// controlled and uncontrolled convergent operations in the whole program.


Full diff: https://github.com/llvm/llvm-project/pull/112757.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGHLSLRuntime.cpp (+37-4)
  • (modified) clang/lib/CodeGen/CGHLSLRuntime.h (+1)
  • (added) clang/test/CodeGenHLSL/convergence/entry.point.hlsl (+11)
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 3237d93ca31ceb..d006f5d8f3c1cd 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -399,6 +399,17 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
   BasicBlock *BB = BasicBlock::Create(Ctx, "entry", EntryFn);
   IRBuilder<> B(BB);
   llvm::SmallVector<Value *> Args;
+
+  SmallVector<OperandBundleDef, 1> OB;
+  if (CGM.shouldEmitConvergenceTokens()) {
+    assert(EntryFn->isConvergent());
+    llvm::Value *
+        I = B.CreateIntrinsic(llvm::Intrinsic::experimental_convergence_entry, {},
+                              {});
+    llvm::Value *bundleArgs[] = {I};
+    OB.emplace_back("convergencectrl", bundleArgs);
+  }
+
   // FIXME: support struct parameters where semantics are on members.
   // See: https://github.com/llvm/llvm-project/issues/57874
   unsigned SRetOffset = 0;
@@ -414,7 +425,7 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
     Args.push_back(emitInputSemantic(B, *PD, Param.getType()));
   }
 
-  CallInst *CI = B.CreateCall(FunctionCallee(Fn), Args);
+  CallInst *CI = B.CreateCall(FunctionCallee(Fn), Args, OB);
   CI->setCallingConv(Fn->getCallingConv());
   // FIXME: Handle codegen for return type semantics.
   // See: https://github.com/llvm/llvm-project/issues/57875
@@ -469,14 +480,21 @@ void CGHLSLRuntime::generateGlobalCtorDtorCalls() {
   for (auto &F : M.functions()) {
     if (!F.hasFnAttribute("hlsl.shader"))
       continue;
-    IRBuilder<> B(&F.getEntryBlock(), F.getEntryBlock().begin());
+    auto* Token = getConvergenceToken(F.getEntryBlock());
+    Instruction* IP = Token ? Token : &*F.getEntryBlock().begin();
+    IRBuilder<> B(IP);
+    std::vector<OperandBundleDef> OB;
+    if (Token) {
+      llvm::Value *bundleArgs[] = {Token};
+      OB.emplace_back("convergencectrl", bundleArgs);
+    }
     for (auto *Fn : CtorFns)
-      B.CreateCall(FunctionCallee(Fn));
+      B.CreateCall(FunctionCallee(Fn), {}, OB);
 
     // Insert global dtors before the terminator of the last instruction
     B.SetInsertPoint(F.back().getTerminator());
     for (auto *Fn : DtorFns)
-      B.CreateCall(FunctionCallee(Fn));
+      B.CreateCall(FunctionCallee(Fn), {}, OB);
   }
 
   // No need to keep global ctors/dtors for non-lib profile after call to
@@ -489,3 +507,18 @@ void CGHLSLRuntime::generateGlobalCtorDtorCalls() {
       GV->eraseFromParent();
   }
 }
+
+llvm::Instruction *CGHLSLRuntime::getConvergenceToken(BasicBlock &BB) {
+  if (!CGM.shouldEmitConvergenceTokens())
+    return nullptr;
+
+  auto E = BB.end();
+  for(auto I = BB.begin(); I != E; ++I) {
+    auto *II = dyn_cast<llvm::IntrinsicInst>(&*I);
+    if (II && llvm::isConvergenceControlIntrinsic(II->getIntrinsicID())) {
+      return II;
+    }
+  }
+  llvm_unreachable("Convergence token should have been emitted.");
+  return nullptr;
+}
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index f7621ee20b1243..3eb56cd5449704 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -137,6 +137,7 @@ class CGHLSLRuntime {
 
   void emitEntryFunction(const FunctionDecl *FD, llvm::Function *Fn);
   void setHLSLFunctionAttributes(const FunctionDecl *FD, llvm::Function *Fn);
+  llvm::Instruction *getConvergenceToken(llvm::BasicBlock &BB);
 
 private:
   void addBufferResourceAnnotation(llvm::GlobalVariable *GV,
diff --git a/clang/test/CodeGenHLSL/convergence/entry.point.hlsl b/clang/test/CodeGenHLSL/convergence/entry.point.hlsl
new file mode 100644
index 00000000000000..a848c834da3535
--- /dev/null
+++ b/clang/test/CodeGenHLSL/convergence/entry.point.hlsl
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 -triple spirv-pc-vulkan-compute -finclude-default-header -fnative-half-type -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// CHECK-LABEL: define void @main()
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[token:%[0-9]+]] = call token @llvm.experimental.convergence.entry()
+// CHECK-NEXT: call spir_func void @_Z4mainv() [ "convergencectrl"(token [[token]]) ]
+
+[numthreads(1,1,1)]
+void main() {
+}
+

Comment on lines 483 to 494
auto* Token = getConvergenceToken(F.getEntryBlock());
Instruction* IP = Token ? Token : &*F.getEntryBlock().begin();
IRBuilder<> B(IP);
std::vector<OperandBundleDef> OB;
if (Token) {
llvm::Value *bundleArgs[] = {Token};
OB.emplace_back("convergencectrl", bundleArgs);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot test this code yet. Any path that I know of that would exercise this code will fail because we cannot translate HLSL types to SPIR-V types yet.

However, I do have an initial implementation for that in https://github.com/s-perron/llvm-project/tree/spirv_hlsl_type which showes this works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this is merged I will open up a PR for that branch.

@github-actions
Copy link

github-actions bot commented Oct 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@s-perron s-perron marked this pull request as draft October 17, 2024 19:50
@s-perron s-perron marked this pull request as ready for review October 17, 2024 20:40
@s-perron s-perron self-assigned this Oct 18, 2024
Copy link
Member

@farzonl farzonl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks correct. Issues found were minor so LGTM.

Inlining currently assumes that either all function use controled
convergence or none of them do. This is why we need to have the entry
point wrapper use controled convergence.

https://github.com/llvm/llvm-project/blob/c85611e8583e6392d56075ebdfa60893b6284813/llvm/lib/Transforms/Utils/InlineFunction.cpp#L2431-L2439
@s-perron s-perron merged commit 98e3075 into llvm:main Oct 28, 2024
8 checks passed
@s-perron s-perron deleted the main_convergence branch October 28, 2024 17:25
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
)

Inlining currently assumes that either all function use controled
convergence or none of them do. This is why we need to have the entry
point wrapper use controled convergence.


https://github.com/llvm/llvm-project/blob/c85611e8583e6392d56075ebdfa60893b6284813/llvm/lib/Transforms/Utils/InlineFunction.cpp#L2431-L2439
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category HLSL HLSL Language Support

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

4 participants