Skip to content

Conversation

@clementval
Copy link
Contributor

No description provided.

@llvmbot llvmbot added flang:runtime flang Flang issues not falling into any other category labels Oct 28, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 28, 2024

@llvm/pr-subscribers-flang-runtime

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/113958.diff

2 Files Affected:

  • (modified) flang/include/flang/Runtime/CUDA/kernel.h (+7-1)
  • (modified) flang/runtime/CUDA/kernel.cpp (+24-1)
diff --git a/flang/include/flang/Runtime/CUDA/kernel.h b/flang/include/flang/Runtime/CUDA/kernel.h
index cf07d874a082c0..85afda09e347ae 100644
--- a/flang/include/flang/Runtime/CUDA/kernel.h
+++ b/flang/include/flang/Runtime/CUDA/kernel.h
@@ -15,13 +15,19 @@
 
 extern "C" {
 
-// This function uses intptr_t instead of CUDA's unsigned int to match
+// These functions use intptr_t instead of CUDA's unsigned int to match
 // the type of MLIR's index type. This avoids the need for casts in the
 // generated MLIR code.
+
 void RTDEF(CUFLaunchKernel)(const void *kernelName, intptr_t gridX,
     intptr_t gridY, intptr_t gridZ, intptr_t blockX, intptr_t blockY,
     intptr_t blockZ, int32_t smem, void **params, void **extra);
 
+void RTDEF(CUFLaunchClusterKernel)(const void *kernelName, intptr_t clusterX,
+    intptr_t clusterY, intptr_t clusterZ, intptr_t gridX, intptr_t gridY,
+    intptr_t gridZ, intptr_t blockX, intptr_t blockY, intptr_t blockZ,
+    int32_t smem, void **params, void **extra);
+
 } // extern "C"
 
 #endif // FORTRAN_RUNTIME_CUDA_KERNEL_H_
diff --git a/flang/runtime/CUDA/kernel.cpp b/flang/runtime/CUDA/kernel.cpp
index f81153a1af4bc7..abb7ebb72e5923 100644
--- a/flang/runtime/CUDA/kernel.cpp
+++ b/flang/runtime/CUDA/kernel.cpp
@@ -25,9 +25,32 @@ void RTDEF(CUFLaunchKernel)(const void *kernel, intptr_t gridX, intptr_t gridY,
   blockDim.x = blockX;
   blockDim.y = blockY;
   blockDim.z = blockZ;
-  cudaStream_t stream = 0;
+  cudaStream_t stream = 0; // TODO stream managment
   CUDA_REPORT_IF_ERROR(
       cudaLaunchKernel(kernel, gridDim, blockDim, params, smem, stream));
 }
 
+void RTDEF(CUFLaunchClusterKernel)(const void *kernel, intptr_t clusterX,
+    intptr_t clusterY, intptr_t clusterZ, intptr_t gridX, intptr_t gridY,
+    intptr_t gridZ, intptr_t blockX, intptr_t blockY, intptr_t blockZ,
+    int32_t smem, void **params, void **extra) {
+  cudaLaunchConfig_t config;
+  config.gridDim.x = gridX;
+  config.gridDim.y = gridY;
+  config.gridDim.z = gridZ;
+  config.blockDim.x = blockX;
+  config.blockDim.y = blockY;
+  config.blockDim.z = blockZ;
+  config.dynamicSmemBytes = smem;
+  config.stream = 0; // TODO stream managment
+  cudaLaunchAttribute launchAttr[1];
+  launchAttr[0].id = cudaLaunchAttributeClusterDimension;
+  launchAttr[0].val.clusterDim.x = clusterX;
+  launchAttr[0].val.clusterDim.y = clusterY;
+  launchAttr[0].val.clusterDim.z = clusterZ;
+  config.numAttrs = 1;
+  config.attrs = launchAttr;
+  CUDA_REPORT_IF_ERROR(cudaLaunchKernelExC(&config, kernel, params));
+}
+
 } // extern "C"

@clementval clementval merged commit 0b700f2 into main Oct 29, 2024
11 checks passed
@clementval clementval deleted the users/clementval/cuf_cluster_rt branch October 29, 2024 17:01
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

flang:runtime flang Flang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants