@khaki3
Contributor

@khaki3 khaki3 commented Dec 23, 2024

We set the gridX argument of _FortranACUFLaunchKernel to -1 when * is passed as the grid parameter, and we store that value in one of the dim3 members. However, dim3 members are unsigned, so the stored -1 wraps around to a large positive value and the positive-value checks we use later, such as gridDim.x > 0, always succeed and are therefore invalid. This PR uses the original grid-size arguments (gridX, gridY, gridZ) in those checks when computing the number of blocks.
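For illustration, here is a minimal host-only sketch of the wraparound. It does not use the CUDA headers: Dim3Like, makeGridDim, storedCheck, originalCheck and demoWrappedSentinel are hypothetical names made up for this example, and Dim3Like only assumes what is stated above, namely that the members are unsigned.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for dim3; the only property relied on here is that
// its members are unsigned.
struct Dim3Like {
  unsigned int x, y, z;
};

// Mimics storing the signed launch arguments into the unsigned members.
inline Dim3Like makeGridDim(std::intptr_t gridX, std::intptr_t gridY,
                            std::intptr_t gridZ) {
  return Dim3Like{static_cast<unsigned int>(gridX),
                  static_cast<unsigned int>(gridY),
                  static_cast<unsigned int>(gridZ)};
}

// Check on the stored member: also true for a wrapped -1.
inline bool storedCheck(unsigned int member) { return member > 0; }

// Check on the original signed argument: false for the -1 sentinel.
inline bool originalCheck(std::intptr_t arg) { return arg > 0; }

// Small self-check that can be called from main().
inline void demoWrappedSentinel() {
  const std::intptr_t gridX = -1; // sentinel used when * is passed
  const Dim3Like gridDim = makeGridDim(gridX, 1, 1);
  // The -1 wraps to the largest unsigned value...
  assert(gridDim.x == std::numeric_limits<unsigned int>::max());
  // ...so the check on the stored member passes,
  assert(storedCheck(gridDim.x));
  // while the check on the original argument correctly rejects it.
  assert(!originalCheck(gridX));
}
```

With an ordinary positive grid size both checks agree, so switching the condition to the original argument only changes behavior for the -1 sentinel.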

@khaki3 khaki3 requested a review from clementval December 23, 2024 21:09
@llvmbot llvmbot added flang:runtime flang Flang issues not falling into any other category labels Dec 23, 2024
@llvmbot
Member

llvmbot commented Dec 23, 2024

@llvm/pr-subscribers-flang-runtime

Author: None (khaki3)

Changes

We set the gridX argument of _FortranACUFLaunchKernel to -1 when * is passed as the grid parameter, and we store that value in one of the dim3 members. However, dim3 members are unsigned, so the stored -1 wraps around to a large positive value and the positive-value checks we use later, such as gridDim.x > 0, always succeed and are therefore invalid. This PR uses the original grid-size arguments (gridX, gridY, gridZ) in those checks when computing the number of blocks.


Full diff: https://github.com/llvm/llvm-project/pull/121000.diff

1 Files Affected:

  • (modified) flang/runtime/CUDA/kernel.cpp (+6-6)
diff --git a/flang/runtime/CUDA/kernel.cpp b/flang/runtime/CUDA/kernel.cpp
index 88cdf3cf426229..bdc04ccb17672b 100644
--- a/flang/runtime/CUDA/kernel.cpp
+++ b/flang/runtime/CUDA/kernel.cpp
@@ -48,13 +48,13 @@ void RTDEF(CUFLaunchKernel)(const void *kernel, intptr_t gridX, intptr_t gridY,
       maxBlocks = multiProcCount * maxBlocks;
     }
     if (maxBlocks > 0) {
-      if (gridDim.x > 0) {
+      if (gridX > 0) {
         maxBlocks = maxBlocks / gridDim.x;
       }
-      if (gridDim.y > 0) {
+      if (gridY > 0) {
         maxBlocks = maxBlocks / gridDim.y;
       }
-      if (gridDim.z > 0) {
+      if (gridZ > 0) {
         maxBlocks = maxBlocks / gridDim.z;
       }
       if (maxBlocks < 1) {
@@ -113,13 +113,13 @@ void RTDEF(CUFLaunchClusterKernel)(const void *kernel, intptr_t clusterX,
       maxBlocks = multiProcCount * maxBlocks;
     }
     if (maxBlocks > 0) {
-      if (config.gridDim.x > 0) {
+      if (gridX > 0) {
         maxBlocks = maxBlocks / config.gridDim.x;
       }
-      if (config.gridDim.y > 0) {
+      if (gridY > 0) {
         maxBlocks = maxBlocks / config.gridDim.y;
       }
-      if (config.gridDim.z > 0) {
+      if (gridZ > 0) {
         maxBlocks = maxBlocks / config.gridDim.z;
       }
       if (maxBlocks < 1) {

Contributor

@clementval clementval left a comment

LGTM! Thanks for catching this!

@khaki3 khaki3 merged commit 7d166fa into llvm:main Dec 24, 2024
11 checks passed