@sohaibiftikhar sohaibiftikhar commented Dec 3, 2025

Without this change, the result mapping fails when the op is written without a result.

See: #169922 (comment)

To reproduce the error on main:

mkdir -p build && cd build
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DLLVM_TARGETS_TO_BUILD="host;NVPTX" \
  -DMLIR_ENABLE_CUDA_RUNNER=ON \
  -DMLIR_RUN_CUDA_TENSOR_CORE_TESTS=ON \
  -DMLIR_RUN_CUDA_SM90_TESTS=ON \
  -DMLIR_GPU_COMPILATION_TEST_FORMAT=fatbin \
  -DMLIR_INCLUDE_INTEGRATION_TESTS=ON \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_INSTALL_UTILS=ON \
  -DCMAKE_BUILD_TYPE=Release
ninja

bin/mlir-opt ../mlir/test/Integration/GPU/CUDA/sm90/tma_load_128x64_swizzle128b.mlir -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_90 cubin-features=+ptx80 opt-level=3"

llvmbot commented Dec 3, 2025

@llvm/pr-subscribers-mlir

Author: Sohaib Iftikhar (sohaibiftikhar)

Changes

Without this change, the result mapping fails when the op is written without a result.


Full diff: https://github.com/llvm/llvm-project/pull/170545.diff

1 Files Affected:

  • (modified) mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td (+2-5)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index a96d65d3fcacd..cb83ec23bc76e 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -948,8 +948,7 @@ def NVVM_MBarrierArriveExpectTxOp : NVVM_PTXBuilder_Op<"mbarrier.arrive.expect_t
     auto [id, args] = NVVM::MBarrierArriveExpectTxOp::getIntrinsicIDAndArgs(
                       *op, moduleTranslation, builder);
 
-    int addrSpace = llvm::cast<LLVMPointerType>(op.getAddr().getType()).getAddressSpace();
-    if (addrSpace != NVVM::NVVMMemorySpace::SharedCluster)
+    if (op.getNumResults() > 0)
       $res = createIntrinsicCall(builder, id, args);
     else
       createIntrinsicCall(builder, id, args);
@@ -985,9 +984,7 @@ def NVVM_MBarrierArriveDropExpectTxOp : NVVM_Op<"mbarrier.arrive_drop.expect_tx"
   string llvmBuilder = [{
     auto [id, args] = NVVM::MBarrierArriveDropExpectTxOp::getIntrinsicIDAndArgs(
                       *op, moduleTranslation, builder);
-
-    int addrSpace = llvm::cast<LLVMPointerType>(op.getAddr().getType()).getAddressSpace();
-    if (addrSpace != NVVM::NVVMMemorySpace::SharedCluster)
+    if (op.getNumResults() > 0)
       $res = createIntrinsicCall(builder, id, args);
     else
       createIntrinsicCall(builder, id, args);
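
As I read the diff above, the old guard inferred "this op has a result" from the pointer's address space, while the new guard asks the op directly. The following is only a stand-alone sketch of that difference, not the MLIR implementation: `FakeOp`, `FakeTranslation`, `fakeIntrinsicCall`, both `lowerWith...` functions, and the numeric address-space values are hypothetical names and assumptions made up for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for illustration; these are not MLIR types.
constexpr int kShared = 3;        // assumed numeric value
constexpr int kSharedCluster = 7; // assumed numeric value

struct FakeOp {
  int addrSpace;  // address space of the mbarrier pointer operand
  int numResults; // 0 when the op is written without a result
};

struct FakeTranslation {
  std::optional<int> mappedResult;

  // Models `$res = ...`: there must be an op result to bind the value to.
  bool mapResult(const FakeOp &op, int value) {
    if (op.numResults == 0)
      return false; // nothing to map: this models the failure
    mappedResult = value;
    return true;
  }
};

// Stand-in for createIntrinsicCall; the returned value is arbitrary.
inline int fakeIntrinsicCall() { return 42; }

// Guard before the change: infer "has a result" from the address space.
inline bool lowerWithAddrSpaceGuard(const FakeOp &op, FakeTranslation &t) {
  if (op.addrSpace != kSharedCluster)
    return t.mapResult(op, fakeIntrinsicCall());
  fakeIntrinsicCall();
  return true;
}

// Guard after the change: ask the op whether it has a result.
inline bool lowerWithResultGuard(const FakeOp &op, FakeTranslation &t) {
  if (op.numResults > 0)
    return t.mapResult(op, fakeIntrinsicCall());
  fakeIntrinsicCall();
  return true;
}
```

In this model, an op in the non-cluster shared space with `numResults == 0` makes `lowerWithAddrSpaceGuard` return `false`, while `lowerWithResultGuard` succeeds for it. Both guards behave the same when a result is present.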

llvmbot commented Dec 3, 2025

@llvm/pr-subscribers-mlir-llvm

@durga4github durga4github left a comment

Thank you for the fix, LGTM.

I will try this out on my end too.

@sohaibiftikhar sohaibiftikhar merged commit dd6e87b into llvm:main Dec 4, 2025
13 checks passed
@sohaibiftikhar sohaibiftikhar deleted the r169922 branch December 4, 2025 11:31
kcloudy0717 pushed a commit to kcloudy0717/llvm-project that referenced this pull request Dec 4, 2025