[mlir][gpu] Add verification to disallow nested `gpu.launch` ops #151968

CoTinker · 2025-08-04T13:53:09Z

This PR adds a verification check in `LaunchOp::verify()` to disallow nested `gpu.launch` operations. Nested `gpu.launch` is currently unsupported and can lead to undefined or unintended behavior during lowering. This change ensures that such cases are caught early during IR verification. Fixes #149318.

llvmbot · 2025-08-04T13:53:44Z

@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir

Author: Longsheng Mou (CoTinker)

Changes

This PR adds a verification check in LaunchOp::verify() to disallow nested gpu.launch operations. Nested gpu.launch is currently unsupported and can lead to undefined or unintended behavior during lowering. This change ensures that such cases are caught early during IR verification. Fixes #149318.

Full diff: https://github.com/llvm/llvm-project/pull/151968.diff

2 Files Affected:

(modified) mlir/lib/Dialect/GPU/IR/GPUDialect.cpp (+3)
(modified) mlir/test/Dialect/GPU/invalid.mlir (+15)

diff --git a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
index 5a72ef17db7f0..d6438d355fec1 100644
--- a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+++ b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
@@ -866,6 +866,9 @@ LogicalResult LaunchOp::verify() {
   if (!(hasClusterSize()) &&
       (getClusterSizeX() || getClusterSizeY() || getClusterSizeZ()))
     return emitOpError() << "cluster size must be all present";
+
+  if (getOperation()->getParentOfType<LaunchOp>())
+    return emitOpError() << "not support nested launches";
   return success();
 }
 
diff --git a/mlir/test/Dialect/GPU/invalid.mlir b/mlir/test/Dialect/GPU/invalid.mlir
index 35381dab7b200..4606dabb59cbe 100644
--- a/mlir/test/Dialect/GPU/invalid.mlir
+++ b/mlir/test/Dialect/GPU/invalid.mlir
@@ -35,6 +35,21 @@ func.func @launch_requires_gpu_return(%sz : index) {
 
 // -----
 
+func.func @nested_launches(%sz : index) {
+  gpu.launch blocks(%bx, %by, %bz) in (%sbx = %sz, %sby = %sz, %sbz = %sz)
+             threads(%tx, %ty, %tz) in (%stx = %sz, %sty = %sz, %stz = %sz) {
+    // @expected-error@+1 {{'gpu.launch' op not support nested launches}}
+    gpu.launch blocks(%bx1, %by1, %bz1) in (%sbx1 = %sz, %sby1 = %sz, %sbz1 = %sz)
+               threads(%tx1, %ty1, %tz1) in (%stx1 = %sz, %sty1 = %sz, %stz1 = %sz) {
+      gpu.terminator
+    }
+    gpu.terminator
+  }
+  return
+}
+
+// -----
+
 func.func @launch_func_too_few_operands(%sz : index) {
   // expected-error@+1 {{expected 6 or more operands}}
   "gpu.launch_func"(%sz, %sz, %sz, %sz, %sz)

joker-eph · 2025-08-04T13:54:50Z

I don't believe this should be a verifier error, because that does not compose with inlining. Making this part of the verifier would mean that the inliner could create invalid IR, with no way to prevent it.

Instead we should catch this in gpu-kernel-outlining and error out appropriately.
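
The composition concern can be illustrated with a small, self-contained C++ model. This is only a sketch under stated assumptions and does not use the MLIR API: `Op`, `makeOp`, `cloneOp`, `hasNestedLaunch`, `inlineCalls`, and `runDemo` are hypothetical names introduced for illustration. It models a function that calls a helper from inside a launch; both functions pass a "no nested launch" rule on their own, but the result of inlining does not.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR operation. This is NOT the MLIR API.
struct Op {
  std::string kind;    // e.g. "func.func", "func.call", "gpu.launch"
  std::string symbol;  // function name for "func.func", callee for "func.call"
  std::vector<std::unique_ptr<Op>> body;  // ops nested in this op's region
};

std::unique_ptr<Op> makeOp(std::string kind, std::string symbol = "") {
  auto op = std::make_unique<Op>();
  op->kind = std::move(kind);
  op->symbol = std::move(symbol);
  return op;
}

// Deep-copies an op together with everything nested inside it.
std::unique_ptr<Op> cloneOp(const Op &op) {
  auto copy = makeOp(op.kind, op.symbol);
  for (const auto &child : op.body)
    copy->body.push_back(cloneOp(*child));
  return copy;
}

// Models the rule proposed in the PR: a launch may not sit inside a launch.
bool hasNestedLaunch(const Op &op, bool insideLaunch = false) {
  bool isLaunch = op.kind == "gpu.launch";
  if (isLaunch && insideLaunch)
    return true;
  for (const auto &child : op.body)
    if (hasNestedLaunch(*child, insideLaunch || isLaunch))
      return true;
  return false;
}

// Replaces every call to `callee` found under `op` with a copy of its body.
void inlineCalls(Op &op, const Op &callee) {
  assert(callee.kind == "func.func" && "callee must be a function");
  std::vector<std::unique_ptr<Op>> newBody;
  for (auto &child : op.body) {
    if (child->kind == "func.call" && child->symbol == callee.symbol) {
      for (const auto &inner : callee.body)
        newBody.push_back(cloneOp(*inner));
    } else {
      inlineCalls(*child, callee);
      newBody.push_back(std::move(child));
    }
  }
  op.body = std::move(newBody);
}

struct Demo {
  bool validBefore;  // both functions satisfy the rule before inlining
  bool validAfter;   // the caller satisfies the rule after inlining
};

Demo runDemo() {
  // @helper contains a single launch.
  auto helper = makeOp("func.func", "helper");
  helper->body.push_back(makeOp("gpu.launch"));

  // @caller invokes @helper from inside its own launch.
  auto caller = makeOp("func.func", "caller");
  auto launch = makeOp("gpu.launch");
  launch->body.push_back(makeOp("func.call", "helper"));
  caller->body.push_back(std::move(launch));

  bool before = !hasNestedLaunch(*caller) && !hasNestedLaunch(*helper);
  inlineCalls(*caller, *helper);
  bool after = !hasNestedLaunch(*caller);
  return {before, after};
}
```

In this model, `runDemo()` reports that the rule holds before inlining and is violated afterwards, even though the inlining step itself did nothing unusual. A rule enforced by a specific pass would not have this problem, since the IR stays valid and the pass decides what it can handle.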

CoTinker · 2025-08-04T13:58:57Z

> I don't believe this should be a verifier error, because that does not compose with inlining. Making this part of the verifier would mean that the inliner could create invalid IR, with no way to prevent it.
>
> Instead we should catch this in gpu-kernel-outlining and error out appropriately.

Okay, I will submit a new PR to fix this issue.

grypp · 2025-08-04T16:32:52Z

Nested gpu.launch is valid IR, and we should not report an error in the target‑independent gpu dialect or during gpu-kernel-outlining.

For example, CUDA supports nested kernel launches. The feature is called dynamic parallelism, and it allows launching a kernel from within another kernel. A lowering for this could be implemented today.

However, since we currently don’t have such a lowering, we could emit an error in the gpu-to-nvvm pass (NVIDIA‑specific). Other vendors can add similar lowering or diagnostics in their respective passes.
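
A pass-level diagnostic of this kind can be sketched with a toy model. The code below is an assumption-laden illustration, not the `gpu-to-nvvm` pass or any MLIR API: `Node` and `lowerForTarget` are hypothetical names. The point is that the IR itself is accepted as valid, and only the target-specific step declines to lower a launch nested in another launch.

```cpp
#include <string>
#include <vector>

// Toy op tree; a stand-in for real IR, NOT the MLIR API.
struct Node {
  std::string kind;           // e.g. "func.func" or "gpu.launch"
  std::vector<Node> regions;  // ops nested inside this op
};

// Hypothetical target-specific lowering step. It returns an empty string on
// success, or a diagnostic message when it finds a launch nested in a launch.
std::string lowerForTarget(const Node &node, bool insideLaunch = false) {
  bool isLaunch = node.kind == "gpu.launch";
  if (isLaunch && insideLaunch)
    return "nested gpu.launch is not supported by this lowering";
  for (const Node &child : node.regions) {
    std::string diag = lowerForTarget(child, insideLaunch || isLaunch);
    if (!diag.empty())
      return diag;
  }
  return "";
}
```

Another target could supply its own version of this step that lowers the nested case instead of rejecting it, without any change to what counts as valid IR.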

CoTinker · 2025-08-04T22:57:53Z

> For example, CUDA supports nested kernel launches. The feature is called dynamic parallelism, and it allows launching a kernel from within another kernel. A lowering for this could be implemented today.

Thanks for your reply. That means we should support lowering nested gpu.launch in gpu-kernel-outlining. I'll implement it.

CoTinker requested review from grypp, joker-eph, krzysz00 and kuhar August 4, 2025 13:53

llvmbot added mlir:gpu mlir labels Aug 4, 2025

CoTinker closed this Aug 4, 2025

CoTinker deleted the nested_launch branch August 8, 2025 03:44

CoTinker mentioned this pull request Aug 8, 2025

[mlir][gpu] Support outlining nested gpu.launch #152696

Merged
