@Hardcode84

  • Use gpu.launch_func cluster dims args during codegen.
  • Update the GPUToGPURuntime.cpp pass and the runtime.
  • Add lit and e2e tests.

Signed-off-by: Ivan Butygin <[email protected]>

Copilot AI left a comment

Pull request overview

This PR adds cluster launch support to the water backend and runtime, so that kernels can be launched with cluster dimensions on AMD GPUs.

Changes:

  • Extended the wave_launch_kernel function signature to accept cluster dimension parameters (cluster_x, cluster_y, cluster_z)
  • Implemented conditional cluster launch logic using hipDrvLaunchKernelEx when cluster dimensions are specified, falling back to regular hipModuleLaunchKernel otherwise
  • Updated the MLIR GPU-to-GPU-Runtime pass to propagate cluster size attributes from gpu.launch_func operations to the runtime
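
The conditional launch described above can be illustrated with a small sketch. It is written in Python for brevity and only models the decision; the real runtime is C++ and its code is not shown here. The helper names `cluster_dims_or_default` and `select_launch_path` are hypothetical, and treating "specified" as "any dimension greater than zero" is an assumption, not something the PR confirms.

```python
from typing import Optional, Tuple

Dims = Tuple[int, int, int]


def cluster_dims_or_default(cluster_size: Optional[Dims]) -> Dims:
    """Return the cluster dims to hand to the runtime.

    Models the pass behaviour described in the summary: when the launch
    op carries no cluster size, the dimensions default to 0.
    """
    return cluster_size if cluster_size is not None else (0, 0, 0)


def select_launch_path(cluster: Dims) -> str:
    """Name the HIP entry point the runtime would be expected to use.

    Assumption: a cluster launch is requested when any dimension is > 0.
    """
    if any(d < 0 for d in cluster):
        raise ValueError(f"cluster dimensions must be non-negative: {cluster}")
    if any(d > 0 for d in cluster):
        return "hipDrvLaunchKernelEx"
    return "hipModuleLaunchKernel"


if __name__ == "__main__":
    # No cluster size on the op: defaults to zeros, regular launch.
    print(select_launch_path(cluster_dims_or_default(None)))
    # Cluster size present: cluster launch path.
    print(select_launch_path(cluster_dims_or_default((2, 1, 1))))
```

Using 0 as the "not set" value keeps the wave_launch_kernel signature fixed for both cases, so callers that never use clusters pass zeros and take the regular launch path.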

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| wave_lang/kernel/wave/execution_engine/wave_hip_runtime.h | Added cluster dimension parameters (cluster_x, cluster_y, cluster_z) to the wave_launch_kernel function signature |
| wave_lang/kernel/wave/execution_engine/wave_hip_runtime.cpp | Implemented cluster launch logic using hipDrvLaunchKernelEx with cluster attributes, including optional loading of the function pointer and fallback to regular launch when cluster dimensions are not specified |
| wave_lang/kernel/compiler/wave_codegen/emitter.py | Added cluster_size generation from hardware constraints and passed it to gpu_d.launch_func; removed unused async_dependencies parameter |
| water/lib/Transforms/GPUToGPURuntime.cpp | Extended wave_launch_kernel function signature with cluster dimension parameters; added logic to extract and pass cluster dimensions (defaulting to 0) from gpu.launch_func operations |
| water/test/Transforms/gpu-to-gpu-runtime.mlir | Added test case for cluster launch and updated existing test expectations to account for new cluster dimension parameters |
| tests/kernel/wave_gemm_test.py | Added use_water_backend parameter to testTensorLoadToShared to enable testing with water backend |
| lit_tests/kernel/wave/water_host_wrapper.py | Added test_cluster_dims test case to verify cluster dimensions are properly emitted in gpu.launch_func |
