Add cluster launch support to water backend and runtime #795
base: main
Signed-off-by: Ivan Butygin <[email protected]>
Pull request overview
This PR adds cluster launch support to the water backend and runtime, so that GPU kernels can be launched with cluster dimensions on AMD GPUs.
Changes:
- Extended the `wave_launch_kernel` function signature to accept cluster dimension parameters (`cluster_x`, `cluster_y`, `cluster_z`)
- Implemented conditional cluster launch logic using `hipDrvLaunchKernelEx` when cluster dimensions are specified, falling back to regular `hipModuleLaunchKernel` otherwise
- Updated the MLIR GPU-to-GPU-Runtime pass to propagate cluster size attributes from `gpu.launch_func` operations to the runtime
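
The conditional launch described in the list above can be pictured with a small Python sketch. This is not the PR's C++ code: `launch_kernel`, `launch_regular`, and `launch_with_cluster` are hypothetical stand-ins for the runtime entry point and the two HIP launch paths, and treating all-zero cluster dimensions as "not specified" is an assumption based on the default of 0 mentioned in the review. Raising an error when the extended launch path is unavailable is likewise an assumption.

```python
from typing import Callable, Optional, Tuple

Dim3 = Tuple[int, int, int]


def launch_kernel(
    grid: Dim3,
    block: Dim3,
    cluster: Dim3,
    launch_regular: Callable[[Dim3, Dim3], None],
    launch_with_cluster: Optional[Callable[[Dim3, Dim3, Dim3], None]],
) -> str:
    """Pick a launch path based on the cluster dimensions (hypothetical model)."""
    # Assumption: (0, 0, 0) means "no cluster requested".
    if any(c > 0 for c in cluster):
        # The extended launch is modelled as optionally available,
        # mirroring an optionally loaded function pointer.
        if launch_with_cluster is None:
            raise RuntimeError("cluster launch requested but not available")
        launch_with_cluster(grid, block, cluster)
        return "cluster"
    # Fallback: regular launch without cluster attributes.
    launch_regular(grid, block)
    return "regular"
```

Passing the two launch paths in as callables keeps the dispatch decision testable without a GPU; the real runtime would call the HIP functions directly.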
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| wave_lang/kernel/wave/execution_engine/wave_hip_runtime.h | Added cluster dimension parameters (cluster_x, cluster_y, cluster_z) to the wave_launch_kernel function signature |
| wave_lang/kernel/wave/execution_engine/wave_hip_runtime.cpp | Implemented cluster launch logic using hipDrvLaunchKernelEx with cluster attributes, including optional loading of the function pointer and fallback to regular launch when cluster dimensions are not specified |
| wave_lang/kernel/compiler/wave_codegen/emitter.py | Added cluster_size generation from hardware constraints and passed it to gpu_d.launch_func, removed unused async_dependencies parameter |
| water/lib/Transforms/GPUToGPURuntime.cpp | Extended wave_launch_kernel function signature with cluster dimension parameters, added logic to extract and pass cluster dimensions (defaulting to 0) from gpu.launch_func operations |
| water/test/Transforms/gpu-to-gpu-runtime.mlir | Added test case for cluster launch and updated existing test expectations to account for new cluster dimension parameters |
| tests/kernel/wave_gemm_test.py | Added use_water_backend parameter to testTensorLoadToShared to enable testing with water backend |
| lit_tests/kernel/wave/water_host_wrapper.py | Added test_cluster_dims test case to verify cluster dimensions are properly emitted in gpu.launch_func |
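
The table notes that the pass passes cluster dimensions "defaulting to 0" when a `gpu.launch_func` operation carries none. A minimal Python sketch of that defaulting rule follows; `cluster_args` is a hypothetical helper, not a function from the PR, and representing a missing cluster size as `None` is an assumption made for illustration.

```python
from typing import Optional, Sequence, Tuple


def cluster_args(cluster_size: Optional[Sequence[int]]) -> Tuple[int, int, int]:
    """Return (cluster_x, cluster_y, cluster_z) for the runtime call.

    Assumption: a launch without a cluster size is modelled as None and
    maps to zeros, which the runtime treats as "no cluster launch".
    """
    if cluster_size is None:
        return (0, 0, 0)
    if len(cluster_size) != 3:
        raise ValueError("cluster size must have exactly three dimensions")
    x, y, z = (int(d) for d in cluster_size)
    return (x, y, z)
```

With this convention, existing launches that specify no cluster size keep their behavior, since the three extra arguments are all zero.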
- `gpu.launch_func` `cluster_dims` args during codegen.
- `GPUToGPURuntime.cpp` pass and runtime.