[CK_TILE] Restructure Tile Engine's benchmarking and profiling#4769
Pull request overview
This PR restructures the Tile Engine's benchmarking and profiling infrastructure to reduce code duplication and simplify the addition of new operations. The refactoring introduces a base class architecture with common utilities extracted into shared files, while maintaining backward compatibility with existing tests.
Changes:
- Introduced base profiler class (GemmProfiler) and common utility files to eliminate duplicated code across GEMM variants
- Restructured GEMM universal, preshuffle, and multi_d implementations to inherit from the new base classes
- Consolidated Python benchmarking utilities into a shared module accessible by all GEMM variants
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| gemm_universal_profiler.hpp | New universal GEMM profiler inheriting from base GemmProfiler class |
| gemm_profiler.hpp | New base profiler template class providing common profiling functionality |
| common/utils.hpp | Common utility functions and structs (Metric, PerformanceResult, KernelInstance, Setting) |
| gemm_benchmark.hpp | Base problem definition and comparison utilities for GEMM operations |
| gemm_common.hpp | Shared kernel traits and argument parser creation |
| common/benchmark_utils.py | Consolidated Python utilities for running kernels and exporting results |
| gemm_benchmark.py | Base Python class for GEMM benchmarking with kernel discovery and execution |
    template <typename Layout>
    constexpr auto is_row_major(Layout)
    {
        return ck_tile::bool_constant<std::is_same_v<Layout, ck_tile::tensor_layout::gemm::RowMajor>>{};
    }
Can we just return the boolean value of std::is_same_v instead of wrapping it inside a bool_constant?
Other commonly used functions that call is_row_major, such as get_default_stride, expect the bool_constant return type.
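To illustrate the point in this reply, here is a minimal sketch of why a bool_constant return type can matter to callers. It uses std::bool_constant and empty tag structs as stand-ins for the ck_tile types, and default_stride is a hypothetical helper written for this example, not the actual get_default_stride from the codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-ins for the ck_tile layout tags (assumptions for illustration only).
struct RowMajor {};
struct ColumnMajor {};

// Same shape as the function under review, with std::bool_constant
// substituted for ck_tile::bool_constant.
template <typename Layout>
constexpr auto is_row_major(Layout)
{
    return std::bool_constant<std::is_same_v<Layout, RowMajor>>{};
}

// Hypothetical caller: the layout flag arrives encoded in the argument's type,
// so IsRowMajor is deduced and the branch is resolved at compile time.
// A plain `bool` argument could not be deduced as a template parameter.
template <bool IsRowMajor>
constexpr std::size_t default_stride(std::size_t rows,
                                     std::size_t cols,
                                     std::size_t stride,
                                     std::bool_constant<IsRowMajor>)
{
    if(stride != 0)
        return stride; // an explicit stride always wins
    if constexpr(IsRowMajor)
        return cols; // row-major: consecutive rows are `cols` elements apart
    else
        return rows; // column-major: consecutive columns are `rows` elements apart
}

static_assert(default_stride(4, 8, 0, is_row_major(RowMajor{})) == 8);
static_assert(default_stride(4, 8, 0, is_row_major(ColumnMajor{})) == 4);
```

Returning a plain bool would still work for callers that only branch at run time, but any caller that deduces the flag from the argument type would need to change.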
This change restructures the Benchmark structs into 3 files. It adds a base class for all GEMM benchmarks and derived classes for Universal GEMM, Multi-D GEMM, and GEMM preshuffle. Common functions have been relocated into a common directory. A derived class only needs to redefine the constructor, which significantly reduces duplicated code.

Restructure Tile Engine's profiling process

This change restructures the profiling process in Tile Engine around base structs for the Profiler and Problem structs. With this, all files needed for Tile Engine will have a base struct, and the files in the gemm/ directory can be extended for each GEMM variant. Only the Problem and Profiler structs, along with the reference functions, need to be defined. Profiling functions that are common to each operation have been moved into a common utility file.

Add README back into the gemm directory and integrate new preshuffle functions

Disable the gemm tile engine tests and update the preshuffle example to match the new tensor_shuffle_utils interface
Motivation
This PR restructures the benchmarking and profiling parts of CK Tile's Tile Engine, building on the groundwork from the previous PR ROCm/composable_kernel#3434 and outlined in this design document. In PR 3434, to reduce repeated code we implemented:
The refactoring in this PR follows the same approach. It should greatly reduce the duplicated code in Tile Engine and make it simpler to add new operations.
Technical Details
The files have been refactored around new base structs for benchmarks, profiling and problem descriptions. The new base structs are:
Universal GEMM, Preshuffle GEMM, and Multi-D GEMM each have child classes that inherit from these base structs, overriding only what differs per variant.
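The inheritance pattern described here can be sketched roughly as follows. Every name in this block (ProblemSketch, ProfilerBaseSketch, UniversalProfilerSketch, MultiDProfilerSketch) is hypothetical, and the Multi-D cost model is an assumption made up for the example. The actual structs in this PR are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical problem description; the real Problem structs carry more fields.
struct ProblemSketch
{
    std::size_t m, n, k;
};

// Hypothetical base: holds the logic that every GEMM variant shares.
class ProfilerBaseSketch
{
public:
    virtual ~ProfilerBaseSketch() = default;

    // Shared conversion from a measured latency in milliseconds to TFLOPS:
    // flops / (latency_ms * 1e-3 s) / 1e12 = flops / (latency_ms * 1e9).
    double tflops(const ProblemSketch& p, double latency_ms) const
    {
        return flop_count(p) / (latency_ms * 1.0e9);
    }

protected:
    // Default cost model: one multiply and one add per (m, n, k) triple.
    virtual double flop_count(const ProblemSketch& p) const
    {
        return 2.0 * static_cast<double>(p.m) * static_cast<double>(p.n) *
               static_cast<double>(p.k);
    }
};

// A variant with nothing to change inherits everything from the base.
class UniversalProfilerSketch : public ProfilerBaseSketch
{
};

// A variant that differs overrides only the piece that differs.
class MultiDProfilerSketch : public ProfilerBaseSketch
{
public:
    explicit MultiDProfilerSketch(std::size_t num_d) : num_d_(num_d) {}

protected:
    // Assumed cost model: one extra elementwise operation per D tensor
    // for each output element.
    double flop_count(const ProblemSketch& p) const override
    {
        return ProfilerBaseSketch::flop_count(p) +
               static_cast<double>(num_d_) * static_cast<double>(p.m) *
                   static_cast<double>(p.n);
    }

private:
    std::size_t num_d_;
};
```

With this layout, adding a new variant means writing one small derived class rather than copying the shared reporting logic.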
All functions common to the benchmarking and profiling files have been moved into newly added utility files under the common/ directory. The new utility files are:
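As a rough illustration of the kind of helper that benefits from living in one shared file, the sketch below compares two results under a chosen metric. The names echo the Metric and PerformanceResult structs listed in the review summary, but the fields and the is_better function are assumptions for this example, not the definitions in common/utils.hpp.

```cpp
#include <cassert>

// Hypothetical metric selector (the real Metric type may differ).
enum class MetricSketch
{
    Latency,
    Tflops,
    Bandwidth
};

// Hypothetical result record (field names and units are assumptions).
struct PerformanceResultSketch
{
    double latency_ms;
    double tflops;
    double bandwidth_gbs;
};

// Returns true when `a` beats `b` under the chosen metric.
// Lower is better for latency; higher is better for throughput metrics.
constexpr bool is_better(const PerformanceResultSketch& a,
                         const PerformanceResultSketch& b,
                         MetricSketch metric)
{
    switch(metric)
    {
    case MetricSketch::Latency: return a.latency_ms < b.latency_ms;
    case MetricSketch::Tflops: return a.tflops > b.tflops;
    case MetricSketch::Bandwidth: return a.bandwidth_gbs > b.bandwidth_gbs;
    }
    return false; // unreachable for valid enumerators
}
```

Keeping a comparison like this in a single place means every GEMM variant ranks kernels the same way.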
Test Plan
I tested using the existing tests for Tile Engine.
Test Result
All tests passed.
Submission Checklist