[CK_TILE] Restructure Tile Engine's benchmarking and profiling #4769

Open
arai713 wants to merge 2 commits into develop from ck/arai/ck_tile/tile_engine_restructure

@arai713 arai713 commented Feb 20, 2026

Motivation

This PR restructures the benchmarking and profiling components of CK Tile's Tile Engine, building on the groundwork from the previous PR ROCm/composable_kernel#3434 and the approach outlined in this design document. In PR 3434, to reduce repeated code, we implemented:

  • A base class that centralizes common functionality and provides a default implementation (Universal GEMM)
  • Child classes for the GEMM variants that override virtual functions to handle variant-specific behavior

The refactoring in this PR follows the same approach. It should greatly reduce the duplicated code in Tile Engine and make it simpler to add new operations, which improves scalability.

Technical Details

The files have been refactored around new base structs for benchmarks, profiling and problem descriptions. The new base structs are:

  • GemmProblem
  • GemmBenchmark
  • GemmProfiler

Universal GEMM, Preshuffle GEMM, and Multi-D GEMM each have child classes that inherit from these base structs, overriding only what differs per variant.
All functions shared across the benchmarking and profiling files have been moved into new utility files under the common/ directory. The new utility files are:

  • utils.hpp: common functions for the benchmarking and profiling process
  • benchmark_utils.py: common utility functions for the benchmark generation
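
The base-struct arrangement described in this section can be illustrated with a minimal sketch. Everything here is hypothetical: the `*Sketch` names, the member functions, and the Multi-D cost term are stand-ins chosen for illustration, not the declarations added in this PR. The sketch uses virtual dispatch as described under Motivation; the real base profiler may be a class template instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a GEMM problem description.
struct GemmProblemSketch
{
    int m = 0, n = 0, k = 0;
};

// Hypothetical base profiler: holds the shared logic and a default
// (Universal GEMM) implementation of the variant-specific pieces.
struct GemmProfilerSketch
{
    virtual ~GemmProfilerSketch() = default;

    // Shared driver code, written once for every variant.
    std::string describe(const GemmProblemSketch& p) const
    {
        return variant_name() + ": " + std::to_string(flop_count(p)) + " flop";
    }

    // Default: a plain GEMM performs 2*M*N*K floating-point operations.
    virtual long long flop_count(const GemmProblemSketch& p) const
    {
        return 2LL * p.m * p.n * p.k;
    }

    virtual std::string variant_name() const { return "universal"; }
};

// Hypothetical child: overrides only what differs for this variant.
struct MultiDProfilerSketch : GemmProfilerSketch
{
    int num_d = 1; // illustrative: number of extra D tensors combined elementwise

    long long flop_count(const GemmProblemSketch& p) const override
    {
        // Illustrative cost model: one extra operation per output element per D tensor.
        return GemmProfilerSketch::flop_count(p) + 1LL * num_d * p.m * p.n;
    }

    std::string variant_name() const override { return "multi_d"; }
};
```

Calling `describe` through a `GemmProfilerSketch` reference runs the shared driver and picks up the child's overrides, so a new variant only needs to supply the functions whose behavior differs.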

Test Plan

I tested using the existing tests for Tile Engine.

Test Result

All tests passed.

Submission Checklist

@arai713 arai713 requested a review from a team as a code owner February 20, 2026 19:19
@arai713 arai713 changed the title Restructure Tile Engine's benchmarking process Restructure Tile Engine's benchmarking and profiling Feb 20, 2026
@arai713 arai713 changed the title Restructure Tile Engine's benchmarking and profiling [CK_TILE] Restructure Tile Engine's benchmarking and profiling Feb 23, 2026

Copilot AI left a comment

Pull request overview

This PR restructures the Tile Engine's benchmarking and profiling infrastructure to reduce code duplication and simplify the addition of new operations. The refactoring introduces a base class architecture with common utilities extracted into shared files, while maintaining backward compatibility with existing tests.

Changes:

  • Introduced base profiler class (GemmProfiler) and common utility files to eliminate duplicated code across GEMM variants
  • Restructured GEMM universal, preshuffle, and multi_d implementations to inherit from the new base classes
  • Consolidated Python benchmarking utilities into a shared module accessible by all GEMM variants

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| gemm_universal_profiler.hpp | New universal GEMM profiler inheriting from base GemmProfiler class |
| gemm_profiler.hpp | New base profiler template class providing common profiling functionality |
| common/utils.hpp | Common utility functions and structs (Metric, PerformanceResult, KernelInstance, Setting) |
| gemm_benchmark.hpp | Base problem definition and comparison utilities for GEMM operations |
| gemm_common.hpp | Shared kernel traits and argument parser creation |
| common/benchmark_utils.py | Consolidated Python utilities for running kernels and exporting results |
| gemm_benchmark.py | Base Python class for GEMM benchmarking with kernel discovery and execution |

@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch from 18ab3a0 to f0a234b Compare February 27, 2026 10:32
template <typename Layout>
constexpr auto is_row_major(Layout)
{
    return ck_tile::bool_constant<std::is_same_v<Layout, ck_tile::tensor_layout::gemm::RowMajor>>{};
}

Can we just return the boolean value of std::is_same_v instead of wrapping it inside of a bool_constant?

@arai713 arai713 Mar 2, 2026

Other commonly used functions, like get_default_stride, that call is_row_major are expecting the bool_constant return type.
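
A minimal sketch of why the wrapper type matters, under stated assumptions: `std::bool_constant` stands in for `ck_tile::bool_constant`, the layout tags are local placeholders, and `default_stride_sketch` is a hypothetical helper rather than the actual signature of get_default_stride. Because the result is carried in the type, a callee can deduce it as a template parameter and branch at compile time; a plain `bool` passed as a function argument could not be used that way.

```cpp
#include <cstddef>
#include <type_traits>

// Placeholder layout tags (assumption: stand-ins for the ck_tile layout types).
struct RowMajor {};
struct ColumnMajor {};

// Same shape as the function under review, using std::bool_constant.
template <typename Layout>
constexpr auto is_row_major_sketch(Layout)
{
    return std::bool_constant<std::is_same_v<Layout, RowMajor>>{};
}

// Hypothetical helper: deduces IsRowMajor from the argument's type,
// so the branch below is resolved at compile time.
template <bool IsRowMajor>
constexpr std::size_t default_stride_sketch(std::size_t row,
                                            std::size_t col,
                                            std::size_t stride,
                                            std::bool_constant<IsRowMajor>)
{
    if(stride != 0)
    {
        return stride; // caller supplied an explicit stride
    }
    if constexpr(IsRowMajor)
    {
        return col; // row-major: consecutive rows are col elements apart
    }
    else
    {
        return row; // column-major: consecutive columns are row elements apart
    }
}

// Usage: the layout tag selects the branch without any runtime flag.
static_assert(default_stride_sketch(4, 8, 0, is_row_major_sketch(RowMajor{})) == std::size_t{8});
```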

@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch 2 times, most recently from a88950e to c18b8a0 Compare March 2, 2026 21:22

@ThruptiRajLakshmanaGowda ThruptiRajLakshmanaGowda left a comment

LGTM!

@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch from c18b8a0 to 2c2dfd7 Compare March 4, 2026 18:46
@arai713 arai713 requested a review from AviralGoelAMD March 4, 2026 23:15
@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch 2 times, most recently from 6727d4e to 6b9cdb4 Compare March 6, 2026 17:10
@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch 3 times, most recently from 5d97818 to 42c57c9 Compare March 10, 2026 16:54

@AviralGoelAMD AviralGoelAMD left a comment

Looks good now!

@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch 3 times, most recently from cf3ac50 to 1ec6ddd Compare March 17, 2026 16:34
arai713 added 2 commits March 18, 2026 12:47
This change restructures the Benchmark structs into 3 files.
It adds a base class for all GEMM benchmarks and derived classes for
Universal GEMM, Multi-D GEMM, and Preshuffle GEMM. Common functions have been relocated
into a common directory. Each derived class only needs to redefine its
constructor, which significantly reduces duplicated code.

Restructure Tile Engine's profiling process

This change restructures the profiling process in Tile Engine around
base structs for the Profiler and the Problem. With this, all files
needed for Tile Engine have a base struct, along with files in the gemm/
directory that can be extended for each GEMM variant. Only the Problem
and Profiler structs, along with the reference functions, need to be
defined. Profiling functions that are common to each operation have
been moved into a common utility file.

Add README back into the gemm directory and integrate new preshuffle functions

Disable the gemm tile engine tests and update the preshuffle example to match the new tensor_shuffle_utils interface
@arai713 arai713 force-pushed the ck/arai/ck_tile/tile_engine_restructure branch from 1ec6ddd to a2308f4 Compare March 18, 2026 16:47
@arai713 arai713 requested review from a team as code owners March 18, 2026 16:47
6 participants