
Conversation

@anmyachev anmyachev commented Oct 16, 2024

Signed-off-by: Anatoly Myachev <[email protected]>
@anmyachev anmyachev marked this pull request as ready for review October 16, 2024 13:21
# benchmark_suit.assert_close(xetla_fn(), torch_fn(), atol=1e-4, rtol=1.0, err_msg='xetla to torch')
_, min_ms, max_ms, mean_ms, cv = benchmark_suit.do_bench(
xetla_fn, n_warmup=10, n_repeat=10, quantiles=quantiles,
kernel_name='gpu::xetla::kernel::gemm_universal_t<dispatch_stream_k')
The kernel name was incorrect. The error went unnoticed because, at the time the name was added, the benchmark was not running in CI.
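Filtering profiler events by kernel-name prefix can fail silently: a wrong name simply matches no events, and nothing flags the mistake until the benchmark actually runs. A minimal sketch of this failure mode, using a hypothetical `filter_kernels` helper and made-up event names (not the repository's actual API):

```python
# Hypothetical sketch: a wrong kernel-name prefix matches nothing,
# which goes unnoticed unless the benchmark actually executes (e.g. in CI).
events = [
    "gpu::xetla::kernel::gemm_universal_t<dispatch_stream_k>",
    "at::native::elementwise_kernel",
]

def filter_kernels(event_names, kernel_name):
    # Keep only events whose name starts with the requested prefix.
    return [e for e in event_names if e.startswith(kernel_name)]

matched = filter_kernels(events, "gpu::xetla::kernel::gemm_universal_t<dispatch_stream_k")
missed = filter_kernels(events, "some_incorrect_kernel_name")
print(len(matched), len(missed))
```

With the correct prefix one event matches; with a wrong one the list is empty, which is why an assertion on the number of profiled events (as in the code below) is valuable.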

assert len(functions) == n_repeat, f"the profiling number not match, {len(functions)}"
# Convert the times to milliseconds.
-        times = torch.tensor([f.self_device_time_total * 1e-3 for f in functions], dtype=torch.float)
+        times = torch.tensor([sum(map(lambda elem: elem.self_device_time_total, f)) * 1e-3 for f in zip(*all_functions)],
The main problem was that the times of the several kernels launched per run were not summed. This affects only the "gemm streamk" benchmark.
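The corrected line sums the device time of every kernel recorded for a single repetition before converting microseconds to milliseconds. A minimal sketch with plain numbers standing in for kineto profiler events (the per-repetition grouping and values are illustrative assumptions):

```python
# Each inner list: self device times (in microseconds) of the kernels
# launched in one profiled repetition. A stream-k GEMM launches several
# kernels per run, so taking a single kernel's time undercounts.
per_repetition_kernel_times = [
    [100.0, 20.0],  # repetition 1: two kernels
    [110.0, 25.0],  # repetition 2: two kernels
]

# Sum within each repetition, then convert microseconds -> milliseconds.
times_ms = [sum(kernels) * 1e-3 for kernels in per_repetition_kernel_times]
print(times_ms)
```

Before the fix, only one kernel's `self_device_time_total` per repetition was used, so multi-kernel runs reported too little device time.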

@anmyachev anmyachev merged commit 700abe3 into main Oct 17, 2024
6 checks passed
@anmyachev anmyachev deleted the amyachev/several-kernels branch October 17, 2024 09:39

Successfully merging this pull request may close these issues.

[Profiling] Enhancements to the do_bench(...) kineto implementation
