Merged
Changes from 8 commits
36 changes: 20 additions & 16 deletions .github/workflows/triton-benchmarks.yml
@@ -24,6 +24,10 @@ on:
description: Run name
type: string
default: "Triton benchmarks"
+ skip_benchmarks:
+ description: List of benchmarks to skip
+ type: string
+ default: ""
schedule:
- cron: "5 23 * * *"
pull_request:
@@ -112,7 +116,7 @@ jobs:
python setup.py install

- name: Run Triton Softmax kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'fused_softmax.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python fused_softmax.py --reports $REPORTS
@@ -121,7 +125,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/softmax-performance.csv $REPORTS/softmax-xetla-report.csv --benchmark softmax --compiler xetla --param_cols "N" --tflops_col XeTLA-TFlops --hbm_col "XeTLA-GB/s" --tag $TAG

- name: Run Triton GEMM kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_benchmark.py') }}
Contributor

contains(inputs.skip_benchmarks, 'gemm_benchmark.py') returns true if the skip list contains gemm_benchmark.py_default or gemm_benchmark.py_advanced. Is it expected?
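
To make the concern concrete, here is a minimal Python sketch of how GitHub's expression documentation describes `contains`: a case-insensitive substring test when the first argument is a string, and an element-equality test when it is an array. The Python `contains` function and the sample skip list below are illustrative stand-ins, not part of the workflow or of any GitHub API.

```python
def contains(search, item):
    """Rough model of the GitHub Actions `contains` expression function."""
    if isinstance(search, str):
        # String search: case-insensitive substring test.
        return str(item).lower() in search.lower()
    # Array search: case-insensitive comparison against each element.
    return any(str(item).lower() == str(element).lower() for element in search)


# Hypothetical input that names only the "default path" step.
# A workflow input of type string arrives as plain text, brackets and all.
skip_benchmarks = "['gemm_benchmark.py_default']"

skips_default_step = contains(skip_benchmarks, "gemm_benchmark.py_default")
skips_plain_step = contains(skip_benchmarks, "gemm_benchmark.py")

print("default path step skipped:", skips_default_step)
print("plain GEMM step skipped:", skips_plain_step)
```

Under this model both checks come out true, because `gemm_benchmark.py` is a prefix of `gemm_benchmark.py_default`. With a real array as the first argument, only the exact element would match.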

Contributor Author

I don't think this is true if the input is a JSON list; see the test run with ['softmax_kernel.py','gemm_benchmark.py']: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11788616521/job/32836016215

Contributor

How does contains know this is JSON and not a string? The input type of skip_benchmarks is string and there is no fromJSON...

Contributor Author

> How does contains know this is JSON and not a string? The input type of skip_benchmarks is string and there is no fromJSON...

I don't know, let's add this for clarity.
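
As a rough illustration of what parsing the input first would buy, the Python sketch below decodes the skip list as JSON and then tests exact membership, which is the effect one would expect from wrapping the input in `fromJSON` before calling `contains`. The helper name `should_skip`, the double-quoted sample inputs, and the handling of the empty default are assumptions made for this example; matching here is exact and case-sensitive, which is stricter than the workflow expression.

```python
import json


def should_skip(skip_benchmarks: str, step_key: str) -> bool:
    """Return True if step_key is an exact element of the JSON skip list."""
    # The workflow's default input is an empty string: nothing is skipped.
    if not skip_benchmarks.strip():
        return False
    # Decode the text into a real list before testing membership.
    skip_list = json.loads(skip_benchmarks)
    return step_key in skip_list


# Hypothetical input that names only the "default path" step.
only_default = '["gemm_benchmark.py_default"]'

print(should_skip(only_default, "gemm_benchmark.py_default"))
print(should_skip(only_default, "gemm_benchmark.py"))
print(should_skip("", "gemm_benchmark.py"))
```

With list membership, the key `gemm_benchmark.py` no longer matches an entry that merely starts with it, so the plain GEMM step would still run when only the default path is skipped.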

Contributor Author

@kwasd kwasd Nov 12, 2024

run: |
cd benchmarks/triton_kernels_benchmark
python gemm_benchmark.py --reports $REPORTS
@@ -132,7 +136,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/matmul-performance-base.csv $REPORTS/gemm-xetla-report.csv --benchmark gemm --compiler xetla --param_cols "B,M,K,N" --tflops_col XeTLA-TFlops --hbm_col "XeTLA-GB/s" --tag $TAG

- name: Run Triton GEMM kernel benchmark - default path
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_benchmark.py_default') }}
run: |
cd benchmarks/triton_kernels_benchmark
# Default path:
@@ -148,7 +152,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/matmul-performance-default-path.csv $REPORTS/gemm-triton-default-report.csv --benchmark gemm --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton GEMM kernel benchmark - advanced path
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_benchmark.py_advanced') }}
run: |
cd benchmarks/triton_kernels_benchmark
# Advanced path:
@@ -164,7 +168,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/matmul-performance-adv-path.csv $REPORTS/gemm-triton-advanced-report.csv --benchmark gemm --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton GEMM (A@B^t) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_benchmark.py_abt') }}
run: |
cd benchmarks/triton_kernels_benchmark
TRANSPOSE_B=1 python gemm_benchmark.py --reports $REPORTS
@@ -175,7 +179,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/matmul-performance-bt.csv $REPORTS/gemm-bt-onednn-report.csv --benchmark gemm-bt --compiler onednn --param_cols "B,M,K,N" --tflops_col onednn-TFlops --hbm_col "onednn-GB/s" --tag $TAG

- name: Run Triton GEMM (A^t@B) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_benchmark.py_atb') }}
run: |
cd benchmarks/triton_kernels_benchmark
TRANSPOSE_A=1 python gemm_benchmark.py --reports $REPORTS
@@ -186,47 +190,47 @@ jobs:
python ../../scripts/build_report.py $REPORTS/matmul-performance-at.csv $REPORTS/gemm-at-onednn-report.csv --benchmark gemm-at --compiler onednn --param_cols "B,M,K,N" --tflops_col onednn-TFlops --hbm_col "onednn-GB/s" --tag $TAG

- name: Run Triton GEMM (stream-k) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_streamk_benchmark.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python gemm_streamk_benchmark.py --reports $REPORTS
source ../../scripts/capture-hw-details.sh
python ../../scripts/build_report.py $REPORTS/matmul-streamk-performance.csv $REPORTS/gemm-streamk-triton-report.csv --benchmark gemm-streamk --compiler triton --param_cols "M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton GEMM (split-k) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_splitk_benchmark.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python gemm_splitk_benchmark.py --reports $REPORTS
source ../../scripts/capture-hw-details.sh
python ../../scripts/build_report.py $REPORTS/matmul-splitk-performance.csv $REPORTS/gemm-splitk-triton-report.csv --benchmark gemm-splitk --compiler triton --param_cols "M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton GEMM + PreOp (exp) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_preop_exp_benchmark.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python gemm_preop_exp_benchmark.py --reports $REPORTS
source ../../scripts/capture-hw-details.sh
python ../../scripts/build_report.py $REPORTS/matmul-performance-preop-exp.csv $REPORTS/gemm-preop-exp-triton-report.csv --benchmark gemm-preop-exp --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton GEMM + PostOp (Gelu) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_postop_gelu_benchmark.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python gemm_postop_gelu_benchmark.py --reports $REPORTS
source ../../scripts/capture-hw-details.sh
python ../../scripts/build_report.py $REPORTS/matmul-performance-postop-gelu.csv $REPORTS/gemm-postop-gelu-triton-report.csv --benchmark gemm-postop-gelu --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton GEMM + PostOp (add matrix) kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'gemm_postop_addmatrix_benchmark.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python gemm_postop_addmatrix_benchmark.py --reports $REPORTS
source ../../scripts/capture-hw-details.sh
python ../../scripts/build_report.py $REPORTS/matmul-performance-postop-addmatrix.csv $REPORTS/gemm-postop-addmatrix-triton-report.csv --benchmark gemm-postop-addmatrix --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton FA kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'flash_attention_fwd_benchmark.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python flash_attention_fwd_benchmark.py --reports $REPORTS
@@ -236,7 +240,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/attn-performance.csv $REPORTS/attn-xetla-report.csv --benchmark attn --compiler xetla --param_cols "Z,H,N_CTX,D_HEAD,CAUSAL" --tflops_col XeTLA-TFlops --hbm_col "XeTLA-GB/s" --tag $TAG

- name: Run Triton FA kernel benchmark - default path
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'flash_attention_fwd_benchmark.py_default') }}
run: |
cd benchmarks/triton_kernels_benchmark
TRITON_INTEL_ADVANCED_PATH=0 \
@@ -249,7 +253,7 @@ jobs:
python ../../scripts/build_report.py $REPORTS/attn-performance.csv $REPORTS/attn-triton-default-report.csv --benchmark attn --compiler triton --param_cols "Z,H,N_CTX,D_HEAD,CAUSAL" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Triton FA kernel benchmark - advanced path
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'flash_attention_fwd_benchmark.py_advanced') }}
run: |
cd benchmarks/triton_kernels_benchmark
TRITON_INTEL_ADVANCED_PATH=1 \
@@ -262,15 +266,15 @@ jobs:
python ../../scripts/build_report.py $REPORTS/attn-performance.csv $REPORTS/attn-triton-advanced-report.csv --benchmark attn --compiler triton --param_cols "Z,H,N_CTX,D_HEAD,CAUSAL" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run Prefix Sums kernel benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'prefix_sums.py') }}
run: |
cd benchmarks/triton_kernels_benchmark
python prefix_sums.py --reports $REPORTS
source ../../scripts/capture-hw-details.sh
python ../../scripts/build_report.py $REPORTS/prefix-sums.csv $REPORTS/prefix_sums-triton-report.csv --benchmark prefix_sums --compiler triton --param_cols "N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG

- name: Run micro benchmark
- if: ${{ steps.install.outcome == 'success' && !cancelled() }}
+ if: ${{ steps.install.outcome == 'success' && !cancelled() && !contains(inputs.skip_benchmarks, 'micro_benchmarks') }}
run: |
cd benchmarks/micro_benchmarks
python run_benchmarks.py --reports $REPORTS