Commit ade5449

Tianxing/rope latent attention (#731)
MLA rope fusion impl
1 parent 5bb32e8 commit ade5449

5 files changed: +1569 -0 lines changed

.github/workflows/amd_perf_kernel_Integration_tests.yml

Lines changed: 2 additions & 0 deletions
@@ -122,6 +122,7 @@ jobs:
 pytest -vvv ./python/perf-kernels/fused_moe/moe-gemm.py
 sh ./python/perf-kernels/streamk/utils/unittest.sh
 pytest -vvv ./python/perf-kernels/multreduce_matmul_kernel.py
+pytest -vvv ./python/perf-kernels/MLA_decode_rope.py
 - name: Run Perf Kernels Benchmark
 run: |
 python ./python/perf-kernels/flash-attention.py
@@ -130,3 +131,4 @@ jobs:
 python ./python/perf-kernels/rmsnorm.py --mode bwd
 python ./python/perf-kernels/layernorm.py
 python ./python/perf-kernels/multreduce_matmul_kernel.py bench
+python ./python/perf-kernels/MLA_decode_rope.py
