Add cutlass decode kernel to TritonBench #376

Aya-ZIbra · 2025-08-28T22:38:26Z

Summary: as title

Differential Revision: D80041532

facebook-github-bot · 2025-08-28T22:38:35Z

This pull request was exported from Phabricator. Differential Revision: D80041532

xuzhao9

LGTM!

Summary: as title

Reviewed By: sryap

Differential Revision: D80041532

facebook-github-bot · 2025-08-29T18:44:59Z

This pull request was exported from Phabricator. Differential Revision: D80041532

Summary:

X-link: pytorch/FBGEMM#4853

X-link: facebookresearch/FBGEMM#1875

Add the CUTLASS Blackwell FMHA decode kernel implementation to the TritonBench benchmarking suite.

Reviewed By: sryap

Differential Revision: D80041532

facebook-github-bot · 2025-09-23T02:40:43Z

@Aya-ZIbra has exported this pull request. If you are a Meta employee, you can view the originating diff in D80041532.

Summary:

X-link: meta-pytorch/tritonbench#376

Pull Request resolved: pytorch#4853

X-link: facebookresearch/FBGEMM#1875

Add the CUTLASS Blackwell FMHA decode kernel implementation to the TritonBench benchmarking suite.

Reviewed By: sryap

Differential Revision: D80041532

meta-cla bot added the cla signed label Aug 28, 2025

facebook-github-bot added the fb-exported label Aug 28, 2025

xuzhao9 approved these changes Aug 29, 2025


Aya-ZIbra had a problem deploying to docker-s3-upload August 29, 2025 13:06 — with GitHub Actions Failure

Aya-ZIbra temporarily deployed to docker-s3-upload August 29, 2025 13:06 — with GitHub Actions Inactive

Aya-ZIbra force-pushed the export-D80041532 branch from ae2323f to e5efb03 Compare August 29, 2025 18:44

Aya-ZIbra added a commit to Aya-ZIbra/tritonbench that referenced this pull request Aug 29, 2025

Add cutlass decode kernel to TritonBench (meta-pytorch#376)

e5efb03

Summary: as title

Reviewed By: sryap

Differential Revision: D80041532

Aya-ZIbra temporarily deployed to docker-s3-upload August 29, 2025 18:44 — with GitHub Actions Inactive

Aya-ZIbra had a problem deploying to docker-s3-upload August 29, 2025 18:44 — with GitHub Actions Failure

Add cutlass decode kernel to TritonBench

7e4f294

Summary:

X-link: pytorch/FBGEMM#4853

X-link: facebookresearch/FBGEMM#1875

Add the CUTLASS Blackwell FMHA decode kernel implementation to the TritonBench benchmarking suite.

Reviewed By: sryap

Differential Revision: D80041532

Aya-ZIbra force-pushed the export-D80041532 branch from e5efb03 to 7e4f294 Compare September 23, 2025 02:40

Aya-ZIbra temporarily deployed to docker-s3-upload September 23, 2025 02:40 — with GitHub Actions Inactive

facebook-github-bot added the meta-exported label Sep 23, 2025
