Aya-ZIbra
Summary:

  1. Reduce the number of pipeline stages so the kernel stays within the shared memory (smem) limit.
  2. Add a static_assert so that an smem capacity violation is reported at compile time rather than at runtime.
  3. Select the TMEM intrinsics based on sizeof(Element).
  4. Update the unit tests to cover bf16.
  5. Label each decode kernel test name with its corresponding test parameters.
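
For readers unfamiliar with the pattern behind items 1 to 3, here is a minimal C++ sketch, not the code from this PR. Every name in it (`kSmemCapacityBytes`, `SharedStorageSketch`, `TmemCopyKind`, `select_tmem_copy`, the `Fp8`/`Bf16` stand-in types) is hypothetical, and the capacity value and tile shape are assumptions chosen only for illustration. It shows how a per-stage buffer size multiplied by the stage count can be checked against a capacity constant at compile time, and how a code path can be picked from `sizeof(Element)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed per-block shared memory budget in bytes (illustrative value only;
// the real limit depends on the target GPU architecture).
constexpr std::size_t kSmemCapacityBytes = 227 * 1024;

// Hypothetical stand-ins for 8-bit and 16-bit element types.
struct Fp8 { std::uint8_t bits; };
struct Bf16 { std::uint16_t bits; };

// Bytes needed to buffer `stages` tiles of rows x cols elements.
constexpr std::size_t smem_bytes(std::size_t elem_size, std::size_t rows,
                                 std::size_t cols, std::size_t stages) {
  return elem_size * rows * cols * stages;
}

// Hypothetical shared storage layout: one tile buffer per pipeline stage.
// Instantiating it with too many stages fails at compile time, not at launch.
template <typename Element, int kTileRows, int kTileCols, int kStages>
struct SharedStorageSketch {
  static constexpr std::size_t kTotalBytes =
      smem_bytes(sizeof(Element), kTileRows, kTileCols, kStages);
  static_assert(kTotalBytes <= kSmemCapacityBytes,
                "pipeline stages exceed the shared memory capacity");
};

// Hypothetical tag naming which copy path to use for a given element width.
enum class TmemCopyKind { k8Bit, k16Bit };

// Choose the copy path from the element size instead of hard-coding one.
template <typename Element>
constexpr TmemCopyKind select_tmem_copy() {
  static_assert(sizeof(Element) == 1 || sizeof(Element) == 2,
                "only 8-bit and 16-bit elements are handled in this sketch");
  return sizeof(Element) == 1 ? TmemCopyKind::k8Bit : TmemCopyKind::k16Bit;
}

// Two stages of 128 x 128 bf16 tiles fit within the assumed budget.
using TwoStageBf16 = SharedStorageSketch<Bf16, 128, 128, 2>;
static_assert(TwoStageBf16::kTotalBytes == 65536, "2 * 128 * 128 * 2 bytes");
```

Under these assumed numbers, instantiating `SharedStorageSketch<Bf16, 128, 128, 8>` would need 262144 bytes and would be rejected by the compiler, which is the kind of early failure item 2 describes.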

Differential Revision: D82991495

netlify bot commented Sep 23, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 0887844
🔍 Latest deploy log https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68d46815002a910008efbd3d
😎 Deploy Preview https://deploy-preview-4916--pytorch-fbgemm-docs.netlify.app

@meta-cla meta-cla bot added the cla signed label Sep 23, 2025
@facebook-github-bot

@Aya-ZIbra has exported this pull request. If you are a Meta employee, you can view the originating diff in D82991495.

Summary:

X-link: facebookresearch/FBGEMM#1940

