@v0i0 v0i0 commented Oct 8, 2025

[ghstack-poisoned]
v0i0 added a commit that referenced this pull request Oct 8, 2025
@v0i0 v0i0 temporarily deployed to docker-s3-upload October 8, 2025 17:27 — with GitHub Actions Inactive
@meta-cla meta-cla bot added the cla signed label Oct 8, 2025
@v0i0 v0i0 requested a review from PaulZhang12 October 8, 2025 17:27
@xuzhao9
Contributor

xuzhao9 commented Oct 8, 2025

Just curious, can you give more details on why it makes block math more robust?

@v0i0
Author

v0i0 commented Oct 8, 2025

> Just curious, can you give more details on why it makes block math more robust?

For M <= 8 * num_sm (e.g. M=512), the current code yields BLOCK_SIZE_M=0, and the subsequent PARTIAL_SIZE computation then raises an exception.
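
To illustrate the failure mode, here is a minimal sketch in plain Python. The formulas are assumptions made up for illustration, not the actual kernel code: `naive_block_math` and `clamped_block_math` are hypothetical helpers, the floor division by `8 * num_sm` is a guess at how BLOCK_SIZE_M could reach 0, and `num_sm=132` is just an example SM count. The exact boundary (`<` vs `<=`) depends on the real formula.

```python
import math


def naive_block_math(m: int, num_sm: int) -> tuple[int, int]:
    """Hypothetical 'current' block math: breaks for small M."""
    # Assumed formula: floor division rounds down to 0 when m < 8 * num_sm.
    block_size_m = m // (8 * num_sm)
    # Dividing by a zero block size raises ZeroDivisionError.
    partial_size = math.ceil(m / block_size_m)
    return block_size_m, partial_size


def clamped_block_math(m: int, num_sm: int) -> tuple[int, int]:
    """Hypothetical 'robust' variant: never lets the block size reach 0."""
    # Clamp to at least one row per block so the division below is safe.
    block_size_m = max(1, m // (8 * num_sm))
    partial_size = math.ceil(m / block_size_m)
    return block_size_m, partial_size
```

With these assumed formulas, `naive_block_math(512, 132)` raises `ZeroDivisionError`, while `clamped_block_math(512, 132)` returns `(1, 512)`. For large M both variants agree.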
