
add control over the number of SMs to be used by the kernel #16


Open: Amir-19 wants to merge 1 commit into master from parametrize_kernel


@Amir-19 (Contributor) commented May 5, 2025

Added an optional parameter, max_sm_count, to the kernel to specify the number of SMs the kernel uses.
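
For illustration only, here is a minimal sketch of how an optional SM cap could be resolved into a launch size. It assumes the kernel launches one block per SM and that leaving the parameter unset means "use every SM"; neither assumption is confirmed by this PR. The helper name `resolve_sm_count` and the `device_sm_count` argument are hypothetical and are not part of the project.

```python
from typing import Optional


def resolve_sm_count(device_sm_count: int, max_sm_count: Optional[int] = None) -> int:
    """Return how many SMs (blocks) a launch should occupy.

    Hypothetical helper, not the project's API.
    Assumption: None means "no cap", i.e. use every SM on the device.
    """
    if device_sm_count <= 0:
        raise ValueError("device_sm_count must be positive")
    if max_sm_count is None:
        return device_sm_count
    if max_sm_count <= 0:
        raise ValueError("max_sm_count must be positive")
    # Never request more SMs than the device actually has.
    return min(max_sm_count, device_sm_count)
```

Under these assumptions, a caller that wants to leave SMs free for concurrent compute kernels would pass a small `max_sm_count`, while the default keeps the uncapped behavior.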

@Amir-19 force-pushed the parametrize_kernel branch 4 times, most recently from a6cd433 to 5c9dead (May 6, 2025 06:31)
@Amir-19 force-pushed the parametrize_kernel branch from 5c9dead to 7d51605 (May 20, 2025 21:27)
@Amir-19 force-pushed the parametrize_kernel branch from 7d51605 to 0487e8a (May 20, 2025 21:31)
@nandor (Collaborator) left a comment


I think this is a good addition. However, the parameter would be better placed on the calls to dispatch/combine themselves. That way, a single instantiation of all-to-all can be reused across multiple contexts (prefill/decode/cuda-graph/no-cuda-graph).
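
The suggestion above can be sketched roughly as follows. This is a toy model, not the project's real class or signatures: `ToyAllToAll`, its `device_sm_count` field, and the keyword-only `max_sm_count` argument on `dispatch`/`combine` are all hypothetical names chosen for illustration. It only records which SM count each call would use, to show how one instance could serve several contexts with different caps.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyAllToAll:
    """Hypothetical stand-in for an all-to-all object (not the real API)."""

    device_sm_count: int
    # Records (operation, sm_count) per call instead of launching a kernel.
    launches: List[Tuple[str, int]] = field(default_factory=list)

    def _sm_count(self, max_sm_count: Optional[int]) -> int:
        # Assumption: None means "no cap", i.e. use every SM on the device.
        if max_sm_count is None:
            return self.device_sm_count
        if max_sm_count <= 0:
            raise ValueError("max_sm_count must be positive")
        return min(max_sm_count, self.device_sm_count)

    def dispatch(self, *, max_sm_count: Optional[int] = None) -> int:
        # The cap is chosen per call, not when the object is constructed.
        sms = self._sm_count(max_sm_count)
        self.launches.append(("dispatch", sms))
        return sms

    def combine(self, *, max_sm_count: Optional[int] = None) -> int:
        sms = self._sm_count(max_sm_count)
        self.launches.append(("combine", sms))
        return sms


# One instance reused in two contexts with different caps.
a2a = ToyAllToAll(device_sm_count=132)
a2a.dispatch()                 # e.g. prefill: no cap
a2a.dispatch(max_sm_count=16)  # e.g. decode: leave SMs free for other work
a2a.combine(max_sm_count=16)
```

With the cap fixed at construction time instead, each distinct setting would need its own instance, which is the reuse problem the review comment points out.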

2 participants