Commit dae1a0f

bugfix: trtllm-gen fmha sm101 and sm100 compatibility (#1631)
## 📌 Description

sm100 device fallback on sm101 kernels for trtllm-gen fmha.

## 🔍 Related Issues

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

## Reviewer Notes
1 parent b4b9e22 commit dae1a0f

1 file changed: +2 -0 lines changed

include/flashinfer/trtllm/fmha/fmhaKernels.cuh

Lines changed: 2 additions & 0 deletions
@@ -52,6 +52,8 @@ using flashinfer::trtllm_cubin_loader::getCubin;
 constexpr bool isSMCompatible(int gpuSM, int kernelSM) {
   if (gpuSM == kSM_103) {
     return kernelSM == kSM_100f || kernelSM == kSM_103;
+  } else if (gpuSM == kSM_100) {
+    return kernelSM == kSM_100f || kernelSM == kSM_100;
   }

   return gpuSM == kernelSM;
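
To make the effect of the two added lines concrete, here is a minimal standalone sketch of the check as it reads after this commit. The numeric values given to `kSM_100`, `kSM_100f`, and `kSM_103` are placeholders assumed for illustration only; the real constants are defined elsewhere in FlashInfer and may differ. The `static_assert` lines are likewise illustrative and are not part of the commit.

```cpp
// Standalone sketch of isSMCompatible after this commit.
// ASSUMPTION: the values below are hypothetical placeholders so the example
// compiles on its own; they are not taken from the FlashInfer headers.
constexpr int kSM_100 = 100;
constexpr int kSM_100f = 10100;
constexpr int kSM_103 = 103;

constexpr bool isSMCompatible(int gpuSM, int kernelSM) {
  if (gpuSM == kSM_103) {
    // Existing branch: an sm103 device accepts kSM_100f or kSM_103 kernels.
    return kernelSM == kSM_100f || kernelSM == kSM_103;
  } else if (gpuSM == kSM_100) {
    // Added branch: an sm100 device also accepts kSM_100f kernels.
    return kernelSM == kSM_100f || kernelSM == kSM_100;
  }
  // Every other device still requires an exact match.
  return gpuSM == kernelSM;
}

// Compile-time checks of the behavior described above.
static_assert(isSMCompatible(kSM_100, kSM_100f), "sm100 device accepts kSM_100f kernels");
static_assert(isSMCompatible(kSM_100, kSM_100), "sm100 device accepts its own kernels");
static_assert(!isSMCompatible(kSM_100, kSM_103), "sm100 device rejects kSM_103 kernels");
static_assert(isSMCompatible(kSM_103, kSM_100f), "sm103 device accepts kSM_100f kernels");
```

Before the change, a `gpuSM` of `kSM_100` fell through to the final exact-match comparison, so a `kSM_100f` kernel was rejected on that device; with the new branch it is accepted.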
