
Commit 33b62a6

perf: cache get_compute_capability (#1456)
<!-- .github/pull_request_template.md -->

## 📌 Description

Cache the compute capability to reduce the repeated CPU overhead mentioned in #1425 (comment).

Results reproduced from https://gist.github.com/yzh119/d9bf2abbb667abcbb806979f4bbea633:

```bash
before the fix
w/o CUDAGraph 0.0054492950439453125
w/ CUDAGraph 0.002916574478149414

after the fix
w/o CUDAGraph 0.0038330554962158203
w/ CUDAGraph 0.0030286312103271484
```

## 🔍 Related Issues

#1425

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

## Reviewer Notes

<!-- Optional: anything you'd like reviewers to focus on, concerns, etc. -->

Co-authored-by: Yaxing Cai <[email protected]>
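The gist above benchmarks a full decode path with and without CUDAGraph; as a simpler, hypothetical way to observe the per-call overhead in isolation (this is not the gist's script, and the iteration count is arbitrary), one can time repeated calls to the cached helper directly:

```python
# Hypothetical micro-benchmark (not the gist's script): time repeated calls to
# flashinfer.utils.get_compute_capability on one CUDA device. After this commit,
# only the first call queries the device; subsequent calls hit the functools cache.
import time

import torch

from flashinfer.utils import get_compute_capability

dev = torch.device("cuda:0")
get_compute_capability(dev)  # first call populates the cache

start = time.perf_counter()
for _ in range(10_000):
    get_compute_capability(dev)  # served from the cache, no device query
print(f"10k lookups took {time.perf_counter() - start:.6f}s")
```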
1 parent 7c45412 commit 33b62a6

File tree

1 file changed: +2 −0 lines


flashinfer/utils.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -14,6 +14,7 @@
 limitations under the License.
 """
 
+import functools
 import math
 import os
 from enum import Enum
@@ -207,6 +208,7 @@ def canonicalize_torch_dtype(dtype: Union[torch.dtype, str]) -> torch.dtype:
     )
 
 
+@functools.cache
 def get_compute_capability(device: torch.device) -> Tuple[int, int]:
     if device.type != "cuda":
         raise ValueError("device must be a cuda device")
```
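For reference, here is a self-contained sketch of what the memoized helper looks like after this change. The decorator and signature match the diff above; the body beyond the device-type check is an illustrative assumption (using `torch.cuda.get_device_capability`) rather than FlashInfer's exact code:

```python
# Sketch of the memoization pattern added in this commit: cache the per-device
# compute-capability query so repeated calls skip the CUDA property lookup.
import functools
from typing import Tuple

import torch


@functools.cache
def get_compute_capability(device: torch.device) -> Tuple[int, int]:
    if device.type != "cuda":
        raise ValueError("device must be a cuda device")
    # Illustrative body: returns (major, minor), e.g. (8, 0) on an A100.
    return torch.cuda.get_device_capability(device)


if torch.cuda.is_available():
    dev = torch.device("cuda:0")
    first = get_compute_capability(dev)   # queries the device
    second = get_compute_capability(dev)  # served from the functools cache
    assert first == second
```

`functools.cache` keys on the `torch.device` argument, which is hashable and compares by device type and index, so each distinct device is queried at most once per process.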

0 commit comments
