Commit e8dca1f

Fix redundant kernels in moe (#1428)
## 📌 Description

Replace the `torch.zeros` allocations for the trtllm-gen workspace buffers (`topk_ids`, `expert_weights`, `output`) with `torch.empty`, removing the redundant zero-fill kernels.

## 🔍 Related Issues

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [ ] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).

## Reviewer Notes
1 parent 9756433 commit e8dca1f

File tree

1 file changed: +3 −3 lines changed

flashinfer/fused_moe/core.py

Lines changed: 3 additions & 3 deletions

```diff
@@ -1006,15 +1006,15 @@ def trtllm_fp4_block_scale_moe_op(

     # workspace buffers required by trtllm-gen
     if topk_ids is None:
-        topk_ids = torch.zeros(
+        topk_ids = torch.empty(
             num_tokens, top_k, dtype=torch.int32, device=hidden_states.device
         )
     if expert_weights is None:
-        expert_weights = torch.zeros(
+        expert_weights = torch.empty(
             num_tokens, top_k, dtype=routing_dtype, device=hidden_states.device
         )
     if output is None:
-        output = torch.zeros(
+        output = torch.empty(
             num_tokens,
             hidden_size,
             dtype=torch.bfloat16,
```
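The difference the diff relies on: `torch.zeros` allocates memory *and* launches an extra fill kernel to zero it, while `torch.empty` only allocates, leaving the contents uninitialized. That is safe when the downstream kernel fully overwrites the buffer before reading it, which is the premise of this commit. A minimal sketch of the idea (illustrative only, not FlashInfer code; the shapes and the `copy_` consumer are stand-ins for the real MoE kernel):

```python
import torch

# Stand-in sizes for the workspace buffers in the diff.
num_tokens, top_k = 4, 2

# torch.zeros = allocate + zero-fill kernel; torch.empty = allocate only.
zeroed = torch.zeros(num_tokens, top_k, dtype=torch.int32)
scratch = torch.empty(num_tokens, top_k, dtype=torch.int32)

# A consumer that writes every element before any read (here, a plain copy
# standing in for the trtllm-gen kernel) makes the two buffers equivalent,
# so the zero-fill is pure overhead.
scratch.copy_(zeroed)
assert torch.equal(scratch, zeroed)
```

The trade-off: `torch.empty` must never be used for a buffer the kernel reads or only partially writes, since its initial contents are arbitrary.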
