
Commit 918e62b

tweak bound_m
Signed-off-by: Bill Nell <[email protected]>
1 parent 800dde1 commit 918e62b

File tree: 1 file changed (+3 additions, -4 deletions)

vllm/model_executor/layers/fused_moe/pplx_dispatch_combine.py (3 additions, 4 deletions)

@@ -123,10 +123,9 @@ def combine(
         apply_router_weight_on_input: bool,
     ) -> None:
         # This argument is optional
-        #num_tokens = output.shape[0] # M
-        #bound_m = torch.tensor([num_tokens], dtype=torch.uint32,
-        #                       device=fused_expert_output.device)
-        bound_m = None
+        num_tokens = output.shape[0] # M
+        bound_m = torch.tensor([num_tokens], dtype=torch.uint32,
+                               device=fused_expert_output.device)

         assert output.shape[0] <= self.max_num_tokens
         assert output.shape[1] == fused_expert_output.shape[-1]
