Commit 800dde1

varun's fixes
Signed-off-by: Bill Nell <[email protected]>
1 parent 3433b73 commit 800dde1

1 file changed: +4 -3 lines changed

vllm/model_executor/layers/fused_moe/pplx_dispatch_combine.py

Lines changed: 4 additions & 3 deletions
@@ -123,9 +123,10 @@ def combine(
         apply_router_weight_on_input: bool,
     ) -> None:
         # This argument is optional
-        num_tokens = output.shape[0] # M
-        bound_m = torch.tensor([num_tokens], dtype=torch.uint32,
-                               device=fused_expert_output.device)
+        #num_tokens = output.shape[0] # M
+        #bound_m = torch.tensor([num_tokens], dtype=torch.uint32,
+        #                       device=fused_expert_output.device)
+        bound_m = None
 
         assert output.shape[0] <= self.max_num_tokens
         assert output.shape[1] == fused_expert_output.shape[-1]
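
The change replaces an explicit bound_m tensor with None, relying on the comment that the argument is optional. The sketch below illustrates why the two forms could be equivalent when the bound equals the number of output rows. It is a pure-Python stand-in, not the pplx kernel API: combine_rows is a hypothetical helper, and the assumption that bound_m caps how many leading rows are processed is not confirmed by the diff.

```python
from typing import List, Optional


def combine_rows(
    output: List[float],
    expert_output: List[float],
    bound_m: Optional[int] = None,
) -> None:
    """Hypothetical stand-in for a combine step (not the real kernel call).

    Assumption: an explicit bound limits how many leading rows are written,
    while None means "no explicit bound, process every row".
    """
    limit = len(output) if bound_m is None else min(bound_m, len(output))
    for i in range(limit):
        output[i] = expert_output[i]


# Before the commit: an explicit bound equal to the number of output rows.
out_bounded = [0.0] * 3
combine_rows(out_bounded, [1.0, 2.0, 3.0], bound_m=len(out_bounded))

# After the commit: the optional bound is passed as None.
out_unbounded = [0.0] * 3
combine_rows(out_unbounded, [1.0, 2.0, 3.0], bound_m=None)

# A smaller bound would leave the trailing rows untouched.
out_partial = [0.0] * 3
combine_rows(out_partial, [1.0, 2.0, 3.0], bound_m=2)
```

Under that assumption, a bound equal to output.shape[0] adds nothing beyond what the shape already implies, so passing None also avoids allocating a small tensor on every call.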
