Commit 69cc99d

Add restriction conditions to the ApplyTopPTopK operator (#3254)
### What this PR does / why we need it?
Add restriction conditions to the ApplyTopPTopK operator: 1 <= K <= 1024

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main: vllm-project/vllm@releases/v0.11.0

---------

Signed-off-by: SunnyLee219 <[email protected]>
1 parent 0654868 commit 69cc99d

1 file changed: +2 -1 lines changed

vllm_ascend/sample/sampler.py

Lines changed: 2 additions & 1 deletion
@@ -29,7 +29,8 @@ def _apply_top_k_top_p(
     p: torch.Tensor,
 ) -> torch.Tensor:
     # npu_top_k_top_p uses the operator aclnnApplyTopKTopP, but aclnnApplyTopKTopP currently does not support 310P
-    if not is_310p() and p is not None and k is not None:
+    if not is_310p() and p is not None and k is not None and 1 <= int(
+            k.max()) <= 1024:
         # npu_top_k_top_p's parameter order is (logits, p, k), not (logits, k, p)
         return torch_npu.npu_top_k_top_p(logits, p, k)
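
For illustration only, the new guard can be read as a standalone predicate. The sketch below is not part of the commit: `can_use_fused_top_k_top_p`, `MIN_FUSED_K`, and `MAX_FUSED_K` are hypothetical names, and plain Python sequences stand in for the `torch.Tensor` arguments so the example runs without `torch` or `torch_npu`. It assumes `k` is non-empty whenever it is not `None`.

```python
from typing import Optional, Sequence

# Bounds taken from the commit description: 1 <= K <= 1024.
MIN_FUSED_K = 1
MAX_FUSED_K = 1024


def can_use_fused_top_k_top_p(
    k: Optional[Sequence[int]],
    p: Optional[Sequence[float]],
    on_310p: bool,
) -> bool:
    """Hypothetical helper mirroring the condition added in the diff.

    Returns True only when the fused operator path would be taken:
    the device is not 310P, both k and p are provided, and the largest
    k in the batch lies in [1, 1024].
    """
    if on_310p or p is None or k is None:
        return False
    # Like the diff, only the batch-wide maximum of k is checked.
    return MIN_FUSED_K <= int(max(k)) <= MAX_FUSED_K


# Usage: a batch whose largest k is within range takes the fused path.
print(can_use_fused_top_k_top_p([50, 1024], [0.9, 0.95], on_310p=False))
# A batch containing k = 2048 is outside the range, so the guard fails
# and the condition in the diff is not met.
print(can_use_fused_top_k_top_p([50, 2048], [0.9, 0.95], on_310p=False))
```

As in the diff, the predicate inspects only the maximum of `k`, so it bounds the largest value in the batch and does not separately check the smallest one.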
