bugfix: Verify num_experts is greater than or equal to local_experts + offset (#1469)
## 📌 Description
Verify that `num_experts >= local_num_experts + local_expert_offset` to
avoid an illegal memory access.
Currently, calling `fused_moe.trtllm_fp8_per_tensor_scale_moe` with
`local_num_experts + local_expert_offset > num_experts` results in a
`CUDA: Illegal memory access` error.
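
The condition above can be sketched as a standalone argument check. This is only an illustration of the inequality, not the code added by this commit: the helper name `check_expert_range` and the choice of `ValueError` are assumptions, and only the three parameter names come from the description.

```python
def check_expert_range(
    num_experts: int, local_num_experts: int, local_expert_offset: int
) -> None:
    """Hypothetical helper: reject a local expert range that exceeds num_experts.

    The local experts cover the half-open index range
    [local_expert_offset, local_expert_offset + local_num_experts).
    If that range ends past num_experts, a kernel indexing per-expert
    data could read out of bounds, so fail early on the host instead.
    """
    if num_experts < local_num_experts + local_expert_offset:
        raise ValueError(
            f"num_experts ({num_experts}) must be >= local_num_experts "
            f"({local_num_experts}) + local_expert_offset ({local_expert_offset})"
        )


# 8 experts in total, this rank owns experts 4..7: valid.
check_expert_range(num_experts=8, local_num_experts=4, local_expert_offset=4)

# Offset 5 would place the last local expert at index 8: rejected.
try:
    check_expert_range(num_experts=8, local_num_experts=4, local_expert_offset=5)
except ValueError as err:
    print(f"rejected: {err}")
```

Failing on the host with a descriptive message is easier to diagnose than an asynchronous CUDA error that surfaces later.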
Signed-off-by: Amir Klein <[email protected]>