4 changes: 2 additions & 2 deletions vllm/_custom_ops.py
@@ -13,7 +13,7 @@
 logger = init_logger(__name__)

 current_platform.import_core_kernels()
-supports_moe_ops = current_platform.try_import_moe_kernels()
+current_platform.import_moe_kernels()

 if TYPE_CHECKING:

@@ -1921,7 +1921,7 @@ def moe_wna16_marlin_gemm(
 )


-if supports_moe_ops and hasattr(torch.ops._moe_C, "marlin_gemm_moe"):
+if hasattr(torch.ops, "_moe_C") and hasattr(torch.ops._moe_C, "marlin_gemm_moe"):
Contributor
Severity: high

This is a great change to make the check more robust against import failures of the `_moe_C` extension.

I noticed that similar, potentially unsafe checks exist for the `_C` extension throughout this file, for example `hasattr(torch.ops._C, "gptq_gemm")` on line 474. If `vllm._C` fails to import, `torch.ops._C` will not exist, and this check will raise an `AttributeError`.

It would be beneficial to apply the same two-step check pattern (`hasattr(torch.ops, "_C") and hasattr(torch.ops._C, "...")`) to all such occurrences for consistency and robustness. Here are the locations I found:

  • line 474
  • line 512
  • line 633
  • line 653
  • line 700
  • line 1330
  • line 2310
  • line 2324
  • line 2346
  • line 2373

Since this is a follow-up PR, addressing this in another follow-up would be appropriate.
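
To make the suggested fix concrete, here is a minimal sketch of the two-step guard applied to the line-474 example from the comment above (the `else` branch is illustrative, not code from this PR):

```python
import torch

# Step 1: check that the _C op namespace was registered at all;
# it is absent when the vllm._C extension failed to import.
# Step 2: only then probe the namespace for the specific op.
if hasattr(torch.ops, "_C") and hasattr(torch.ops._C, "gptq_gemm"):
    # Safe to reference torch.ops._C.gptq_gemm here.
    pass
else:
    # Extension or op unavailable: skip registration or fall back.
    pass
```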


 @register_fake("_moe_C::marlin_gemm_moe")
 def marlin_gemm_moe_fake(
5 changes: 1 addition & 4 deletions vllm/platforms/interface.py
@@ -175,14 +175,11 @@ def import_core_kernels(cls) -> None:
             logger.warning("Failed to import from vllm._C: %r", e)

     @classmethod
-    def try_import_moe_kernels(cls) -> bool:
+    def import_moe_kernels(cls) -> None:
         """Import any platform-specific MoE kernels."""
         with contextlib.suppress(ImportError):
             import vllm._moe_C  # noqa: F401
-
-            return True
-        return False

     @classmethod
     def get_vit_attn_backend(cls, head_size: int, dtype: torch.dtype) -> "_Backend":
         from vllm.attention.backends.registry import _Backend
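
For context on the simplified helper above: `contextlib.suppress` turns the import into a best-effort operation by silently discarding the named exception, which is why the `True`/`False` return values could be dropped. A standalone sketch of the same pattern (not vLLM's exact code):

```python
import contextlib

def import_moe_kernels() -> None:
    """Best-effort import of the optional MoE extension.

    If vllm._moe_C is missing (e.g. a build without MoE kernels),
    the ImportError is swallowed and execution continues; callers
    now probe torch.ops for the ops instead of checking a flag.
    """
    with contextlib.suppress(ImportError):
        import vllm._moe_C  # noqa: F401
```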