[Feature]: Make sharing of activation_type across SW layers more robust #9236

@nzmora-nvidia

Description

🚀 The feature, motivation and pitch

The C++ code, the Python code, and the Python MoE layer all share the definition of ActivationType.
Currently each one redefines the enum separately, which is fragile: adding a new activation function type in one place can silently break the others.

tensorrt_llm/_torch/utils.py
cpp/tensorrt_llm/kernels/cutlass_kernels/include/common.h
=>
tensorrt_llm/layers/moe.py
cpp/tensorrt_llm/kernels/cutlass_kernels/moe_gemm/moe_kernels.cu
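
One possible way to catch this kind of drift, sketched here only as an illustration and not as the project's chosen fix, is a small consistency check that parses the C++ enum and compares it with the Python one. Everything in the sample is an assumption: the header text in `CPP_HEADER`, the member names and values, and the helpers `parse_cpp_enum` and `enum_mismatches` are hypothetical stand-ins, not the contents of the files listed above.

```python
import re
from enum import IntEnum

# Hypothetical stand-in for the C++ definition; the real header may differ.
CPP_HEADER = """
enum class ActivationType
{
    Gelu = 0,
    Relu,
    Silu,
    Swiglu, // gated activation
    Identity = 7,
};
"""


class ActivationType(IntEnum):
    # Hypothetical Python-side redefinition of the same enum.
    Gelu = 0
    Relu = 1
    Silu = 2
    Swiglu = 3
    Identity = 7


def parse_cpp_enum(source, name):
    """Return {member_name: value} for a simple C++ enum with integer literals."""
    pattern = r"enum\s+(?:class\s+)?" + re.escape(name) + r"\b[^{]*\{(.*?)\}"
    match = re.search(pattern, source, re.S)
    if match is None:
        raise ValueError("enum %s not found" % name)
    # Drop // and /* */ comments before splitting the body into entries.
    body = re.sub(r"//[^\n]*|/\*.*?\*/", "", match.group(1), flags=re.S)
    members = {}
    next_value = 0
    for entry in body.split(","):
        entry = entry.strip()
        if not entry:
            continue
        ident, _, explicit = entry.partition("=")
        if explicit.strip():
            # An explicit initializer resets the running counter.
            next_value = int(explicit.strip(), 0)
        members[ident.strip()] = next_value
        next_value += 1
    return members


def enum_mismatches(cpp_members, py_enum):
    """Return {name: (cpp_value, py_value)} for every member that disagrees."""
    py_members = {m.name: int(m.value) for m in py_enum}
    names = sorted(set(cpp_members) | set(py_members))
    return {
        n: (cpp_members.get(n), py_members.get(n))
        for n in names
        if cpp_members.get(n) != py_members.get(n)
    }
```

For the matching sample above, `enum_mismatches(parse_cpp_enum(CPP_HEADER, "ActivationType"), ActivationType)` returns an empty dict. A check like this only detects drift; generating all definitions from a single source would remove the duplication itself. The parser handles only plain integer initializers, not expressions or references to other members.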

Alternatives

No response

Additional context

No response

Metadata

Assignees

Labels

feature request: New feature or request. This includes new model, dtype, functionality support

Projects

Status

Done

Milestone

No milestone
