[None][doc] Add MoE developer guide for fused_moe module #12534

Open
xxi-nv wants to merge 2 commits into NVIDIA:main from xxi-nv:add-moe-developer-guide

@xxi-nv
Collaborator

@xxi-nv xxi-nv commented Mar 25, 2026

Summary

  • Add MOE_DEVELOPER_GUIDE.md in fused_moe/ covering architecture, backends, quantization support matrix, communication, canonical examples, and anti-patterns
  • Add pointer in AGENTS.md Key Files table to guide AI agents to read the doc before modifying MoE code

Test plan

  • Verify markdown renders correctly on GitHub
  • Verify quantization matrix matches can_implement methods in each backend

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive developer guide for Mixture-of-Experts (MoE) components, providing architecture details, backend specifications, and development guidelines.
    • Updated key files reference documentation to include the new MoE guide.

Add MOE_DEVELOPER_GUIDE.md covering architecture, backends, quantization
support matrix, communication, canonical examples, and anti-patterns.
Add AGENTS.md pointer to guide AI agents to read the doc before modifying
MoE code.

Signed-off-by: xxi <xxi@nvidia.com>
@xxi-nv xxi-nv requested review from a team as code owners March 25, 2026 09:25
@xxi-nv xxi-nv requested review from QiJune and nv-guomingz March 25, 2026 09:25
@xxi-nv xxi-nv requested review from kaiyux and leslie-fang25 March 25, 2026 09:26
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

A new documentation file MOE_DEVELOPER_GUIDE.md is added describing Mixture-of-Experts architecture, backends, communication strategies, and development patterns. AGENTS.md is updated to reference this guide in its Key Files table. No code logic, configuration, or public APIs are modified.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation References: AGENTS.md | Added reference to new MoE developer guide in the Key Files table. |
| New MoE Developer Guide: tensorrt_llm/_torch/modules/fused_moe/MOE_DEVELOPER_GUIDE.md | New comprehensive documentation describing MoE fused-layer architecture, transition from old embedded-communication backends (e.g., XXFusedMoE, CutlassFusedMoE) to ConfigurableMoE orchestrator model, component responsibilities, execution flow, supported backends matrix, implementation guidelines for new backends/quantization/communication strategies, anti-patterns, and test locations. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title clearly and specifically describes the main change: adding a MoE developer guide to the fused_moe module, following the required format with [None][doc] prefix. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The PR description provides a clear summary of changes and includes a test plan section with verification items, though it lacks the full structure of the required template. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tensorrt_llm/_torch/modules/fused_moe/MOE_DEVELOPER_GUIDE.md`:
- Line 7: The three fenced code blocks in MOE_DEVELOPER_GUIDE.md that currently
start with plain ``` should be annotated with a language tag (e.g., ```text) to
satisfy markdownlint MD040; locate the blocks that contain the diagrams starting
with "Input Hidden States", "ConfigurableMoE", and "routing() → [EPLB]" and
replace their opening fences with ```text so each fenced block includes the
language identifier.
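The MD040 fix requested above can be sketched in a few lines of Python. This is only an illustration of the idea, not the change made in this PR: the helper name `tag_bare_fences` is hypothetical, it handles backtick fences only (not `~~~` fences), and it assumes every bare opening fence should receive the same `text` tag.

```python
import re

# An opening or closing backtick fence: up to three spaces of indent,
# three or more backticks, then an optional info string.
FENCE = re.compile(r"^( {0,3})(`{3,})(.*)$")


def tag_bare_fences(markdown: str, lang: str = "text") -> str:
    """Add `lang` to every opening fence that has no language identifier.

    Hypothetical helper for illustration; not part of markdownlint or this PR.
    """
    out = []
    open_len = 0  # 0 means we are currently outside any fenced block
    for line in markdown.split("\n"):
        match = FENCE.match(line)
        if match:
            indent, ticks, info = match.groups()
            if open_len == 0:
                # Opening fence: remember its length, tag it if it is bare.
                open_len = len(ticks)
                if not info.strip():
                    line = f"{indent}{ticks}{lang}"
            elif len(ticks) >= open_len and not info.strip():
                # Closing fence: at least as long as the opener, no info string.
                open_len = 0
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime so this example contains no literal fence
    sample = f"{fence}\nInput Hidden States\n{fence}\n"
    print(tag_bare_fences(sample))
```

Closing fences are left untouched, and blocks that already carry a language identifier are not modified, so running the helper twice gives the same result as running it once.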

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8fbe2798-4698-450e-91a2-31bc2e00dfe0

📥 Commits

Reviewing files that changed from the base of the PR and between 2b5c434 and 83a3623.

📒 Files selected for processing (2)
  • AGENTS.md
  • tensorrt_llm/_torch/modules/fused_moe/MOE_DEVELOPER_GUIDE.md

…GUIDE.md

Add ```text language identifier to three diagram code blocks to satisfy
markdownlint MD040.

Signed-off-by: xxi <xxi@nvidia.com>
@xxi-nv
Collaborator Author

xxi-nv commented Mar 25, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #40320 [ run ] triggered by Bot. Commit: 3311834 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #40320 [ run ] completed with state SUCCESS. Commit: 3311834
/LLM/main/L0_MergeRequest_PR pipeline #31430 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@zhaoyangwang-nvidia
Collaborator

/bot run

@tensorrt-cicd
Collaborator

PR_Github #40492 [ run ] triggered by Bot. Commit: 3311834 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #40492 [ run ] completed with state SUCCESS. Commit: 3311834
/LLM/main/L0_MergeRequest_PR pipeline #31580 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

4 participants