[None][fix] AutoDeploy: Use tmp folder for the load_moe_align #9101

Merged
nvchenghaoz merged 1 commit into NVIDIA:main from nv-auto-deploy:chenghao/moe_tmp_1112
Nov 12, 2025
Conversation

@nvchenghaoz (Collaborator) commented Nov 12, 2025

Summary by CodeRabbit

  • Chores
    • Updated build directory resolution to use system temporary directories, improving portability and reducing environment variable dependencies for the MOE custom operations build process.

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
@coderabbitai coderabbitai bot (Contributor) commented Nov 12, 2025

📝 Walkthrough

Walkthrough

Modified build directory resolution in the MOE alignment module loader to use system temporary directory instead of environment-dependent cache paths. Replaced CACHE_ROOT/build logic with tempfile.gettempdir(). Removed PermissionError fallback and updated module loader to pass the new BUILD_DIR.

Changes

Cohort / File(s): Build directory resolution refactoring — tensorrt_llm/_torch/auto_deploy/custom_ops/fused_moe/load_moe_align.py
Summary: Added tempfile import; replaced the environment-variable-dependent CACHE_ROOT/build logic with tempfile.gettempdir() for BUILD_DIR construction; removed the PermissionError fallback; the directory is now created unconditionally and BUILD_DIR is passed as the build_directory parameter to the module loader.
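The change described above can be sketched roughly as follows. This is an illustration based on the summary, not the PR diff itself; the exact subpath names ("auto_deploy", "fused_moe", "moe_align") are assumptions.

```python
import os
import tempfile

# Before (roughly): an environment-dependent cache root, e.g. something like
#   cache_root = os.environ.get("AD_CACHE_DIR", os.path.expanduser("~/.cache"))
#   build_dir = os.path.join(cache_root, "build", ...)
# with a PermissionError fallback if the cache root was not writable.

# After: resolve the build directory under the system temp directory.
BUILD_DIR = os.path.join(tempfile.gettempdir(), "auto_deploy", "fused_moe", "moe_align")
os.makedirs(BUILD_DIR, exist_ok=True)  # created unconditionally; no PermissionError fallback
```

BUILD_DIR would then be passed as the build_directory argument to the module loader (e.g. torch.utils.cpp_extension.load), so compiled artifacts land under the system temp directory rather than a per-user cache path.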

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify implications of switching from cache directory to system temp directory (cleanup, persistence, availability)
  • Confirm tempfile.gettempdir() behavior is consistent across supported platforms
  • Ensure unconditional directory creation doesn't introduce permission or collision issues
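As a quick illustration of the second and third bullets (hypothetical demo, not code from the PR): tempfile.gettempdir() resolves per-platform (honoring TMPDIR/TEMP/TMP), and unconditional creation via os.makedirs(..., exist_ok=True) is idempotent, though it does not guard against a pre-existing regular file of the same name or a directory owned by another user.

```python
import os
import tempfile

tmp = tempfile.gettempdir()  # e.g. /tmp on Linux, %TEMP% on Windows
build_dir = os.path.join(tmp, "moe_align_demo")  # hypothetical subpath for illustration

# Idempotent: repeating the call raises no error when the directory exists.
os.makedirs(build_dir, exist_ok=True)
os.makedirs(build_dir, exist_ok=True)
```

If a file (not a directory) already occupies build_dir, os.makedirs raises FileExistsError even with exist_ok=True, which is one concrete collision mode the reviewer's bullet points at.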

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Description check — ⚠️ Warning. The PR description is incomplete and contains only a template directive (@coderabbitai summary) with no actual description, test coverage information, or checklist completion. Resolution: complete the PR description by filling in the Description and Test Coverage sections, and confirm the PR checklist items have been reviewed.
✅ Passed checks (2 passed)
Title check — ✅ Passed. The title clearly describes the main change: using a temporary folder for the load_moe_align module in AutoDeploy, directly matching the code changes shown in the summary.
Docstring Coverage — ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a751a0 and f5eceba.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/auto_deploy/custom_ops/fused_moe/load_moe_align.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use only spaces, no tabs; indent with 4 spaces.

Files:

  • tensorrt_llm/_torch/auto_deploy/custom_ops/fused_moe/load_moe_align.py
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.

Files:

  • tensorrt_llm/_torch/auto_deploy/custom_ops/fused_moe/load_moe_align.py
**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).

Files:

  • tensorrt_llm/_torch/auto_deploy/custom_ops/fused_moe/load_moe_align.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: djns99
Repo: NVIDIA/TensorRT-LLM PR: 6915
File: cpp/tensorrt_llm/kernels/cutlass_kernels/moe_gemm/moe_kernels.cu:4010-4012
Timestamp: 2025-08-14T23:23:27.449Z
Learning: For MOE (Mixture of Experts) code reviews in TensorRT-LLM, avoid repeatedly suggesting finalize fusion validation checks and safety assertions. The user djns99 has indicated these suggestions are repetitive and unwanted across multiple MOE-related changes.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (2)
tensorrt_llm/_torch/auto_deploy/custom_ops/fused_moe/load_moe_align.py (2)

7-7: LGTM!

The tempfile import is necessary for the new system temp directory approach.


17-19: Verify security and isolation for multi-user/multi-process scenarios.

Using tempfile.gettempdir() improves portability but introduces potential concerns:

  1. Security: System temp directories are typically world-readable on multi-user systems, allowing other users to see or potentially tamper with compiled extensions.
  2. Isolation: No user-specific or version-specific isolation in the path. Multiple users or processes building different versions concurrently could conflict.
  3. Build collisions: PyTorch's load function may handle concurrent builds, but this should be confirmed.

The previous CACHE_ROOT approach likely provided better user isolation. Please verify that:

  • The deployment environment is single-user or properly isolated
  • Concurrent builds from multiple processes/users won't cause issues
  • Security implications of world-readable compiled extensions are acceptable

Consider adding user or process isolation to the path:

import getpass
import os
import tempfile

BUILD_DIR = os.path.join(
    tempfile.gettempdir(),
    f"ad_cache_{getpass.getuser()}",
    "auto_deploy",
    "fused_moe",
    "moe_align",
)


@nvchenghaoz (Collaborator, Author) commented:

/bot run

@tensorrt-cicd (Collaborator) commented:

PR_Github #24332 [ run ] triggered by Bot. Commit: f5eceba

@yibinl-nvidia (Collaborator) commented:

Thank you @nvchenghaoz for making this change to strengthen the TRTLLM security posture.

@Fridah-nv Fridah-nv left a comment:

LGTM, closes #9017

@github-project-automation github-project-automation bot moved this from Backlog to In review in AutoDeploy Board Nov 12, 2025
@nvchenghaoz nvchenghaoz enabled auto-merge (squash) November 12, 2025 22:56
@tensorrt-cicd (Collaborator) commented:

PR_Github #24332 [ run ] completed with state SUCCESS. Commit: f5eceba
/LLM/main/L0_MergeRequest_PR pipeline #18361 completed with status: 'SUCCESS'
Pipeline passed with automatically retried tests. Check the rerun report for details.

@nvchenghaoz nvchenghaoz merged commit f1d637e into NVIDIA:main Nov 12, 2025
7 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in AutoDeploy Board Nov 12, 2025
zheyuf pushed a commit to zheyuf/TensorRT-LLM that referenced this pull request Nov 19, 2025
…#9101)

Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

[Bug]: AutoDeploy: avoid aot compilation of auto_deploy kernels unless kernel is needed

4 participants