Deprecate ModelOpt custom docker and directly use TRT-LLM / PyTorch / TRT docker #346
Conversation
Walkthrough
Removed repository Docker artifacts and ModelOpt Docker image references; updated installation docs to prefer TensorRT-LLM/TensorRT images and local virtualenvs; refactored GitLab CI to tag/manual gating and reorganized GPU/example jobs; moved GitHub Actions env setup into runtime steps; small example/test fixes (JSON list wrapping, removed one sparsity test case); removed CODEOWNERS.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant GH as GitHub Actions
    participant Runner as Container
    participant Setup as "Setup environment variables" step
    participant Tests as "Run example tests"
    GH->>Runner: start example-tests-pr job
    Note over Runner: container env no longer sets LD_LIBRARY_PATH/PATH
    GH->>Setup: run "Setup environment variables" step
    Setup-->>GH: append LD_LIBRARY_PATH and PATH to $GITHUB_ENV
    GH->>Tests: run example tests with updated env
    Tests-->>GH: return results
```
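For reference, a minimal sketch of what such a runtime step could look like; the step name matches the diagram, but the exact paths are assumptions based on the TensorRT locations discussed later in this review:

```yaml
# Hypothetical GitHub Actions step appending env vars at runtime via $GITHUB_ENV;
# paths are illustrative, not the PR's exact values.
- name: Setup environment variables
  run: |
    echo "LD_LIBRARY_PATH=/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:$LD_LIBRARY_PATH" >> "$GITHUB_ENV"
    echo "PATH=/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH" >> "$GITHUB_ENV"
```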
```mermaid
sequenceDiagram
    autonumber
    participant GL as GitLab CI
    participant Rules as Rules (tag/manual)
    participant Multi as .multi-gpu-tests-default
    participant Example as example-torch / example-trtllm / example-onnx
    GL->>Rules: evaluate rules (tag =~ /^\d+\.\d+\.\d$/ or manual)
    alt matched
        GL->>Multi: start multi-GPU default (before_script sets env, installs git-lfs)
        Multi->>Example: spawn example jobs (parallel matrices)
        Example-->>GL: job results (some allow_failure)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (13)
CONTRIBUTING.md (1)
14-14: LGTM. Link-only replacement is fine. Optional: add a short hint about using the TensorRT-LLM release images as the recommended container baseline to keep this consistent with README.
examples/diffusers/README.md (4)
15-15: Fix typo: “cahce” → “cache”. Apply:

```diff
-| Support Matrix | View the support matrix to see quantization/cahce diffusion compatibility and feature availability across different models | [[Link](#support-matrix)] | [[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/1_quantization.html)] |
+| Support Matrix | View the support matrix to see quantization/cache diffusion compatibility and feature availability across different models | [[Link](#support-matrix)] | [[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/1_quantization.html)] |
```
374-375: Grammar: “achieve” → “achieved”. Also consider “is achieved” → “is achieved by”.

```diff
- Significant speedup is achieve through skipping certain blocks at the specific steps.
+ Significant speedup is achieved by skipping certain blocks at specific steps.
```
379-379: Nit: trailing double period.

```diff
-... would..
+... would.
```
480-480: Fix typo: “requirments.txt” → “requirements.txt”.

```diff
-pip install -r eval/requirments.txt
+pip install -r eval/requirements.txt
```

CHANGELOG.rst (1)
4-4: Finalize release date before publishing 0.37. Replace “2025-09-xx” with the actual date (e.g., 2025-09-19) at release cut.
examples/llm_qat/README.md (2)
32-33: LGTM; optional clarifications.
- Consider adding a short note that NVIDIA NGC “tensorrt-llm/release:” images are the recommended base with ModelOpt included.
- Link to the exact section in installation docs that covers env PATH/LD_LIBRARY_PATH inside containers.
314-314: Naming consistency: use “TensorRT-LLM” (or TRT-LLM) consistently. “TRTLLM” (no dash) appears here; prefer “TRT-LLM”.

```diff
-# To run QAT model with TRTLLM, run:
+# To run QAT model with TRT-LLM, run:
```

examples/llm_ptq/README.md (1)
35-36: LGTM; mirrors QAT README update. Optional: add a one-liner pointing to recommended NGC image (tensorrt-llm/release:).
tests/examples/README.md (1)
12-18: LGTM; add a note on auth to avoid rate limits. Consider mentioning HF auth (HF_TOKEN/HF_HOME) for tests that pull from Hugging Face to reduce flakiness.
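A minimal sketch of what such a note could translate to in CI config; the variable names follow Hugging Face conventions (HF_TOKEN, HF_HOME), while the job name and cache path are assumptions:

```yaml
# Hypothetical GitLab CI fragment; HF_TOKEN is expected to come from a masked
# CI/CD variable, and the HF_HOME cache location is illustrative only.
example:
  variables:
    HF_HOME: "$CI_PROJECT_DIR/.cache/huggingface"  # reuse downloads across jobs
  cache:
    paths:
      - .cache/huggingface
  script:
    - test -n "$HF_TOKEN" || echo "HF_TOKEN not set; anonymous pulls may be rate-limited"
```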
examples/onnx_ptq/README.md (1)
27-27: Clarify example-scoped Dockerfile and update README.
examples/onnx_ptq/docker/Dockerfile exists; README (examples/onnx_ptq/README.md lines 25–33) still links ./docker/Dockerfile and invokes ./docker/build.sh. Update the README to explicitly state this Dockerfile is specific to the ONNX PTQ example (not repo-wide); if repo-level Docker is deprecated, replace these instructions with the recommended NGC image guidance or document the deprecation.
docs/source/getting_started/_installation_for_Linux.rst (2)
48-48: Minor: wording improvement needed. The phrase "Alternative NVIDIA docker images" could be clearer.

```diff
- **Alternative NVIDIA docker images**
+ **Alternative NVIDIA Docker images**
```
89-92: Verify consistency with docker image deprecation. The text mentions "If you build and use ModelOpt's docker image" but the PR is deprecating the custom ModelOpt docker. This section should be updated to reflect the new reality.

```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the suggested TensorRT-LLM or other NVIDIA docker images, ModelOpt may be pre-installed with some of the below optional dependencies.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- .github/CODEOWNERS (0 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
💤 Files with no reviewable changes (4)
- .github/CODEOWNERS
- docker/build.sh
- docker/README.md
- docker/Dockerfile
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-16T20:14:34.768Z
Learnt from: realAsma
PR: NVIDIA/TensorRT-Model-Optimizer#318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:191-191
Timestamp: 2025-09-16T20:14:34.768Z
Learning: The TensorRT-Model-Optimizer project only supports PyTorch >= 2.6, so using the `weights_only` parameter in torch.load calls is acceptable and doesn't require backward compatibility handling.
Applied to files:
CHANGELOG.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (14)
examples/diffusers/README.md (1)
40-40: LGTM. Matches repo-wide guidance to centralize install steps in the installation docs.
README.md (2)
64-64: LGTM. Using “-U” to upgrade is a helpful tweak.
67-67: LGTM. Wording is clearer (“latest features”).
examples/llm_ptq/README.md (1)
127-135: Staleness check on footnote 5 — no update required (keep "TensorRT‑LLM v0.17 or later"). v0.17.0 introduced Blackwell and NVFP4 support; subsequent 0.x releases exist and no v1.x TensorRT‑LLM release was found — optionally change to "v0.17.0 or later" for precision. (github.com)
CHANGELOG.rst (1)
9-9: Align TensorRT-LLM version strings across the repo. Mixed references to "1.1.0rc2" and "1.1.0rc2.post2" were found (CHANGELOG.rst New Features, README/docs table, example scripts/notebooks using nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2 — e.g. examples/llm_ptq scripts/huggingface_example.sh, plus CI/version checks). Choose a single canonical tag (match CI/packaging), update version checks and docker image tags/docs, and re-run a grep to confirm no remaining mismatches.
docs/source/getting_started/_installation_for_Linux.rst (3)
5-26: LGTM! Great addition of system requirements table. The new table provides clear guidance on supported OS, architecture, Python versions, and dependencies. This aligns well with the shift to TensorRT-LLM docker images.
33-37: LGTM! Clear migration to TensorRT-LLM docker image. The updated guidance properly directs users to the TensorRT-LLM docker image instead of a custom ModelOpt docker, which aligns with the PR objectives to deprecate the custom dockerfile.
54-55: LGTM! Clear guidance for ONNX PTQ users. The reference to the onnx_ptq Dockerfile and mention of the latest TensorRT version provides helpful context for users who need ONNX PTQ functionality.
.gitlab/tests.yml (6)
1-4: LGTM! Proper addition of PIP_CONSTRAINT variable. The addition of the `PIP_CONSTRAINT: ""` variable to disable pip constraints aligns with the upgrade-focused workflow mentioned in the PR changes.
6-8: LGTM! Simplified rules configuration. The simplified rules structure is cleaner and maintains the same functionality: scheduled pipelines run automatically, others are manual.
27-29: LGTM! Job rename and timeout adjustment. The rename from `gpu` to `multi-gpu` is more descriptive, and the increased timeout from 60m to 90m provides better reliability for multi-GPU tests.
44-48: LGTM! Migration to TensorRT-LLM docker image. The switch to `nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2` aligns perfectly with the PR objectives to use TensorRT-LLM docker instead of custom ModelOpt docker. The environment variables properly set up TensorRT paths.
55-60: LGTM! Sophisticated failure handling logic. The new rules configuration provides granular control over test failures, allowing failures only for specific examples when scheduled, which is a significant improvement over the previous blanket `allow_failure` approach.
50-50: Verify EXAMPLE_ALLOWED_FAILURES is defined or fix the substitution. .gitlab/tests.yml (≈line 50): `ALLOWED_FAILURES: ${EXAMPLE_ALLOWED_FAILURES:llm_qat}`
- Repo search returned no matches for EXAMPLE_ALLOWED_FAILURES — either add this key to your GitLab CI/CD variables (project/group/instance) or define a default in the repo.
- The expression `${EXAMPLE_ALLOWED_FAILURES:llm_qat}` looks incorrect for a default and may not perform bash-style fallback in GitLab/YAML; move default logic into the job script or predefine the variable. (devgem.io)
Force-pushed from fabbe02 to 70c4652
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Additional details and impacted files:

```diff
@@           Coverage Diff            @@
##             main     #346    +/-  ##
=======================================
  Coverage   73.83%   73.84%
=======================================
  Files         172      172
  Lines       17453    17453
=======================================
+ Hits        12887    12888     +1
+ Misses       4566     4565     -1
```

☔ View full report in Codecov by Sentry.
Force-pushed from 70c4652 to 613df8d
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
43-44: Add aarch64 (ARM64) TensorRT env var examples or mark the snippet as x86_64-only. The TensorRT installation script detects architecture with `if [ "$ARCH" = "aarch64" ];then ARCH="sbsa";fi`, indicating that aarch64 systems use "sbsa" in TensorRT paths. The documentation shows x86_64-only paths but should either include aarch64 equivalents or explicitly state it's x86_64-only.
Apply this diff to add aarch64 examples:

```diff
 You would also need to setup appropriate environment variables for the TensorRT binaries as follows:

 .. code-block:: shell

+    # For x86_64 systems:
     export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
     export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+
+    # For aarch64 systems:
+    export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
💤 Files with no reviewable changes (4)
- docker/README.md
- .github/CODEOWNERS
- docker/build.sh
- docker/Dockerfile
✅ Files skipped from review due to trivial changes (1)
- examples/llm_ptq/README.md
🚧 Files skipped from review as they are similar to previous changes (7)
- CONTRIBUTING.md
- examples/diffusers/README.md
- examples/llm_qat/README.md
- CHANGELOG.rst
- examples/onnx_ptq/README.md
- tests/examples/README.md
- README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (6)
docs/source/getting_started/_installation_for_Linux.rst (2)
33-37: LGTM! Clear migration from custom to TensorRT-LLM Docker. The documentation effectively communicates the shift from ModelOpt's custom Docker to the standardized TensorRT-LLM image, which aligns with the PR objectives.
54-55: ONNX PTQ Dockerfile reference verified. examples/onnx_ptq/docker/Dockerfile is present; no changes needed.
.gitlab/tests.yml (4)
3-4: LGTM! Proper pip constraint handling. Adding `PIP_CONSTRAINT: ""` allows package upgrades without constraint conflicts, which is appropriate for testing the latest ModelOpt version.
55-57: Environment variables properly configured for GitLab CI. The environment setup correctly mirrors the GitHub workflow changes, ensuring consistent TensorRT paths across both CI systems.
65-71: Good isolation of failing tests. Moving `llm_qat` to a separate `allow_failure: true` job prevents it from blocking the main pipeline while tracking the issue for future resolution.
28-28: Job renamed appropriately for clarity. The rename from `gpu` to `multi-gpu` better reflects the job's purpose of running tests on 2-GPU configurations.
Force-pushed from 613df8d to 39b57bc
Force-pushed from aa9108b to e8b201f
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
39-44: Architecture-specific environment variables need clarification. This is a known issue from previous reviews. The environment variable examples are x86_64-specific, but the system requirements table (line 13) indicates support for both x86_64 and aarch64 (SBSA) architectures.
🧹 Nitpick comments (2)
docs/source/getting_started/_installation_for_Linux.rst (1)
89-92: Verify consistency with docker deprecation strategy. The text still references "ModelOpt's docker image" (line 89) and suggests building it, which contradicts the PR's objective of deprecating the custom ModelOpt dockerfile. This language should be updated to reflect that the custom dockerfile is no longer recommended. Based on the search results, TensorRT-LLM docker images do include ModelOpt as a dependency; the text should be updated to reflect the deprecation of the custom ModelOpt dockerfile. Apply this diff to update the text:

```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the recommended TensorRT-LLM docker image or other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
```

.gitlab/tests.yml (1)
8-9: Tag-based rule may need semantic versioning validation. The regex pattern `^\d+\.\d+\.\d+$` matches basic semantic versioning but doesn't account for pre-release tags (like `rc`, `alpha`, `beta`) that might be common in this project. Consider using a more comprehensive regex pattern that includes pre-release identifiers:

```diff
-    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
+    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+(-[a-zA-Z0-9\.\-]+)?$/
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- .github/CODEOWNERS
- docker/build.sh
- docker/README.md
- docker/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (9)
- examples/llm_ptq/README.md
- CHANGELOG.rst
- CONTRIBUTING.md
- examples/diffusers/README.md
- examples/onnx_ptq/README.md
- README.md
- tests/examples/README.md
- examples/llm_qat/README.md
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (12)
docs/source/getting_started/_installation_for_Linux.rst (3)
33-37: Clear documentation improvement aligning with PR objectives. The updated text clearly guides users toward the TensorRT-LLM docker image as the recommended approach, which aligns perfectly with the PR's goal of deprecating the custom ModelOpt dockerfile. The guidance to upgrade Model Optimizer if needed is practical and helpful.
48-48: Minor wording improvement enhances readability. The change from "Using alternative NVIDIA docker images" to "Alternative NVIDIA docker images" removes unnecessary verbosity and improves the section header clarity.
54-55: ONNX PTQ guidance provides valuable context. The updated text effectively explains the benefit of using the onnx_ptq Dockerfile: access to the latest TensorRT version with cutting-edge features. This helps users understand why they might choose this option over alternatives.
tests/examples/llm_qat/test_llm_qat.py (4)
39-45: Backend test coverage is properly adjusted for CI stability. The `xfail` markings align with the stable v1.1.0rc2.post2 release while maintaining coverage for working backends (fsdp2, deepspeed). This prevents CI failures while preserving test value for stable backends.
71-76: Consistent backend parameterization applied correctly. The same backend handling approach is consistently applied across both QAT test variants, maintaining test coverage while acknowledging known issues.
97-97: Explicit backend specification improves test reliability. Adding the `--backend fsdp2` parameter ensures the test runs with a stable backend configuration, improving test predictability.
102-102: Test failure handling improved with xfail approach. Changing from `skip` to `xfail` is beneficial as it still executes the test and captures the failure, providing better visibility into the actual issue versus simply skipping it.
.gitlab/tests.yml (5)
59-59: Installation command includes correct package extras. The `pip install ".[all,dev-test]"` command properly installs the package with all required extras for testing, which is appropriate for the CI environment.
1-1: Documentation reference verified — no action required. All three referenced workflows (.github/workflows/unit_tests.yml, .github/workflows/gpu_tests.yml, .github/workflows/example_tests.yml) exist and include the consistency note referencing .gitlab/tests.yml.
3-4: Verify and justify disabling pip constraints in CI. `PIP_CONSTRAINT: ""` disables pip constraints — this risks non-reproducible installs, dependency conflicts, and pulling unvetted/vulnerable package versions.
- Location: .gitlab/tests.yml (lines 3-4).
- If accidental: revert to the repository constraints value.
- If intentional: limit the change to a single job or guard it with an explicit opt-in env var and document why.
- Before merging: run full CI, reproduce installs in a clean environment, run dependency checks (pip check / pip-tools / pipdeptree) and a security audit (pip-audit or SCA) to confirm no resolution errors or CVEs; a sketch of such a verification job follows.
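A sketch of how the suggested checks could run in CI; the tool names (pip check, pip-audit) come from the list above, while the job and stage names are assumptions:

```yaml
# Hypothetical verification job: fails on dependency-resolution conflicts
# (pip check) or known CVEs in installed packages (pip-audit).
audit-deps:
  stage: tests
  script:
    - pip install ".[all,dev-test]"
    - pip check          # detect broken or conflicting requirements
    - pip install pip-audit
    - pip-audit          # scan the environment for known vulnerabilities
```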
46-46: Verify TensorRT-LLM Docker image availability and compatibility. .gitlab/tests.yml:46 — `image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2`.
Sandbox run failed (docker not available; manifest inspect failed). Manually verify the image is pullable and contains the runtime/tools required for ModelOpt tests.
Run locally:

```shell
docker manifest inspect nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
docker run --rm nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2 bash -c "python --version; ls -la /usr/local/tensorrt/targets/x86_64-linux-gnu/bin | head -5; echo \$LD_LIBRARY_PATH"
```

Checks: manifest returns success; Python present; TensorRT bin directory contains expected tools; LD_LIBRARY_PATH includes TensorRT libraries.
55-57: Verify TensorRT library paths exist in the CI/container. Verification failed: "Cannot verify paths (image may not be available locally)". Ensure the CI runner pulls/uses the expected image and verify these paths exist at runtime.
File: .gitlab/tests.yml, lines 55-57:

```yaml
- export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
# Add trtexec to path
- export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
```

Run locally or in CI to check:

```shell
docker pull nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
docker run --rm nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2 bash -c "\
  ls -la /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/ 2>/dev/null || echo 'TensorRT lib path not found'; \
  [ -x /usr/local/tensorrt/targets/x86_64-linux-gnu/bin/trtexec ] && echo 'trtexec found' || echo 'trtexec not found'; \
  ls -la /usr/lib/x86_64-linux-gnu/libcudnn* 2>/dev/null || echo 'cudnn not found' \
"
```

If the paths/tools are absent, update the job to pull the correct image, make the exports conditional (check for dirs/files before exporting), or detect the correct runtime paths and update .gitlab/tests.yml.
Actionable comments posted: 2
🧹 Nitpick comments (4)
.gitlab/tests.yml (4)
3-4: Don't globally disable pip constraints. Setting PIP_CONSTRAINT="" at the template level weakens reproducibility for all jobs and can unintentionally upgrade preinstalled libs in vendor images (e.g., TRT-LLM), risking ABI mismatches.
Apply this to scope the behavior only where needed (or drop it entirely if accidental):

```diff
 .tests-default:
-  variables:
-    PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+  variables: {}
@@
 example:
   extends: .tests-default
+  variables:
+    # If you intentionally want to bypass constraints only for examples, keep it scoped here.
+    PIP_CONSTRAINT: ""
```

If you do have a constraints file, prefer pinning instead:

```diff
-    - pip install ".[all,dev-test]"
+    - pip install -c tests/constraints.txt ".[all,dev-test]"
```
36-37: Export LD_LIBRARY_PATH earlier and guard for missing paths. Doing this inside script works, but guarding avoids polluting LD_LIBRARY_PATH with non-existent dirs and improves portability across image variants.

```diff
 multi-gpu:
 @@
-  script:
-    # Add libcudnn*.so and libnv*.so to path
-    - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+  before_script:
+    # Add libcudnn*.so and libnv*.so to path (guard if dirs exist)
+    - for d in /usr/lib/x86_64-linux-gnu /usr/local/tensorrt/targets/x86_64-linux-gnu/lib; do
+        [ -d "$d" ] && export LD_LIBRARY_PATH="$d:${LD_LIBRARY_PATH}";
+      done
   script:
```
46-47: Pin or parameterize the TRT-LLM image; avoid RC drift. Using an RC tag (1.1.0rc2.post2) can change over time or be removed. Prefer a digest, or at least parameterize via a variable to centralize updates.

```diff
 example:
   extends: .tests-default
   stage: tests
   timeout: 30m
-  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
+  variables:
+    TRT_LLM_IMAGE: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
+  image: $TRT_LLM_IMAGE
```

If possible, pin by digest for reproducibility:

```yaml
image: nvcr.io/nvidia/tensorrt-llm/release@sha256:<digest>
```

Also ensure the CI runner has pull access to nvcr.io for this repo.
55-58: Harden path exports; avoid adding non-existent dirs. Minor portability tweak to keep envs clean across images.

```diff
 before_script:
-  # Add libcudnn*.so and libnv*.so to path
-  - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-  # Add trtexec to path
-  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
+  # Add libcudnn*.so and libnv*.so to path
+  - for d in /usr/lib/x86_64-linux-gnu /usr/local/tensorrt/targets/x86_64-linux-gnu/lib; do
+      [ -d "$d" ] && export LD_LIBRARY_PATH="$d:${LD_LIBRARY_PATH}";
+    done
+  # Add trtexec to path
+  - if [ -d /usr/local/tensorrt/targets/x86_64-linux-gnu/bin ]; then export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"; fi
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/build.sh
- docker/Dockerfile
- .github/CODEOWNERS
- docker/README.md
🚧 Files skipped from review as they are similar to previous changes (11)
- CONTRIBUTING.md
- tests/examples/llm_qat/test_llm_qat.py
- examples/diffusers/README.md
- examples/llm_ptq/README.md
- README.md
- .github/workflows/example_tests.yml
- examples/llm_qat/README.md
- examples/onnx_ptq/README.md
- docs/source/getting_started/_installation_for_Linux.rst
- tests/examples/README.md
- CHANGELOG.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (2)
.gitlab/tests.yml (2)
1-1: Good cross-repo consistency note. Keeping CI config aligned with GitHub workflows is helpful. Please maintain this in future edits.
28-28: Job rename to “multi-gpu” makes intent clearer. Name better reflects resource use. No issues spotted.
Force-pushed from e8b201f to 7fd1f74
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/examples/llm_qat/test_llm_qat.py (3)
97-98: Pinned fsdp2 for LoRA QAT: confirm intent; consider parity with other tests. If stability permits, consider parameterizing over runnable backends to keep parity with other QAT tests; otherwise keep fsdp2 pinned.

```diff
-def test_llama_lora_qat_nvfp4(tiny_llama_path, tmp_path):
+@pytest.mark.parametrize("backend", ["fsdp2", "deepspeed"])
+def test_llama_lora_qat_nvfp4(tiny_llama_path, tmp_path, backend):
@@
-        "--output_dir", tmp_path / "lora_qat",
-        "--backend", "fsdp2",
+        "--output_dir", tmp_path / "lora_qat",
+        "--backend", backend,
```
102-103: Make xfail non-strict and reference a tracking issue. Prevents CI failures on unexpected pass if project-wide strict xfail is enabled and adds traceability.

```diff
-@pytest.mark.xfail(reason="FIXME: QLoRa test failure")
+@pytest.mark.xfail(strict=False, reason="FIXME: QLoRa test failure; see #<issue>")
```
41-45: Consolidate backend parametrization and add tracker IDs. Skips for fsdp1/ddp are appropriate given hangs; align PR text (allowed-to-fail vs skip), add tracker issue IDs in skip reasons, and de-duplicate the backend list into a shared BACKENDS constant reused by both parametrize decorators. Verified examples/llm_qat/launch.sh accepts BACKEND values: fsdp1, fsdp2, ddp, deepspeed.

```diff
-@pytest.mark.parametrize("backend", [
-    pytest.param("fsdp1", marks=pytest.mark.skip(reason="FIXME: fsdp1 test hangs")),
-    "fsdp2",
-    "deepspeed",
-    pytest.param("ddp", marks=pytest.mark.skip(reason="FIXME: ddp test hangs")),
-])
+BACKENDS = [
+    pytest.param("fsdp1", marks=pytest.mark.skip(reason="FIXME: fsdp1 test hangs; see #<issue>"), id="fsdp1"),
+    pytest.param("fsdp2", id="fsdp2"),
+    pytest.param("deepspeed", id="deepspeed"),
+    pytest.param("ddp", marks=pytest.mark.skip(reason="FIXME: ddp test hangs; see #<issue>"), id="ddp"),
+]
+@pytest.mark.parametrize("backend", BACKENDS)
 def test_llama_qat_int4w_int8a(tiny_llama_path, tmp_path, backend):
@@
-@pytest.mark.parametrize("backend", [
-    pytest.param("fsdp1", marks=pytest.mark.skip(reason="FIXME: fsdp1 test hangs")),
-    "fsdp2",
-    "deepspeed",
-    pytest.param("ddp", marks=pytest.mark.skip(reason="FIXME: ddp test hangs")),
-])
+@pytest.mark.parametrize("backend", BACKENDS)
 def test_llama_qat_int4w_int8a_direct_qat(tiny_llama_path, tmp_path, backend):
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/build.sh
- .github/CODEOWNERS
- docker/Dockerfile
- docker/README.md
✅ Files skipped from review due to trivial changes (1)
- README.md
🚧 Files skipped from review as they are similar to previous changes (10)
- examples/diffusers/README.md
- examples/llm_ptq/README.md
- examples/llm_qat/README.md
- CONTRIBUTING.md
- CHANGELOG.rst
- examples/onnx_ptq/README.md
- .gitlab/tests.yml
- tests/examples/README.md
- .github/workflows/example_tests.yml
- docs/source/getting_started/_installation_for_Linux.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (1)
tests/examples/llm_qat/test_llm_qat.py (1)
39-39: Whitespace-only change. No action needed.
Force-pushed from 7fd1f74 to 41d36ff
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
89-93: Stale reference to ModelOpt's custom docker image. This PR deprecates the custom ModelOpt image; these lines still refer to building/using it. Replace with TRT-LLM image guidance.

```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the recommended TensorRT‑LLM docker image, ModelOpt is pre-installed with many optional dependencies.
+For other suggested NVIDIA images, ModelOpt may be pre-installed with some optional dependencies.
```
♻️ Duplicate comments (3)
docs/source/getting_started/_installation_for_Linux.rst (1)
43-44: Add aarch64 equivalents or mark the snippet as x86_64-only. The env exports are x86_64-specific. Add aarch64 variants or a note that these lines target x86_64 only.
Apply one of the following:

```diff
 .. code-block:: shell

-    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # x86_64
+    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # aarch64 (SBSA)
+    # export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    # export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```

.gitlab/tests.yml (2)
58-58: Avoid unconstrained extras installs in vendor images. This can upgrade torch/transformers and break CUDA/TRT compat. Use constraints or no-deps; install example reqs first.

```diff
-    - pip install ".[all,dev-test]"
+    # Prefer pinned installs to avoid clobbering preinstalled CUDA/TRT stack
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -c "$PIP_CONSTRAINT" -r "$req_file" || exit 1; done
+    - pip install -c "$PIP_CONSTRAINT" ".[dev-test]"  # or: pip install --no-deps ".[dev-test]"
```
6-9: Unconditional “when: manual” disables default MR/branch pipelines. Make on_success the default; keep manual only for explicit sources.

```diff
 rules:
   - if: $CI_PIPELINE_SOURCE == "schedule"
-  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-  - when: manual
+    when: on_success
+  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/  # release tags
+    when: on_success
+  - if: $CI_PIPELINE_SOURCE == "web"  # allow manual web runs
+    when: manual
+  - when: on_success  # default for branches/MRs
```
🧹 Nitpick comments (7)
docs/source/getting_started/_installation_for_Linux.rst (3)
33-36
: Pin a known-good TRT-LLM tag in the example.Suggest showing a concrete, tested tag to avoid drift with deps in docs and CI.
- e.g., ``nvcr.io/nvidia/tensorrt-llm/release:<version>``. + e.g., ``nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2`` (update as releases advance).
37-37: Grammar nit: “if not already” → “if not already installed” or “if not already up to date”.

```diff
-  You may upgrade the Model Optimizer to the latest version if not already as described in the next section.
+  You may upgrade Model Optimizer to the latest version if not already installed, as described in the next section.
```
46-46: Fix typo: “examples's” → “example’s” (or “examples’” if plural).

```diff
-  You may need to install additional dependencies from the respective examples's `requirements.txt` file.
+  You may need to install additional dependencies from the respective example’s `requirements.txt` file.
```

.gitlab/tests.yml (4)
3-4: Reconsider disabling pip constraints globally. Blank PIP_CONSTRAINT invites uncontrolled upgrades inside vendor images; prefer constraints aligned to image baselines.

```diff
 .tests-default:
   variables:
-    PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+    PIP_CONSTRAINT: "tests/constraints/trt-llm-1.1.0rc2.post2.txt"  # Pin to image baseline; update per image
```

If no constraints file exists, generate and check one in per image/tag.
35-39: Arch-specific TensorRT paths; guard for aarch64 or absent installs. Make the exports conditional on arch and directory existence to avoid broken PATH/LD paths on ARM runners or non-TRT images.

```diff
-  # Add libcudnn*.so and libnv*.so to path
-  - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-  # Add trtexec to path
-  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
+  # Add TensorRT libs/tools if present (x86_64/aarch64)
+  - ARCH="$(uname -m)"; [ "$ARCH" = "aarch64" ] && GNU_ARCH="aarch64-linux-gnu" || GNU_ARCH="x86_64-linux-gnu"
+  - TRT_ROOT="/usr/local/tensorrt/targets/$GNU_ARCH"
+  - if [ -d "$TRT_ROOT/lib" ]; then export LD_LIBRARY_PATH="/usr/lib/$GNU_ARCH:$TRT_ROOT/lib:${LD_LIBRARY_PATH}"; fi
+  - if [ -d "$TRT_ROOT/bin" ]; then export PATH="$TRT_ROOT/bin:$PATH"; fi
```
31-39: Consider standardizing GPU images on TRT-LLM. Given the project shift, switching .gpu-tests-default to the TRT-LLM image can reduce env drift and simplify TRT tooling setup.

```diff
-  image: nvcr.io/nvidia/pytorch:25.06-py3
+  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
```

If Megatron/torch baselines require the PyTorch image, document the exception.
67-67: Parameterize TRT-LLM image tag. Keep docs/CI in sync and ease upgrades.

```diff
-  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
+  image: nvcr.io/nvidia/tensorrt-llm/release:${TRT_LLM_TAG:-1.1.0rc2.post2}
```

Define TRT_LLM_TAG at pipeline level to bump centrally.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/build.sh
- docker/README.md
- .github/CODEOWNERS
- docker/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (10)
- examples/llm_qat/README.md
- CHANGELOG.rst
- examples/diffusers/README.md
- README.md
- CONTRIBUTING.md
- examples/llm_ptq/README.md
- tests/examples/llm_qat/test_llm_qat.py
- examples/onnx_ptq/README.md
- tests/examples/README.md
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
Force-pushed from 41d36ff to 96b897f
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.gitlab/tests.yml (1)
58-64: Skip fsdp1 paths for llm_qat in CI (prevent blocking hangs). The EXAMPLE matrix in .gitlab/tests.yml still includes llm_qat; no pytest skip/xfail for fsdp/fsdp1 was found in tests/examples or tests, and tests/examples/test_llm_qat.sh is missing. Add skip/xfail for fsdp1 paths or remove llm_qat from the matrix to avoid CI hangs.
♻️ Duplicate comments (2)
.gitlab/tests.yml (2)
6-9: Fix rules: current config makes most pipelines manual-only. The unconditional "when: manual" at line 9 forces branch/MR runs to be manual. Make auto the default and keep a separate manual path for web-triggered runs.
Apply this diff:

```diff
 .tests-default:
   rules:
     - if: $CI_PIPELINE_SOURCE == "schedule"
-    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-    - when: manual
+      when: on_success
+    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/  # release tags
+      when: on_success
+    - if: $CI_PIPELINE_SOURCE == "web"  # allow manual reruns from web
+      when: manual
+    - when: on_success  # default for other pipelines
```
3-4: Avoid unconstrained installs on vendor images; wire PIP_CONSTRAINT and reorder installs. Unpinned extras can upgrade core libs inside TRT-LLM/NGC images. Pass an optional constraints file and install example requirements first.
Apply this diff:

```diff
 .tests-default:
   variables:
-    PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+    PIP_CONSTRAINT: ""  # If set, passed to pip (-c "$PIP_CONSTRAINT") to avoid upgrading vendor stack
@@
 example:
   extends: .gpu-tests-default
   timeout: 30m
@@
   script:
-    - pip install ".[all,dev-test]"
+    # Install example-specific deps first, under constraints if provided
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install ${PIP_CONSTRAINT:+-c "$PIP_CONSTRAINT"} -r "$req_file" || exit 1; done
+    # Then install project dev/test deps without clobbering CUDA/TensorRT; avoid "all" extras
+    - pip install ${PIP_CONSTRAINT:+-c "$PIP_CONSTRAINT"} ".[dev-test]"
@@
-    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -r "$req_file" || exit 1; done
+    # (moved above)
```

Also applies to: 60-64
🧹 Nitpick comments (1)
.gitlab/tests.yml (1)
35-41
: Guard TensorRT PATH exports or move them to TRT‑LLM images.These paths may not exist in nvcr.io/nvidia/pytorch:25.06‑py3. No harm exporting, but it’s cleaner to gate them.
Apply this diff:
before_script: - # Add libcudnn*.so and libnv*.so to path - - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" - # Add trtexec to path - - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH" + # Add TensorRT libs/tools only if present in the image + - if [ -d "/usr/local/tensorrt/targets/x86_64-linux-gnu" ]; then \ + export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"; \ + export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"; \ + fi # Install git-lfs for Daring-Anteater dataset - apt-get update && apt-get install -y git-lfs
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/Dockerfile
- docker/README.md
- .github/CODEOWNERS
- docker/build.sh
✅ Files skipped from review due to trivial changes (2)
- examples/diffusers/README.md
- examples/onnx_ptq/README.md
🚧 Files skipped from review as they are similar to previous changes (9)
- CONTRIBUTING.md
- README.md
- .github/workflows/example_tests.yml
- docs/source/getting_started/_installation_for_Linux.rst
- tests/examples/llm_qat/test_llm_qat.py
- tests/examples/README.md
- CHANGELOG.rst
- examples/llm_ptq/README.md
- examples/llm_qat/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (1)
.gitlab/tests.yml (1)
52-54: Switch example job to TRT-LLM image (nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc3) and verify CI can pull it. The example job currently inherits nvcr.io/nvidia/pytorch:25.06-py3 from .gpu-tests-default; replace with the TRT-LLM image and prefer 1.1.0rc3 (NGC shows rc3 is newer than rc2.post2) or wait for an official 1.1.x stable. Confirm CI runners have access/credentials to pull nvcr.io images.
Actionable comments posted: 0
♻️ Duplicate comments (2)
.gitlab/tests.yml (2)
52-66: Consider constraining pip installs to avoid TRT-LLM compatibility issues. Installing unconstrained `".[all,dev-test]"` can upgrade core libraries (torch/transformers) in vendor TRT-LLM images and cause CUDA/TensorRT incompatibilities.
Options to mitigate:
- Use constraints file: `pip install -c constraints.txt ".[all,dev-test]"`
- Use `--no-deps` if extras only add tooling
- Install per-example requirements before extras to minimize upgrades

Example approach:

```diff
-    - pip install ".[all,dev-test]"
+    # Install with constraints to avoid clobbering preinstalled CUDA/TensorRT stack
+    - pip install -c tests/constraints.txt ".[all,dev-test]"
```
8-9: Fix .tests-default rules so jobs run by default (remove catch-all manual). The unconditional `- when: manual` rule as the last entry makes all non-schedule/non-tag pipelines (branches/MRs) manual by default. This defeats automatic CI execution on branch pushes and merge requests.
Apply this diff to fix the rules order:

```diff
 .tests-default:
   rules:
     - if: $CI_PIPELINE_SOURCE == "schedule"
-    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-    - when: manual
+      when: on_success
+    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/  # release tags
+      when: on_success
+    - if: $CI_PIPELINE_SOURCE == "web"  # allow manual reruns from web
+      when: manual
+    - when: on_success  # default for all other pipelines
```
🧹 Nitpick comments (2)
examples/llm_sparsity/data_prep.py (2)
59-62: Write with explicit UTF-8; consider dropping pretty-print and/or JSONL streaming for scale.
- Since ensure_ascii=False, default locale encodings can corrupt output on some systems; open files with encoding="utf-8".
- CNN/DailyMail splits are large; indent=4 bloats files and I/O. Prefer compact JSON or JSONL streaming.

Minimal patch:

```diff
-with open(os.path.join(args.save_path, "cnn_train.json"), "w") as write_f:
-    json.dump(list(tokenized_dataset["train"]["text"]), write_f, indent=4, ensure_ascii=False)
+with open(os.path.join(args.save_path, "cnn_train.json"), "w", encoding="utf-8") as write_f:
+    json.dump(list(tokenized_dataset["train"]["text"]), write_f, ensure_ascii=False)
 with open(os.path.join(args.save_path, "cnn_eval.json"), "w") as write_f:
-    json.dump(list(tokenized_dataset["test"]["text"]), write_f, indent=4, ensure_ascii=False)
+    json.dump(list(tokenized_dataset["test"]["text"]), write_f, ensure_ascii=False)
```

Optional JSONL streaming (memory-lean, faster diffs):

```python
with open(os.path.join(args.save_path, "cnn_train.jsonl"), "w", encoding="utf-8") as f:
    for row in tokenized_dataset["train"]:
        f.write(json.dumps(row["text"], ensure_ascii=False) + "\n")
with open(os.path.join(args.save_path, "cnn_test.jsonl"), "w", encoding="utf-8") as f:
    for row in tokenized_dataset["test"]:
        f.write(json.dumps(row["text"], ensure_ascii=False) + "\n")
```
61-62: Naming nit: the “eval” file is built from the test split. Either rename to cnn_test.json or switch to the validation split for eval artifacts to avoid confusion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/llm_sparsity/data_prep.py (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_sparsity/test_llama_sparsify.py (0 hunks)
💤 Files with no reviewable changes (5)
- docker/README.md
- .github/CODEOWNERS
- docker/build.sh
- docker/Dockerfile
- tests/examples/llm_sparsity/test_llama_sparsify.py
✅ Files skipped from review due to trivial changes (2)
- CHANGELOG.rst
- examples/onnx_ptq/README.md
🚧 Files skipped from review as they are similar to previous changes (8)
- examples/llm_qat/README.md
- CONTRIBUTING.md
- examples/llm_ptq/README.md
- tests/examples/README.md
- examples/diffusers/README.md
- .github/workflows/example_tests.yml
- docs/source/getting_started/_installation_for_Linux.rst
- README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (6)
examples/llm_sparsity/data_prep.py (1)
60-60: Good fix: ensure JSON-serializable column. Wrapping the datasets column with list(...) avoids pyarrow/Datasets column types leaking into json.dump. Looks correct.
Also applies to: 62-62
.gitlab/tests.yml (5)
3-4: LGTM! Environment setup looks correct. The `PIP_CONSTRAINT: ""` variable properly disables pip constraints to allow package upgrades, which aligns with the PR objective of switching to TRT-LLM Docker images where unconstrained installs may be necessary.
28-43: Good consolidation of GPU test configuration. The refactoring to `.gpu-tests-default` with proper library path setup and git-lfs installation is well-structured and consolidates common GPU test requirements.
44-49: LGTM! Clean inheritance pattern. The `multi-gpu` job properly extends the new `.gpu-tests-default` base, maintaining consistency with the refactored structure.
67-74: Good approach to isolate failing tests. The `example-failing` job with `allow_failure: true` for `llm_qat` tests is a pragmatic solution while the hanging issues are being investigated. The separate job prevents these failures from blocking the main pipeline.
78-78: Verify TensorRT-LLM image version availability. Couldn't confirm automatically — your script's NGC API call returned non-JSON and `docker manifest inspect` failed (parse error; "Image not accessible or doesn't exist"). Action: confirm `nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2` is published and pullable from NGC (log in to nvcr.io / use the NGC web UI or ngc CLI, then `docker pull`), or update the CI image to a known published tag (GitHub shows a v1.1.0rc2 release). (github.com)
LGTM. It would also be great if you could replace the default docker with the onnx ptq docker for ONNX unit and integration tests.
Signed-off-by: Keval Morabia <[email protected]>
Actionable comments posted: 0
♻️ Duplicate comments (4)
docs/source/getting_started/_installation_for_Linux.rst (1)
43-46: Fix env vars: drop /usr/include and show aarch64 variant.
- Don't add /usr/include to LD_LIBRARY_PATH.
- Provide both x86_64 and aarch64 paths (this was requested previously).

Apply this diff:

```diff
-    export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # x86_64
+    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # aarch64 (SBSA)
+    # export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    # export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```

.gitlab/tests.yml (3)
7-9: Rules force jobs to manual by default; make branch/MR pipelines auto. The unconditional "- when: manual" makes non-schedule, non-tag jobs manual. Use an explicit on_success default and a gated manual rule.

```diff
 rules:
   - if: $CI_PIPELINE_SOURCE == "schedule"
-  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-  - when: manual
+    when: on_success
+  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
+    when: on_success
+  - if: $CI_PIPELINE_SOURCE == "web"  # allow manual reruns from web UI
+    when: manual
+  - when: on_success  # default for branches/MRs
```
59-60: Avoid unconstrained `pip install ".[all,dev-test]"` on vendor images. This can clobber preinstalled CUDA/TRT stacks.
Safer patterns:
- Install example requirements first, then extras with constraints:

```diff
-    - pip install ".[all,dev-test]"
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -r "$req_file" || exit 1; done
+    - pip install -c "$PIP_CONSTRAINT" ".[dev-test]"
```

- If extras are tooling-only: `pip install --no-deps ".[dev-test]"`.

Please confirm which path you prefer; I can prep a constraints file.
Also applies to: 93-95
3-4: Unconstrained upgrades increase breakage risk in vendor images. Setting PIP_CONSTRAINT to empty combined with later installs of ".[all,dev-test]" can upgrade torch/transformers inside TRT-LLM/TensorRT images and cause CUDA/TRT ABI mismatches. Prefer pinned installs.
Option A (recommended): restore constraints per image baseline.

```diff
 variables:
-  PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+  PIP_CONSTRAINT: "tests/constraints/trtllm-1.1.0rc2.txt"
```

Option B: when extras are tooling-only, use --no-deps and install per-example requirements first.
I can help derive a constraints file from your current images.
🧹 Nitpick comments (5)
examples/onnx_ptq/README.md (1)
27-27: Docs link change LGTM; minor cross-link nit. Consider also mentioning the specific NGC images (TensorRT-LLM and TensorRT) here to mirror the Linux install guide's recommendations for ONNX/TensorRT workflows. Optional only.
docs/source/getting_started/_installation_for_Linux.rst (3)
33-35: Good shift to TRT-LLM image; add a concrete tag example. Add one explicit, known-good tag example (e.g., release:1.1.0rc2.post2) alongside the placeholder to reduce guesswork.
Also applies to: 37-41
46-46: Grammar nit: “examples's” → “example’s”.
79-81: Wording clarity. Tighten phrasing to: "You may still use other ModelOpt features locally (e.g., quantize a Hugging Face model) and use a docker image for deployment."
.gitlab/tests.yml (1)
35-43: Fix TensorRT env exports: remove /usr/include and add aarch64 alt. Mirror the docs fix here to avoid polluting LD_LIBRARY_PATH and to support aarch64 runners.

```diff
-  - export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+  - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+  # For aarch64 runners:
+  # - export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+  # - export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- CHANGELOG.rst (1 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
- examples/llm_autodeploy/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- examples/onnx_ptq/docker/Dockerfile (0 hunks)
- examples/onnx_ptq/docker/build.sh (0 hunks)
💤 Files with no reviewable changes (2)
- examples/onnx_ptq/docker/Dockerfile
- examples/onnx_ptq/docker/build.sh
✅ Files skipped from review due to trivial changes (2)
- examples/llm_autodeploy/README.md
- CHANGELOG.rst
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
39-44
: Fix LD_LIBRARY_PATH content and provide arch‑aware, container‑aware exports.
- LD_LIBRARY_PATH must not include /usr/include.
- Snippet is x86_64‑only; please include aarch64.
- In official NGC containers these paths are usually set; call out that the exports are primarily for bare‑metal or tarball installs.
Apply:
```diff
-You would also need to setup appropriate environment variables for the TensorRT binaries as follows:
+If you installed TensorRT from a tarball on bare‑metal, set the environment variables for the binaries/libraries as follows.
+In official NGC containers, these are typically preconfigured and this step can be skipped.
@@
-    export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # x86_64
+    if [ "$(uname -m)" = "x86_64" ]; then
+        export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+        export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    fi
+    # aarch64
+    if [ "$(uname -m)" = "aarch64" ]; then
+        export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+        export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
+    fi
```
🧹 Nitpick comments (6)
docs/source/getting_started/_installation_for_Linux.rst (6)
46-46
: Grammar: fix possessive. Change “examples's” to “example's” (or “examples'” if plural is intended).
```diff
-You may need to install additional dependencies from the respective examples's `requirements.txt` file.
+You may need to install additional dependencies from the respective example's ``requirements.txt`` file.
```
48-48
: Add NGC login tip for pulling containers. Many users hit auth errors on nvcr.io. Add a one‑liner to login.
```diff
 **Alternative NVIDIA docker images**
+
+.. tip::
+    Authenticate to NGC before pulling images: ``docker login nvcr.io`` (requires an NVIDIA API key).
```
54-55
: Avoid unqualified performance claim. “provides superior performance” is broad and context‑dependent. Rephrase to reflect scope.
```diff
-which provides superior performance to the PyTorch container.
+which is optimized for TensorRT inference workflows and deployment.
```
79-80
: Tighten phrasing and possessives. Minor clarity/grammar improvements.
```diff
-issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt's
-features locally for example, quantizing a HuggingFace model and then use a docker image for deployment.
+issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt
+features locally—for example, quantizing a HuggingFace model—then use a docker image for deployment.
```
88-91
: Update text to reflect deprecation of the custom ModelOpt docker image. This still refers to “build and use ModelOpt's docker image,” which was deprecated in this PR.
```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
+If you use the recommended NVIDIA containers above, you can skip this step when the image already contains ModelOpt and
+the optional dependencies you need.
```
10-26
: Unpin system requirements; update TensorRT‑LLM and clarify PyTorch minimum. Verified: NGC TensorRT‑LLM latest tag is 1.1.0rc3 (modified Sep 4, 2025); NVIDIA TensorRT production branch containers are 25.03; NVIDIA PyTorch container builds vary (examples: 25.01 → PyTorch 2.6.x, 25.05/25.06 → 2.8); ModelOpt lists PyTorch (optional) >= 2.2 (not >= 2.6).
- Edit docs/source/getting_started/_installation_for_Linux.rst (lines 10–26): replace hard pins with “tested with” language and link an authoritative compatibility matrix / NGC container tags.
- Replace "TensorRT‑LLM (Optional) | 1.1.0rc2.post2" with either "tested with 1.1.x (NGC 1.1.0rc3)" or update the entry to the exact NGC tag 1.1.0rc3.
- Replace "PyTorch | >=2.6" with wording that distinguishes minimum vs tested versions, e.g. "PyTorch | >=2.2 (tested with 2.6–2.8 in NVIDIA containers)".
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- README.md (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: wait-checks / wait
- GitHub Check: linux
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
33-38
: Add compatibility note about Model Optimizer in TRT‑LLM images. Preinstalled nvidia-modelopt in TRT‑LLM images may lag; if upgrading with pip, ensure the upgraded ModelOpt is ABI/API-compatible with the image's TRT‑LLM/TensorRT versions. Reference: TRT‑LLM v0.21.0 bundles TensorRT 10.11, PyTorch 2.7.1, CUDA 12.9.0; PyPI nvidia-modelopt==0.35.0 (uploaded Sep 4, 2025).
Apply this minimal addition after Line 37:
```diff
+.. note::
+    TRT‑LLM release images may contain an older ``nvidia-modelopt``. If you upgrade with ``pip``,
+    ensure the upgraded ModelOpt version is compatible with the TRT‑LLM/TensorRT versions in the image.
+    See the compatibility notes in the release docs.
```
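To see what a given image actually ships before upgrading, a quick check along these lines works inside the container (a sketch; package names are the published PyPI ones, and `tensorrt_llm.__version__` is assumed available in the release images):

```shell
# Print the preinstalled versions of the relevant packages
pip show nvidia-modelopt tensorrt | grep -E '^(Name|Version):'
python -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
```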
Signed-off-by: Keval Morabia <[email protected]>
d39f268 to 3392955
Actionable comments posted: 0
♻️ Duplicate comments (3)
examples/onnx_ptq/README.md (1)
27-30
: Recommended base TensorRT image now listed; resolves prior ask. This addresses the earlier request to name a base docker. Consider adding a short note that MXFP8/NVFP4 evaluation requires TensorRT >= 10.11 to help users pick a compatible tag.
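Whether a given container tag satisfies that bound can be verified from inside the image (a sketch; the `tensorrt` Python bindings are assumed preinstalled in the TensorRT containers):

```shell
python -c "import tensorrt; print(tensorrt.__version__)"
```

The suggested README addition: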
```diff
-Please use the TensorRT docker image (e.g., `nvcr.io/nvidia/tensorrt:25.08-py3`) or visit our installation docs...
+Please use the TensorRT docker image (e.g., `nvcr.io/nvidia/tensorrt:25.08-py3`) or visit our installation docs...
+Note: For MXFP8 or NVFP4 evaluation in this README, choose a TensorRT container tag that includes TensorRT >= 10.11.
```
docs/source/getting_started/_installation_for_Linux.rst (2)
33-38
: Good callout to upgrade ModelOpt in NGC images. This addresses prior feedback that release images can ship older ModelOpt.
43-46
: Fix LD_LIBRARY_PATH and add aarch64 variant.
- Don’t include /usr/include in LD_LIBRARY_PATH.
- Provide aarch64 paths (previous review asked for this).
```diff
-export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+# x86_64
+export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+# aarch64 (SBSA)
+# export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+# export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
🧹 Nitpick comments (6)
examples/pruning/README.md (3)
26-26
: Add an explicit “pip -U nvidia-modelopt” note for NeMo container users. Containers may ship an older ModelOpt; mirror other docs by advising to upgrade inside the container.
````diff
-For Minitron pruning for Megatron-LM / NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.07`) which has all the dependencies installed.
+For Minitron pruning for Megatron-LM / NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.07`) which has all the dependencies installed. After launching the container, upgrade Model Optimizer:
+```bash
+pip install -U nvidia-modelopt
+```
````
30-30
: Typo: “requisred” → “required”.
```diff
-For GradNAS pruning for Hugging Face BERT / GPT-J, no additional dependencies are requisred.
+For GradNAS pruning for Hugging Face BERT / GPT-J, no additional dependencies are required.
```
115-116
: Grammar fixes in FastNAS section.
```diff
-also how to profiling the model to understand the search space of possible pruning options and demonstrates
-the usage saving and restoring pruned models.
+also how to profile the model to understand the search space of possible pruning options and demonstrates
+the usage of saving and restoring pruned models.
```
examples/speculative_decoding/README.md (1)
46-47
: Avoid root requirement and system‑wide Git LFS install. Many users won't have root in shared images. Prefer user‑level install; drop “--system”.
```diff
-apt-get update && apt-get install -y git-lfs
-git lfs install --system
+# If you have root:
+sudo apt-get update && sudo apt-get install -y git-lfs
+# User-level install (recommended):
+git lfs install
```
docs/source/getting_started/_installation_for_Linux.rst (2)
88-90
: Remove reference to building ModelOpt's custom docker image. This PR deprecates the custom Dockerfile; align wording to recommended NGC images.
```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the recommended NVIDIA docker images (TensorRT‑LLM, PyTorch, or NeMo), ModelOpt may already be pre‑installed
+with some optional dependencies. Make sure to upgrade to the latest version of ModelOpt (with the optional dependencies
+you need) using pip as shown below.
```
21-24
: Version pin: update TensorRT-LLM to 1.1.0rc3 or loosen to >=1.1.0rc3. docs/source/getting_started/_installation_for_Linux.rst (lines 21–24) currently pins "TensorRT-LLM (Optional) 1.1.0rc2.post2"; latest NGC TensorRT-LLM is 1.1.0rc3 and the recommended NGC base image tag is 25.03. Either change the pin to "1.1.0rc3" or use ">=1.1.0rc3" and document the NGC tag 25.03.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- README.md (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
- examples/diffusers/README.md (2 hunks)
- examples/llm_distill/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- examples/pruning/README.md (1 hunks)
- examples/speculative_decoding/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- README.md
- examples/llm_qat/README.md
- examples/diffusers/README.md
- examples/llm_ptq/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (5)
examples/llm_distill/README.md (2)
24-31
: Docker prereqs look good. Clear guidance for PyTorch/NeMo images plus link to install docs.
32-38
: Install command update LGTM. The switch to pip install -U nvidia-modelopt[hf] is appropriate.
examples/onnx_ptq/README.md (1)
36-36
: Install command update LGTM.examples/speculative_decoding/README.md (2)
28-33
: Docker prereqs + upgrade note LGTM.
39-40
: Local install command LGTM.
aa730a1 to d22bdab
Actionable comments posted: 1
♻️ Duplicate comments (4)
.gitlab/tests.yml (3)
80-96
: Avoid unconstrained".[all,dev-test]"
on TRT base; add constraints and keep Apex uninstall.Repeat of prior concern; add constraints and guard path appends.
- - pip install ".[all,dev-test]" + - pip install -c ${PIP_CONSTRAINT:-tests/constraints/trt-25.08.txt} ".[dev-test]" + # Install example deps under constraints to minimize upgrades + - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -c ${PIP_CONSTRAINT:-tests/constraints/trt-25.08.txt} -r "$req_file" || exit 1; done
8-10
: Rules make pipelines manual by default; fix ordering/default.This was flagged earlier. The unconditional
- when: manual
forces branch/MR jobs to manual.Apply:
.tests-default: rules: - - if: $CI_PIPELINE_SOURCE == "schedule" - - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/ - - when: manual + - if: $CI_PIPELINE_SOURCE == "schedule" + when: on_success + - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/ + when: on_success + - if: $CI_PIPELINE_SOURCE == "web" + when: manual + - when: on_success
1-4
: Reconsider globally disabling constraints.
PIP_CONSTRAINT: ""
encourages unconstrained upgrades in CI, reducing reproducibility and risking CUDA/TRT breakage on vendor images. Prefer per‑job opt‑out with justification, or switch to an image‑aligned constraints file.-.tests-default: - variables: - PIP_CONSTRAINT: "" # Disable pip constraint for upgrading packages +.tests-default: + variables: + PIP_CONSTRAINT: "tests/constraints/trtllm-25.06.txt" # default; override per-job if neededdocs/source/getting_started/_installation_for_Linux.rst (1)
39-45
: Fix LD_LIBRARY_PATH contents and provide aarch64 variant.
- Remove
/usr/include
(headers, not libs).- Provide ARM64 paths or mark snippet x86_64‑only. This was raised previously.
Suggested replacement:
- export PIP_CONSTRAINT="" - export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" - export PATH="${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" + export PIP_CONSTRAINT="" + # x86_64 + export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" + export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}" + # aarch64 (ARM64/SBSA) + # export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}" + # export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
🧹 Nitpick comments (7)
.github/workflows/gpu_tests.yml (1)
73-76
: Remove /usr/include from LD_LIBRARY_PATH and guard paths.
/usr/include
is a headers directory, not a library path. Also guard appends with -d to avoid polluting envs when TRT isn't present. Apply:
```diff
-echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" >> $GITHUB_ENV
-echo "PATH=${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" >> $GITHUB_ENV
+{ test -d /usr/lib/x86_64-linux-gnu && echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib/x86_64-linux-gnu" >> "$GITHUB_ENV"; } || true
+{ test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/lib && echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" >> "$GITHUB_ENV"; } || true
+{ test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/bin && echo "PATH=${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" >> "$GITHUB_ENV"; } || true
```
47-47
: Minor grammar tweak.“examples's” → “example’s” or “examples’”.
55-57
: Qualify “superior performance” claim or add context.State the scope (TensorRT inference workloads) and, if possible, link to a reference/benchmark; otherwise soften to “typically better suited”.
80-81
: Wording + brand name.
- “issues, we recommend using a docker image” → “if issues arise, prefer an NVIDIA container”.
- “HuggingFace” → “Hugging Face”.
89-90
: Remove stale reference to building ModelOpt docker image.Custom ModelOpt Dockerfile/images are deprecated in this PR; adjust this sentence accordingly.
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all -optional dependencies pre-installed. +If you use the recommended NVIDIA containers, some may include ModelOpt pre‑installed; otherwise install via pip below..gitlab/tests.yml (2)
35-43
: Fix LD_LIBRARY_PATH (drop /usr/include) and add aarch64 note; guard paths.Same concern as GitHub workflow and docs; also make exports conditional.
-# Add libcudnn*.so and libnv*.so to path -- export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" +# Add common library paths (x86_64); guard if present +- test -d /usr/lib/x86_64-linux-gnu && export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}" +- test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/lib && export LD_LIBRARY_PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" +# aarch64 example (uncomment if using ARM runners) +# test -d /usr/lib/aarch64-linux-gnu && export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:${LD_LIBRARY_PATH}" -# Add trtexec to path -- export PATH="${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" +# Add trtexec to PATH if present +test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/bin && export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
59-59
: Use constraints to avoid clobbering vendor stacks.Unconstrained
".[hf,dev-test]"
can upgrade torch/transformers. Prefer constraints or--no-deps
if extras are tooling‑only.- - pip install ".[hf,dev-test]" + - pip install -c ${PIP_CONSTRAINT:-tests/constraints/trtllm-25.06.txt} ".[hf,dev-test]" +# or, if safe: +# - pip install --no-deps ".[dev-test]" +# - pip install -c ${PIP_CONSTRAINT} -r examples/$EXAMPLE/requirements.txt
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/example_tests.yml (1 hunks)
- .github/workflows/gpu_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
- GitHub Check: wait-checks / wait
🔇 Additional comments (8)
docs/source/getting_started/_installation_for_Linux.rst (2)
33-36
: Good call on recommending TRT‑LLM release images. Consider adding a concrete version example (pin) and a brief note on matching the image to your CUDA/driver.
49-49
: Alt images section reads well. Clear options and links.
.gitlab/tests.yml (6)
28-28
: Good split: multi‑GPU default anchor. Makes downstream jobs cleaner.
44-49
: LGTM: tox‑current‑env for GPU tests. Reuses image stack and avoids creating extra envs.
52-54
: Example job factoring looks good.
57-57
: Matrix reduction is reasonable.
63-70
: Explicitly marking flaky QAT as allow_failure is fine. Consider adding an issue link for traceability.
71-79
: Pin looks good; verify alignment with “System requirements” table. Image tag
tensorrt-llm/release:1.1.0rc2.post2
should match doc requirements; otherwise CI/doc drift.
Signed-off-by: Keval Morabia <[email protected]>
d22bdab to 54b6bcf
Actionable comments posted: 0
♻️ Duplicate comments (4)
docs/source/getting_started/_installation_for_Linux.rst (4)
33-35
: Call out that TRT-LLM images can ship older ModelOpt; show check/upgrade path. Add a short note so users verify and (carefully) upgrade the preinstalled ModelOpt in TRT‑LLM images.
Apply this diff:
```diff
 To use Model Optimizer with full dependencies (e.g. TensorRT/TensorRT-LLM deployment), we recommend using the
 `TensorRT-LLM docker image <https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags>`_,
 e.g., ``nvcr.io/nvidia/tensorrt-llm/release:<version>``.
+
+.. note::
+    These images may include an older ``nvidia-modelopt`` than PyPI. Check the version with::
+
+        pip show nvidia-modelopt | sed -n 's/^Version: //p'
+
+    If you upgrade inside the container, follow the constraints guidance below to avoid breaking core CUDA/TRT/PyTorch packages.
```
Run to confirm we mention this elsewhere:
```bash
#!/bin/bash
rg -nC2 -i 'constraints|compatibility matrix|modelopt.*pre-?installed' docs/
```
49-56
: Clarify preinstalled versions; soften performance claim for TRT container. Note that preinstalled ModelOpt can lag, and avoid the absolute “superior performance” wording.
Apply this diff:
```diff
 For PyTorch, you can also use `NVIDIA NGC PyTorch container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags>`_
 and for NVIDIA NeMo framework, you can use the `NeMo container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags>`_.
 Both of these containers come with Model Optimizer pre-installed. Make sure to update the Model Optimizer to the latest version if not already.
-For ONNX / TensorRT use cases, you can also use the `TensorRT container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_
-which provides superior performance to the PyTorch container.
+.. note::
+    The preinstalled Model Optimizer version in these containers may be older than the latest release. Verify the version and
+    upgrade using a constraints file as described above.
+
+For ONNX / TensorRT use cases, you can also use the `TensorRT container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_,
+which is optimized for TensorRT workloads and is recommended for TRT/ONNX deployment scenarios.
```
37-37
: Avoid unconstrained pip upgrades inside vendor images; recommend constraints. Unqualified -U risks breaking CUDA/TRT/PyTorch. Add a warning and a constrained install example.
Apply this diff:
```diff
-Make sure to upgrade Model Optimizer to the latest version using ``pip`` as described in the next section.
+.. warning::
+    When upgrading inside NVIDIA containers, use a constraints file aligned to the image stack to avoid upgrading
+    core libraries (``torch``, ``transformers``, CUDA/TRT bindings).
+    Example::
+
+        pip install -U -c /path/to/constraints.txt "nvidia-modelopt[all]"
+
+    See the compatibility/constraints documentation for your chosen image.
```
41-46
: Fix env vars: remove /usr/include from LD_LIBRARY_PATH and provide aarch64 variant. “/usr/include” does not belong on LD_LIBRARY_PATH, and we need aarch64 paths per requirements.
Apply this diff:
```diff
-.. code-block:: shell
-
-    export PIP_CONSTRAINT=""
-    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib"
-    export PATH="${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin"
+.. tabs::
+
+    .. tab:: x86_64
+
+        .. code-block:: shell
+
+            export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+            export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+
+    .. tab:: aarch64 (SBSA)
+
+        .. code-block:: shell
+
+            export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+            export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
🧹 Nitpick comments (4)
examples/diffusers/cache_diffusion/requirements.txt (2)
1-1
: Add a lower bound (or drop the pin in container flows) to avoid mismatched CUDA bindings.
cuda-python<13
protects against a future major, but pip may pick an older 11.x in non-container envs, or fight with preinstalled bindings in TRT-LLM/TRT/PyTorch images. Prefer>=12,<13
(or the minimum you test, e.g.,>=12.2,<13
). If examples are run only inside NVIDIA base images, consider omittingcuda-python
here to avoid downgrades.Apply:
-cuda-python<13 +cuda-python>=12,<13
2-5
: Centralize shared pins to prevent drift across examples.OpenCV, PEFT, Polygraphy, and SentencePiece pins recur in diffusers examples. Move common versions into a top-level constraints.txt and reference via
-c constraints.txt
from per-example requirements to keep them in sync.Would you like a follow-up PR patch that introduces constraints.txt and rewires examples to use it?
docs/source/getting_started/_installation_for_Linux.rst (2)
47-47
: Grammar + add actionable instruction.Tweak possessive and show how to install extras.
Apply this diff:
- You may need to install additional dependencies from the respective examples's `requirements.txt` file. + You may need to install additional dependencies from the respective examples' ``requirements.txt`` file, e.g.:: + + pip install -r examples/<example>/requirements.txt
80-81
: Tighten wording (clarity + brand).Minor phrasing edits; “Hugging Face” spelling.
Apply this diff:
- issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt's - features locally for example, quantizing a HuggingFace model and then use a docker image for deployment. + issues, we recommend using a Docker image for a seamless experience. You may still choose to use other ModelOpt + features locally—for example, quantizing a Hugging Face model—and then use a Docker image for deployment.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- .github/workflows/example_tests.yml (1 hunks)
- .github/workflows/gpu_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
- examples/diffusers/cache_diffusion/requirements.txt (1 hunks)
- examples/diffusers/quantization/requirements.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/diffusers/quantization/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (3)
- .github/workflows/example_tests.yml
- .github/workflows/gpu_tests.yml
- .gitlab/tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (2)
examples/diffusers/cache_diffusion/requirements.txt (1)
1-1
: Confirmed: cuda-python constraint consistent across examples.
Both examples/diffusers/cache_diffusion/requirements.txt and examples/diffusers/quantization/requirements.txt use cuda-python<13.
docs/source/getting_started/_installation_for_Linux.rst (1)
33-47
: Remove export of an empty PIP_CONSTRAINT; add constraints guidance next to the pip install step. Replace the exported empty env var and add this tip in docs/source/getting_started/_installation_for_Linux.rst (around lines 33–47):
.. tip::
    If you are inside an NVIDIA container, install with a constraints file to preserve the container's stack, e.g.::

        pip install -U -c /opt/nvidia/constraints/<stack>/constraints.txt "nvidia-modelopt[all]"

No constraints files were found in the repo; confirm whether the repo ships constraints and update the example path or remove the tip accordingly.
What does this PR do?
Testing
Summary by CodeRabbit
Documentation
Bug Fixes
Chores / CI
Other