Deprecate ModelOpt custom docker and directly use TRT-LLM / PyTorch / TRT docker #346
Conversation
Walkthrough
Removed repository Docker artifacts and ModelOpt Docker image references; updated installation docs to prefer TensorRT-LLM/TensorRT images and local virtualenvs; refactored GitLab CI to tag/manual gating and reorganized GPU/example jobs; moved GitHub Actions env setup into runtime steps; small example/test fixes (JSON list wrapping, removed one sparsity test case); removed CODEOWNERS.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant GH as GitHub Actions
    participant Runner as Container
    participant Setup as "Setup environment variables" step
    participant Tests as "Run example tests"
    GH->>Runner: start example-tests-pr job
    Note over Runner: container env no longer sets LD_LIBRARY_PATH/PATH
    GH->>Setup: run "Setup environment variables" step
    Setup-->>GH: append LD_LIBRARY_PATH and PATH to $GITHUB_ENV
    GH->>Tests: run example tests with updated env
    Tests-->>GH: return results
```
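For reference, a minimal sketch of what such a runtime step could look like; the step name matches the diagram, but the exact paths are assumptions based on the TensorRT locations discussed later in this review:

```yaml
# Hypothetical GitHub Actions step appending env vars at runtime via $GITHUB_ENV;
# paths are illustrative, not the PR's exact values.
- name: Setup environment variables
  run: |
    echo "LD_LIBRARY_PATH=/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:$LD_LIBRARY_PATH" >> "$GITHUB_ENV"
    echo "PATH=/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH" >> "$GITHUB_ENV"
```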
```mermaid
sequenceDiagram
    autonumber
    participant GL as GitLab CI
    participant Rules as Rules (tag/manual)
    participant Multi as .multi-gpu-tests-default
    participant Example as example-torch / example-trtllm / example-onnx
    GL->>Rules: evaluate rules (tag =~ /^\d+\.\d+\.\d$/ or manual)
    alt matched
        GL->>Multi: start multi-GPU default (before_script sets env, installs git-lfs)
        Multi->>Example: spawn example jobs (parallel matrices)
        Example-->>GL: job results (some allow_failure)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (13)
CONTRIBUTING.md (1)
14-14: LGTM. Link-only replacement is fine. Optional: add a short hint about using the TensorRT-LLM release images as the recommended container baseline to keep this consistent with README.
examples/diffusers/README.md (4)
15-15: Fix typo: “cahce” → “cache”. Apply:

```diff
-| Support Matrix | View the support matrix to see quantization/cahce diffusion compatibility and feature availability across different models | [[Link](#support-matrix)] | [[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/1_quantization.html)] |
+| Support Matrix | View the support matrix to see quantization/cache diffusion compatibility and feature availability across different models | [[Link](#support-matrix)] | [[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/1_quantization.html)] |
```
374-375: Grammar: “achieve” → “achieved”. Also consider “is achieved” → “is achieved by”.

```diff
- Significant speedup is achieve through skipping certain blocks at the specific steps.
+ Significant speedup is achieved by skipping certain blocks at specific steps.
```
379-379: Nit: trailing double period.

```diff
-... would..
+... would.
```
480-480: Fix typo: “requirments.txt” → “requirements.txt”.

```diff
-pip install -r eval/requirments.txt
+pip install -r eval/requirements.txt
```

CHANGELOG.rst (1)
4-4: Finalize release date before publishing 0.37. Replace “2025-09-xx” with the actual date (e.g., 2025-09-19) at release cut.
examples/llm_qat/README.md (2)
32-33: LGTM; optional clarifications.
- Consider adding a short note that NVIDIA NGC “tensorrt-llm/release:” images are the recommended base with ModelOpt included.
- Link to the exact section in installation docs that covers env PATH/LD_LIBRARY_PATH inside containers.
314-314: Naming consistency: use “TensorRT-LLM” (or TRT-LLM) consistently. “TRTLLM” (no dash) appears here; prefer “TRT-LLM”.

```diff
-# To run QAT model with TRTLLM, run:
+# To run QAT model with TRT-LLM, run:
```

examples/llm_ptq/README.md (1)
35-36: LGTM; mirrors QAT README update. Optional: add a one-liner pointing to recommended NGC image (tensorrt-llm/release:).
tests/examples/README.md (1)
12-18: LGTM; add a note on auth to avoid rate limits. Consider mentioning HF auth (HF_TOKEN/HF_HOME) for tests that pull from Hugging Face to reduce flakiness.
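A minimal sketch of what such a note could translate to in CI config; the variable names follow Hugging Face conventions (HF_TOKEN, HF_HOME), while the job name and cache path are assumptions:

```yaml
# Hypothetical GitLab CI fragment; HF_TOKEN is expected to come from a masked
# CI/CD variable, and the HF_HOME cache location is illustrative only.
example:
  variables:
    HF_HOME: "$CI_PROJECT_DIR/.cache/huggingface"  # reuse downloads across jobs
  cache:
    paths:
      - .cache/huggingface
  script:
    - test -n "$HF_TOKEN" || echo "HF_TOKEN not set; anonymous pulls may be rate-limited"
```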
examples/onnx_ptq/README.md (1)
27-27: Clarify example-scoped Dockerfile and update README.
examples/onnx_ptq/docker/Dockerfile exists; README (examples/onnx_ptq/README.md lines 25–33) still links ./docker/Dockerfile and invokes ./docker/build.sh. Update the README to explicitly state this Dockerfile is specific to the ONNX PTQ example (not repo-wide); if repo-level Docker is deprecated, replace these instructions with the recommended NGC image guidance or document the deprecation.
docs/source/getting_started/_installation_for_Linux.rst (2)
48-48: Minor: wording improvement needed. The phrase "Alternative NVIDIA docker images" could be clearer.

```diff
- **Alternative NVIDIA docker images**
+ **Alternative NVIDIA Docker images**
```
89-92: Verify consistency with docker image deprecation. The text mentions "If you build and use ModelOpt's docker image" but the PR is deprecating the custom ModelOpt docker. This section should be updated to reflect the new reality.

```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the suggested TensorRT-LLM or other NVIDIA docker images, ModelOpt may be pre-installed with some of the below optional dependencies.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- .github/CODEOWNERS (0 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
💤 Files with no reviewable changes (4)
- .github/CODEOWNERS
- docker/build.sh
- docker/README.md
- docker/Dockerfile
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-16T20:14:34.768Z
Learnt from: realAsma
PR: NVIDIA/TensorRT-Model-Optimizer#318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:191-191
Timestamp: 2025-09-16T20:14:34.768Z
Learning: The TensorRT-Model-Optimizer project only supports PyTorch >= 2.6, so using the `weights_only` parameter in torch.load calls is acceptable and doesn't require backward compatibility handling.
Applied to files:
CHANGELOG.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (14)
examples/diffusers/README.md (1)
40-40: LGTM. Matches repo-wide guidance to centralize install steps in the installation docs.
README.md (2)
64-64: LGTM. Using “-U” to upgrade is a helpful tweak.
67-67: LGTM. Wording is clearer (“latest features”).
examples/llm_ptq/README.md (1)
127-135: Staleness check on footnote 5 — no update required (keep "TensorRT‑LLM v0.17 or later"). v0.17.0 introduced Blackwell and NVFP4 support; subsequent 0.x releases exist and no v1.x TensorRT‑LLM release was found — optionally change to "v0.17.0 or later" for precision. (github.com)
CHANGELOG.rst (1)
9-9: Align TensorRT-LLM version strings across the repo. Mixed references to "1.1.0rc2" and "1.1.0rc2.post2" were found (CHANGELOG.rst New Features, README/docs table, example scripts/notebooks using nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2 — e.g. examples/llm_ptq scripts/huggingface_example.sh, plus CI/version checks). Choose a single canonical tag (match CI/packaging), update version checks and docker image tags/docs, and re-run a grep to confirm no remaining mismatches.
docs/source/getting_started/_installation_for_Linux.rst (3)
5-26: LGTM! Great addition of system requirements table. The new table provides clear guidance on supported OS, architecture, Python versions, and dependencies. This aligns well with the shift to TensorRT-LLM docker images.
33-37: LGTM! Clear migration to TensorRT-LLM docker image. The updated guidance properly directs users to the TensorRT-LLM docker image instead of a custom ModelOpt docker, which aligns with the PR objectives to deprecate the custom dockerfile.
54-55: LGTM! Clear guidance for ONNX PTQ users. The reference to the onnx_ptq Dockerfile and mention of the latest TensorRT version provides helpful context for users who need ONNX PTQ functionality.
.gitlab/tests.yml (6)
1-4: LGTM! Proper addition of PIP_CONSTRAINT variable. The addition of the `PIP_CONSTRAINT: ""` variable to disable pip constraints aligns with the upgrade-focused workflow mentioned in the PR changes.
6-8: LGTM! Simplified rules configuration. The simplified rules structure is cleaner and maintains the same functionality: scheduled pipelines run automatically, others are manual.
27-29: LGTM! Job rename and timeout adjustment. The rename from `gpu` to `multi-gpu` is more descriptive, and the increased timeout from 60m to 90m provides better reliability for multi-GPU tests.
44-48: LGTM! Migration to TensorRT-LLM docker image. The switch to `nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2` aligns perfectly with the PR objectives to use TensorRT-LLM docker instead of custom ModelOpt docker. The environment variables properly set up TensorRT paths.
55-60: LGTM! Sophisticated failure handling logic. The new rules configuration provides granular control over test failures, allowing failures only for specific examples when scheduled, which is a significant improvement over the previous blanket `allow_failure` approach.
50-50: Verify EXAMPLE_ALLOWED_FAILURES is defined or fix the substitution. .gitlab/tests.yml (≈line 50): `ALLOWED_FAILURES: ${EXAMPLE_ALLOWED_FAILURES:llm_qat}`
- Repo search returned no matches for EXAMPLE_ALLOWED_FAILURES — either add this key to your GitLab CI/CD variables (project/group/instance) or define a default in the repo.
- The expression `${EXAMPLE_ALLOWED_FAILURES:llm_qat}` looks incorrect for a default and may not perform bash-style fallback in GitLab/YAML; move default logic into the job script or predefine the variable. (devgem.io)
Force-pushed from fabbe02 to 70c4652
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Additional details and impacted files:

```diff
@@           Coverage Diff            @@
##             main     #346    +/-  ##
=======================================
  Coverage   73.83%   73.84%
=======================================
  Files         172      172
  Lines       17453    17453
=======================================
+ Hits        12887    12888     +1
+ Misses       4566     4565     -1
```

☔ View full report in Codecov by Sentry.
Force-pushed from 70c4652 to 613df8d
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
43-44: Add aarch64 (ARM64) TensorRT env var examples or mark the snippet as x86_64-only. The TensorRT installation script detects architecture with `if [ "$ARCH" = "aarch64" ];then ARCH="sbsa";fi`, indicating that aarch64 systems use "sbsa" in TensorRT paths. The documentation shows x86_64-only paths but should either include aarch64 equivalents or explicitly state it's x86_64-only.
Apply this diff to add aarch64 examples:

```diff
 You would also need to setup appropriate environment variables for the TensorRT binaries as follows:

 .. code-block:: shell

+    # For x86_64 systems:
     export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
     export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+
+    # For aarch64 systems:
+    export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
💤 Files with no reviewable changes (4)
- docker/README.md
- .github/CODEOWNERS
- docker/build.sh
- docker/Dockerfile
✅ Files skipped from review due to trivial changes (1)
- examples/llm_ptq/README.md
🚧 Files skipped from review as they are similar to previous changes (7)
- CONTRIBUTING.md
- examples/diffusers/README.md
- examples/llm_qat/README.md
- CHANGELOG.rst
- examples/onnx_ptq/README.md
- tests/examples/README.md
- README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (6)
docs/source/getting_started/_installation_for_Linux.rst (2)
33-37: LGTM! Clear migration from custom to TensorRT-LLM Docker. The documentation effectively communicates the shift from ModelOpt's custom Docker to the standardized TensorRT-LLM image, which aligns with the PR objectives.
54-55: ONNX PTQ Dockerfile reference verified. examples/onnx_ptq/docker/Dockerfile is present; no changes needed.
.gitlab/tests.yml (4)
3-4: LGTM! Proper pip constraint handling. Adding `PIP_CONSTRAINT: ""` allows package upgrades without constraint conflicts, which is appropriate for testing the latest ModelOpt version.
55-57: Environment variables properly configured for GitLab CI. The environment setup correctly mirrors the GitHub workflow changes, ensuring consistent TensorRT paths across both CI systems.
65-71: Good isolation of failing tests. Moving `llm_qat` to a separate `allow_failure: true` job prevents it from blocking the main pipeline while tracking the issue for future resolution.
28-28: Job renamed appropriately for clarity. The rename from `gpu` to `multi-gpu` better reflects the job's purpose of running tests on 2-GPU configurations.
Force-pushed from 613df8d to 39b57bc
Force-pushed from aa9108b to e8b201f
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
39-44: Architecture-specific environment variables need clarification. This is a known issue from previous reviews. The environment variable examples are x86_64-specific, but the system requirements table (line 13) indicates support for both x86_64 and aarch64 (SBSA) architectures.
🧹 Nitpick comments (2)
docs/source/getting_started/_installation_for_Linux.rst (1)
89-92: Verify consistency with docker deprecation strategy. The text still references "ModelOpt's docker image" (line 89) and suggests building it, which contradicts the PR's objective of deprecating the custom ModelOpt dockerfile. This language should be updated to reflect that the custom dockerfile is no longer recommended. Based on the search results, TensorRT-LLM docker images do include ModelOpt as a dependency; the text should be updated to reflect the deprecation of the custom ModelOpt dockerfile. Apply this diff to update the text:

```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the recommended TensorRT-LLM docker image or other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
```

.gitlab/tests.yml (1)
8-9: Tag-based rule may need semantic versioning validation. The regex pattern `^\d+\.\d+\.\d+$` matches basic semantic versioning but doesn't account for pre-release tags (like `rc`, `alpha`, `beta`) that might be common in this project. Consider using a more comprehensive regex pattern that includes pre-release identifiers:

```diff
-    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
+    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+(-[a-zA-Z0-9\.\-]+)?$/
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- .github/CODEOWNERS
- docker/build.sh
- docker/README.md
- docker/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (9)
- examples/llm_ptq/README.md
- CHANGELOG.rst
- CONTRIBUTING.md
- examples/diffusers/README.md
- examples/onnx_ptq/README.md
- README.md
- tests/examples/README.md
- examples/llm_qat/README.md
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (12)
docs/source/getting_started/_installation_for_Linux.rst (3)
33-37: Clear documentation improvement aligning with PR objectives. The updated text clearly guides users toward the TensorRT-LLM docker image as the recommended approach, which aligns perfectly with the PR's goal of deprecating the custom ModelOpt dockerfile. The guidance to upgrade Model Optimizer if needed is practical and helpful.
48-48: Minor wording improvement enhances readability. The change from "Using alternative NVIDIA docker images" to "Alternative NVIDIA docker images" removes unnecessary verbosity and improves the section header clarity.
54-55: ONNX PTQ guidance provides valuable context. The updated text effectively explains the benefit of using the onnx_ptq Dockerfile: access to the latest TensorRT version with cutting-edge features. This helps users understand why they might choose this option over alternatives.
tests/examples/llm_qat/test_llm_qat.py (4)
39-45: Backend test coverage is properly adjusted for CI stability. The `xfail` markings align with the stable v1.1.0rc2.post2 release while maintaining coverage for working backends (fsdp2, deepspeed). This prevents CI failures while preserving test value for stable backends.
71-76: Consistent backend parameterization applied correctly. The same backend handling approach is consistently applied across both QAT test variants, maintaining test coverage while acknowledging known issues.
97-97: Explicit backend specification improves test reliability. Adding the `--backend fsdp2` parameter ensures the test runs with a stable backend configuration, improving test predictability.
102-102: Test failure handling improved with xfail approach. Changing from `skip` to `xfail` is beneficial as it still executes the test and captures the failure, providing better visibility into the actual issue versus simply skipping it.
.gitlab/tests.yml (5)
59-59: Installation command includes correct package extras. The `pip install ".[all,dev-test]"` command properly installs the package with all required extras for testing, which is appropriate for the CI environment.
1-1: Documentation reference verified — no action required. All three referenced workflows (.github/workflows/unit_tests.yml, .github/workflows/gpu_tests.yml, .github/workflows/example_tests.yml) exist and include the consistency note referencing .gitlab/tests.yml.
3-4: Verify and justify disabling pip constraints in CI. `PIP_CONSTRAINT: ""` disables pip constraints — this risks non-reproducible installs, dependency conflicts, and pulling unvetted/vulnerable package versions.
- Location: .gitlab/tests.yml (lines 3-4).
- If accidental: revert to the repository constraints value.
- If intentional: limit the change to a single job or guard it with an explicit opt-in env var and document why.
- Before merging: run full CI, reproduce installs in a clean environment, run dependency checks (pip check / pip-tools / pipdeptree) and a security audit (pip-audit or SCA) to confirm no resolution errors or CVEs; a sketch of such a verification job follows.
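A sketch of how the suggested checks could run in CI; the tool names (pip check, pip-audit) come from the list above, while the job and stage names are assumptions:

```yaml
# Hypothetical verification job: fails on dependency-resolution conflicts
# (pip check) or known CVEs in installed packages (pip-audit).
audit-deps:
  stage: tests
  script:
    - pip install ".[all,dev-test]"
    - pip check          # detect broken or conflicting requirements
    - pip install pip-audit
    - pip-audit          # scan the environment for known vulnerabilities
```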
46-46: Verify TensorRT-LLM Docker image availability and compatibility. .gitlab/tests.yml:46 — `image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2`.
Sandbox run failed (docker not available; manifest inspect failed). Manually verify the image is pullable and contains the runtime/tools required for ModelOpt tests.
Run locally:

```shell
docker manifest inspect nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
docker run --rm nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2 bash -c "python --version; ls -la /usr/local/tensorrt/targets/x86_64-linux-gnu/bin | head -5; echo \$LD_LIBRARY_PATH"
```

Checks: manifest returns success; Python present; TensorRT bin directory contains expected tools; LD_LIBRARY_PATH includes TensorRT libraries.
55-57: Verify TensorRT library paths exist in the CI/container. Verification failed: "Cannot verify paths (image may not be available locally)". Ensure the CI runner pulls/uses the expected image and verify these paths exist at runtime.
File: .gitlab/tests.yml, lines 55-57:

```yaml
- export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
# Add trtexec to path
- export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
```

Run locally or in CI to check:

```shell
docker pull nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
docker run --rm nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2 bash -c "\
  ls -la /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/ 2>/dev/null || echo 'TensorRT lib path not found'; \
  [ -x /usr/local/tensorrt/targets/x86_64-linux-gnu/bin/trtexec ] && echo 'trtexec found' || echo 'trtexec not found'; \
  ls -la /usr/lib/x86_64-linux-gnu/libcudnn* 2>/dev/null || echo 'cudnn not found' \
"
```

If the paths/tools are absent, update the job to pull the correct image, make the exports conditional (check for dirs/files before exporting), or detect the correct runtime paths and update .gitlab/tests.yml.
Actionable comments posted: 2
🧹 Nitpick comments (4)
.gitlab/tests.yml (4)
3-4: Don't globally disable pip constraints. Setting PIP_CONSTRAINT="" at the template level weakens reproducibility for all jobs and can unintentionally upgrade preinstalled libs in vendor images (e.g., TRT-LLM), risking ABI mismatches.
Apply this to scope the behavior only where needed (or drop it entirely if accidental):

```diff
 .tests-default:
-  variables:
-    PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+  variables: {}
@@
 example:
   extends: .tests-default
+  variables:
+    # If you intentionally want to bypass constraints only for examples, keep it scoped here.
+    PIP_CONSTRAINT: ""
```

If you do have a constraints file, prefer pinning instead:

```diff
-    - pip install ".[all,dev-test]"
+    - pip install -c tests/constraints.txt ".[all,dev-test]"
```
36-37: Export LD_LIBRARY_PATH earlier and guard for missing paths. Doing this inside script works, but guarding avoids polluting LD_LIBRARY_PATH with non-existent dirs and improves portability across image variants.

```diff
 multi-gpu:
 @@
-  script:
-    # Add libcudnn*.so and libnv*.so to path
-    - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+  before_script:
+    # Add libcudnn*.so and libnv*.so to path (guard if dirs exist)
+    - for d in /usr/lib/x86_64-linux-gnu /usr/local/tensorrt/targets/x86_64-linux-gnu/lib; do
+        [ -d "$d" ] && export LD_LIBRARY_PATH="$d:${LD_LIBRARY_PATH}";
+      done
   script:
```
46-47: Pin or parameterize the TRT-LLM image; avoid RC drift. Using an RC tag (1.1.0rc2.post2) can change over time or be removed. Prefer a digest, or at least parameterize via a variable to centralize updates.

```diff
 example:
   extends: .tests-default
   stage: tests
   timeout: 30m
-  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
+  variables:
+    TRT_LLM_IMAGE: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
+  image: $TRT_LLM_IMAGE
```

If possible, pin by digest for reproducibility:

```yaml
image: nvcr.io/nvidia/tensorrt-llm/release@sha256:<digest>
```

Also ensure the CI runner has pull access to nvcr.io for this repo.
55-58: Harden path exports; avoid adding non-existent dirs. Minor portability tweak to keep envs clean across images.

```diff
 before_script:
-  # Add libcudnn*.so and libnv*.so to path
-  - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-  # Add trtexec to path
-  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
+  # Add libcudnn*.so and libnv*.so to path
+  - for d in /usr/lib/x86_64-linux-gnu /usr/local/tensorrt/targets/x86_64-linux-gnu/lib; do
+      [ -d "$d" ] && export LD_LIBRARY_PATH="$d:${LD_LIBRARY_PATH}";
+    done
+  # Add trtexec to path
+  - if [ -d /usr/local/tensorrt/targets/x86_64-linux-gnu/bin ]; then export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"; fi
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/build.sh
- docker/Dockerfile
- .github/CODEOWNERS
- docker/README.md
🚧 Files skipped from review as they are similar to previous changes (11)
- CONTRIBUTING.md
- tests/examples/llm_qat/test_llm_qat.py
- examples/diffusers/README.md
- examples/llm_ptq/README.md
- README.md
- .github/workflows/example_tests.yml
- examples/llm_qat/README.md
- examples/onnx_ptq/README.md
- docs/source/getting_started/_installation_for_Linux.rst
- tests/examples/README.md
- CHANGELOG.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (2)
.gitlab/tests.yml (2)
1-1: Good cross-repo consistency note. Keeping CI config aligned with GitHub workflows is helpful. Please maintain this in future edits.
28-28: Job rename to “multi-gpu” makes intent clearer. Name better reflects resource use. No issues spotted.
Force-pushed from e8b201f to 7fd1f74
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/examples/llm_qat/test_llm_qat.py (3)
97-98: Pinned fsdp2 for LoRA QAT: confirm intent; consider parity with other tests. If stability permits, consider parameterizing over runnable backends to keep parity with other QAT tests; otherwise keep fsdp2 pinned.

```diff
-def test_llama_lora_qat_nvfp4(tiny_llama_path, tmp_path):
+@pytest.mark.parametrize("backend", ["fsdp2", "deepspeed"])
+def test_llama_lora_qat_nvfp4(tiny_llama_path, tmp_path, backend):
@@
-        "--output_dir", tmp_path / "lora_qat",
-        "--backend", "fsdp2",
+        "--output_dir", tmp_path / "lora_qat",
+        "--backend", backend,
```
102-103: Make xfail non-strict and reference a tracking issue. Prevents CI failures on unexpected pass if project-wide strict xfail is enabled and adds traceability.

```diff
-@pytest.mark.xfail(reason="FIXME: QLoRa test failure")
+@pytest.mark.xfail(strict=False, reason="FIXME: QLoRa test failure; see #<issue>")
```
41-45: Consolidate backend parametrization and add tracker IDs. Skips for fsdp1/ddp are appropriate given hangs; align PR text (allowed-to-fail vs skip), add tracker issue IDs in skip reasons, and de-duplicate the backend list into a shared BACKENDS constant reused by both parametrize decorators. Verified examples/llm_qat/launch.sh accepts BACKEND values: fsdp1, fsdp2, ddp, deepspeed.

```diff
-@pytest.mark.parametrize("backend", [
-    pytest.param("fsdp1", marks=pytest.mark.skip(reason="FIXME: fsdp1 test hangs")),
-    "fsdp2",
-    "deepspeed",
-    pytest.param("ddp", marks=pytest.mark.skip(reason="FIXME: ddp test hangs")),
-])
+BACKENDS = [
+    pytest.param("fsdp1", marks=pytest.mark.skip(reason="FIXME: fsdp1 test hangs; see #<issue>"), id="fsdp1"),
+    pytest.param("fsdp2", id="fsdp2"),
+    pytest.param("deepspeed", id="deepspeed"),
+    pytest.param("ddp", marks=pytest.mark.skip(reason="FIXME: ddp test hangs; see #<issue>"), id="ddp"),
+]
+@pytest.mark.parametrize("backend", BACKENDS)
 def test_llama_qat_int4w_int8a(tiny_llama_path, tmp_path, backend):
@@
-@pytest.mark.parametrize("backend", [
-    pytest.param("fsdp1", marks=pytest.mark.skip(reason="FIXME: fsdp1 test hangs")),
-    "fsdp2",
-    "deepspeed",
-    pytest.param("ddp", marks=pytest.mark.skip(reason="FIXME: ddp test hangs")),
-])
+@pytest.mark.parametrize("backend", BACKENDS)
 def test_llama_qat_int4w_int8a_direct_qat(tiny_llama_path, tmp_path, backend):
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/build.sh
- .github/CODEOWNERS
- docker/Dockerfile
- docker/README.md
✅ Files skipped from review due to trivial changes (1)
- README.md
🚧 Files skipped from review as they are similar to previous changes (10)
- examples/diffusers/README.md
- examples/llm_ptq/README.md
- examples/llm_qat/README.md
- CONTRIBUTING.md
- CHANGELOG.rst
- examples/onnx_ptq/README.md
- .gitlab/tests.yml
- tests/examples/README.md
- .github/workflows/example_tests.yml
- docs/source/getting_started/_installation_for_Linux.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (1)
tests/examples/llm_qat/test_llm_qat.py (1)
39-39: Whitespace-only change. No action needed.
Force-pushed from 7fd1f74 to 41d36ff
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
89-93: Stale reference to ModelOpt's custom docker image. This PR deprecates the custom ModelOpt image; these lines still refer to building/using it. Replace with TRT-LLM image guidance.

```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the recommended TensorRT‑LLM docker image, ModelOpt is pre-installed with many optional dependencies.
+For other suggested NVIDIA images, ModelOpt may be pre-installed with some optional dependencies.
```
♻️ Duplicate comments (3)
docs/source/getting_started/_installation_for_Linux.rst (1)
43-44: Add aarch64 equivalents or mark the snippet as x86_64-only. The env exports are x86_64-specific. Add aarch64 variants or a note that these lines target x86_64 only.
Apply one of the following:

```diff
 .. code-block:: shell

-    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # x86_64
+    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # aarch64 (SBSA)
+    # export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    # export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```

.gitlab/tests.yml (2)
58-58: Avoid unconstrained extras installs in vendor images. This can upgrade torch/transformers and break CUDA/TRT compat. Use constraints or no-deps; install example reqs first.

```diff
-    - pip install ".[all,dev-test]"
+    # Prefer pinned installs to avoid clobbering preinstalled CUDA/TRT stack
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -c "$PIP_CONSTRAINT" -r "$req_file" || exit 1; done
+    - pip install -c "$PIP_CONSTRAINT" ".[dev-test]"  # or: pip install --no-deps ".[dev-test]"
```
6-9: Unconditional “when: manual” disables default MR/branch pipelines. Make on_success the default; keep manual only for explicit sources.

```diff
 rules:
   - if: $CI_PIPELINE_SOURCE == "schedule"
-  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-  - when: manual
+    when: on_success
+  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/  # release tags
+    when: on_success
+  - if: $CI_PIPELINE_SOURCE == "web"  # allow manual web runs
+    when: manual
+  - when: on_success  # default for branches/MRs
```
🧹 Nitpick comments (7)
docs/source/getting_started/_installation_for_Linux.rst (3)
33-36
: Pin a known-good TRT-LLM tag in the example.Suggest showing a concrete, tested tag to avoid drift with deps in docs and CI.
- e.g., ``nvcr.io/nvidia/tensorrt-llm/release:<version>``. + e.g., ``nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2`` (update as releases advance).
37-37: Grammar nit: “if not already” → “if not already installed” or “if not already up to date”.

```diff
-  You may upgrade the Model Optimizer to the latest version if not already as described in the next section.
+  You may upgrade Model Optimizer to the latest version if not already installed, as described in the next section.
```
46-46: Fix typo: “examples's” → “example’s” (or “examples’” if plural).

```diff
-  You may need to install additional dependencies from the respective examples's `requirements.txt` file.
+  You may need to install additional dependencies from the respective example’s `requirements.txt` file.
```

.gitlab/tests.yml (4)
3-4: Reconsider disabling pip constraints globally. Blank PIP_CONSTRAINT invites uncontrolled upgrades inside vendor images; prefer constraints aligned to image baselines.

```diff
 .tests-default:
   variables:
-    PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+    PIP_CONSTRAINT: "tests/constraints/trt-llm-1.1.0rc2.post2.txt"  # Pin to image baseline; update per image
```

If no constraints file exists, generate and check one in per image/tag.
35-39: Arch-specific TensorRT paths; guard for aarch64 or absent installs. Make the exports conditional on arch and directory existence to avoid broken PATH/LD paths on ARM runners or non-TRT images.

```diff
-  # Add libcudnn*.so and libnv*.so to path
-  - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-  # Add trtexec to path
-  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
+  # Add TensorRT libs/tools if present (x86_64/aarch64)
+  - ARCH="$(uname -m)"; [ "$ARCH" = "aarch64" ] && GNU_ARCH="aarch64-linux-gnu" || GNU_ARCH="x86_64-linux-gnu"
+  - TRT_ROOT="/usr/local/tensorrt/targets/$GNU_ARCH"
+  - if [ -d "$TRT_ROOT/lib" ]; then export LD_LIBRARY_PATH="/usr/lib/$GNU_ARCH:$TRT_ROOT/lib:${LD_LIBRARY_PATH}"; fi
+  - if [ -d "$TRT_ROOT/bin" ]; then export PATH="$TRT_ROOT/bin:$PATH"; fi
```
31-39: Consider standardizing GPU images on TRT-LLM. Given the project shift, switching .gpu-tests-default to the TRT-LLM image can reduce env drift and simplify TRT tooling setup.

```diff
-  image: nvcr.io/nvidia/pytorch:25.06-py3
+  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
```

If Megatron/torch baselines require the PyTorch image, document the exception.
67-67: Parameterize TRT-LLM image tag. Keep docs/CI in sync and ease upgrades.

```diff
-  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
+  image: nvcr.io/nvidia/tensorrt-llm/release:${TRT_LLM_TAG:-1.1.0rc2.post2}
```

Define TRT_LLM_TAG at pipeline level to bump centrally.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/build.sh
- docker/README.md
- .github/CODEOWNERS
- docker/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (10)
- examples/llm_qat/README.md
- CHANGELOG.rst
- examples/diffusers/README.md
- README.md
- CONTRIBUTING.md
- examples/llm_ptq/README.md
- tests/examples/llm_qat/test_llm_qat.py
- examples/onnx_ptq/README.md
- tests/examples/README.md
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
Force-pushed from 41d36ff to 96b897f
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.gitlab/tests.yml (1)
58-64: Skip fsdp1 paths for llm_qat in CI (prevent blocking hangs). The EXAMPLE matrix in .gitlab/tests.yml still includes llm_qat; no pytest skip/xfail for fsdp/fsdp1 was found in tests/examples or tests, and tests/examples/test_llm_qat.sh is missing. Add skip/xfail for fsdp1 paths or remove llm_qat from the matrix to avoid CI hangs.
♻️ Duplicate comments (2)
.gitlab/tests.yml (2)
6-9: Fix rules: current config makes most pipelines manual-only. The unconditional "when: manual" at line 9 forces branch/MR runs to be manual. Make auto the default and keep a separate manual path for web-triggered runs.
Apply this diff:

```diff
 .tests-default:
   rules:
     - if: $CI_PIPELINE_SOURCE == "schedule"
-    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-    - when: manual
+      when: on_success
+    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/  # release tags
+      when: on_success
+    - if: $CI_PIPELINE_SOURCE == "web"  # allow manual reruns from web
+      when: manual
+    - when: on_success  # default for other pipelines
```
3-4: Avoid unconstrained installs on vendor images; wire PIP_CONSTRAINT and reorder installs. Unpinned extras can upgrade core libs inside TRT-LLM/NGC images. Pass an optional constraints file and install example requirements first.
Apply this diff:

```diff
 .tests-default:
   variables:
-    PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+    PIP_CONSTRAINT: ""  # If set, passed to pip (-c "$PIP_CONSTRAINT") to avoid upgrading vendor stack
@@
 example:
   extends: .gpu-tests-default
   timeout: 30m
@@
   script:
-    - pip install ".[all,dev-test]"
+    # Install example-specific deps first, under constraints if provided
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install ${PIP_CONSTRAINT:+-c "$PIP_CONSTRAINT"} -r "$req_file" || exit 1; done
+    # Then install project dev/test deps without clobbering CUDA/TensorRT; avoid "all" extras
+    - pip install ${PIP_CONSTRAINT:+-c "$PIP_CONSTRAINT"} ".[dev-test]"
@@
-    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -r "$req_file" || exit 1; done
+    # (moved above)
```

Also applies to: 60-64
🧹 Nitpick comments (1)
.gitlab/tests.yml (1)
35-41
: Guard TensorRT PATH exports or move them to TRT‑LLM images.These paths may not exist in nvcr.io/nvidia/pytorch:25.06‑py3. No harm exporting, but it’s cleaner to gate them.
Apply this diff:
before_script: - # Add libcudnn*.so and libnv*.so to path - - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" - # Add trtexec to path - - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH" + # Add TensorRT libs/tools only if present in the image + - if [ -d "/usr/local/tensorrt/targets/x86_64-linux-gnu" ]; then \ + export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"; \ + export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"; \ + fi # Install git-lfs for Daring-Anteater dataset - apt-get update && apt-get install -y git-lfs
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (3 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_qat/test_llm_qat.py (3 hunks)
💤 Files with no reviewable changes (4)
- docker/Dockerfile
- docker/README.md
- .github/CODEOWNERS
- docker/build.sh
✅ Files skipped from review due to trivial changes (2)
- examples/diffusers/README.md
- examples/onnx_ptq/README.md
🚧 Files skipped from review as they are similar to previous changes (9)
- CONTRIBUTING.md
- README.md
- .github/workflows/example_tests.yml
- docs/source/getting_started/_installation_for_Linux.rst
- tests/examples/llm_qat/test_llm_qat.py
- tests/examples/README.md
- CHANGELOG.rst
- examples/llm_ptq/README.md
- examples/llm_qat/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (1)
.gitlab/tests.yml (1)
52-54: Switch example job to TRT-LLM image (nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc3) and verify CI can pull it. The example job currently inherits nvcr.io/nvidia/pytorch:25.06-py3 from .gpu-tests-default; replace with the TRT-LLM image and prefer 1.1.0rc3 (NGC shows rc3 is newer than rc2.post2) or wait for an official 1.1.x stable. Confirm CI runners have access/credentials to pull nvcr.io images.
Actionable comments posted: 0
♻️ Duplicate comments (2)
.gitlab/tests.yml (2)
52-66: Consider constraining pip installs to avoid TRT-LLM compatibility issues. Installing unconstrained `".[all,dev-test]"` can upgrade core libraries (torch/transformers) in vendor TRT-LLM images and cause CUDA/TensorRT incompatibilities.
Options to mitigate:
- Use constraints file: `pip install -c constraints.txt ".[all,dev-test]"`
- Use `--no-deps` if extras only add tooling
- Install per-example requirements before extras to minimize upgrades

Example approach:

```diff
-    - pip install ".[all,dev-test]"
+    # Install with constraints to avoid clobbering preinstalled CUDA/TensorRT stack
+    - pip install -c tests/constraints.txt ".[all,dev-test]"
```
8-9: Fix .tests-default rules so jobs run by default (remove catch-all manual). The unconditional `- when: manual` rule as the last entry makes all non-schedule/non-tag pipelines (branches/MRs) manual by default. This defeats automatic CI execution on branch pushes and merge requests.
Apply this diff to fix the rules order:

```diff
 .tests-default:
   rules:
     - if: $CI_PIPELINE_SOURCE == "schedule"
-    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-    - when: manual
+      when: on_success
+    - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/  # release tags
+      when: on_success
+    - if: $CI_PIPELINE_SOURCE == "web"  # allow manual reruns from web
+      when: manual
+    - when: on_success  # default for all other pipelines
```
🧹 Nitpick comments (2)
examples/llm_sparsity/data_prep.py (2)
59-62: Write with explicit UTF-8; consider dropping pretty-print and/or JSONL streaming for scale.
- Since ensure_ascii=False, default locale encodings can corrupt output on some systems; open files with encoding="utf-8".
- CNN/DailyMail splits are large; indent=4 bloats files and I/O. Prefer compact JSON or JSONL streaming.

Minimal patch:

```diff
-with open(os.path.join(args.save_path, "cnn_train.json"), "w") as write_f:
-    json.dump(list(tokenized_dataset["train"]["text"]), write_f, indent=4, ensure_ascii=False)
+with open(os.path.join(args.save_path, "cnn_train.json"), "w", encoding="utf-8") as write_f:
+    json.dump(list(tokenized_dataset["train"]["text"]), write_f, ensure_ascii=False)
 with open(os.path.join(args.save_path, "cnn_eval.json"), "w") as write_f:
-    json.dump(list(tokenized_dataset["test"]["text"]), write_f, indent=4, ensure_ascii=False)
+    json.dump(list(tokenized_dataset["test"]["text"]), write_f, ensure_ascii=False)
```

Optional JSONL streaming (memory-lean, faster diffs):

```python
with open(os.path.join(args.save_path, "cnn_train.jsonl"), "w", encoding="utf-8") as f:
    for row in tokenized_dataset["train"]:
        f.write(json.dumps(row["text"], ensure_ascii=False) + "\n")
with open(os.path.join(args.save_path, "cnn_test.jsonl"), "w", encoding="utf-8") as f:
    for row in tokenized_dataset["test"]:
        f.write(json.dumps(row["text"], ensure_ascii=False) + "\n")
```
61-62: Naming nit: the “eval” file is built from the test split. Either rename to cnn_test.json or switch to the validation split for eval artifacts to avoid confusion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- .github/CODEOWNERS (0 hunks)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- CHANGELOG.rst (1 hunks)
- CONTRIBUTING.md (1 hunks)
- README.md (2 hunks)
- docker/Dockerfile (0 hunks)
- docker/README.md (0 hunks)
- docker/build.sh (0 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (1 hunks)
- examples/diffusers/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/llm_sparsity/data_prep.py (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- tests/examples/README.md (1 hunks)
- tests/examples/llm_sparsity/test_llama_sparsify.py (0 hunks)
💤 Files with no reviewable changes (5)
- docker/README.md
- .github/CODEOWNERS
- docker/build.sh
- docker/Dockerfile
- tests/examples/llm_sparsity/test_llama_sparsify.py
✅ Files skipped from review due to trivial changes (2)
- CHANGELOG.rst
- examples/onnx_ptq/README.md
🚧 Files skipped from review as they are similar to previous changes (8)
- examples/llm_qat/README.md
- CONTRIBUTING.md
- examples/llm_ptq/README.md
- tests/examples/README.md
- examples/diffusers/README.md
- .github/workflows/example_tests.yml
- docs/source/getting_started/_installation_for_Linux.rst
- README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (6)
examples/llm_sparsity/data_prep.py (1)
60-60: Good fix: ensure JSON-serializable column. Wrapping the datasets column with list(...) avoids pyarrow/Datasets column types leaking into json.dump. Looks correct.
Also applies to: 62-62
.gitlab/tests.yml (5)
3-4: LGTM! Environment setup looks correct. The `PIP_CONSTRAINT: ""` variable properly disables pip constraints to allow package upgrades, which aligns with the PR objective of switching to TRT-LLM Docker images where unconstrained installs may be necessary.
28-43: Good consolidation of GPU test configuration. The refactoring to `.gpu-tests-default` with proper library path setup and git-lfs installation is well-structured and consolidates common GPU test requirements.
44-49: LGTM! Clean inheritance pattern. The `multi-gpu` job properly extends the new `.gpu-tests-default` base, maintaining consistency with the refactored structure.
67-74: Good approach to isolate failing tests. The `example-failing` job with `allow_failure: true` for `llm_qat` tests is a pragmatic solution while the hanging issues are being investigated. The separate job prevents these failures from blocking the main pipeline.
78-78: Verify TensorRT-LLM image version availability. Couldn't confirm automatically — your script's NGC API call returned non-JSON and `docker manifest inspect` failed (parse error; "Image not accessible or doesn't exist"). Action: confirm `nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2` is published and pullable from NGC (log in to nvcr.io / use the NGC web UI or ngc CLI, then `docker pull`), or update the CI image to a known published tag (GitHub shows a v1.1.0rc2 release). (github.com)
LGTM. It would also be great if you could replace the default docker with the onnx ptq docker for ONNX unit and integration tests.
Signed-off-by: Keval Morabia <[email protected]>
Actionable comments posted: 0
♻️ Duplicate comments (4)
docs/source/getting_started/_installation_for_Linux.rst (1)
43-46: Fix env vars: drop /usr/include and show aarch64 variant.
- Don't add /usr/include to LD_LIBRARY_PATH.
- Provide both x86_64 and aarch64 paths (this was requested previously).

Apply this diff:

```diff
-    export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # x86_64
+    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # aarch64 (SBSA)
+    # export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    # export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```

.gitlab/tests.yml (3)
7-9: Rules force jobs to manual by default; make branch/MR pipelines auto. The unconditional "- when: manual" makes non-schedule, non-tag jobs manual. Use an explicit on_success default and a gated manual rule.

```diff
 rules:
   - if: $CI_PIPELINE_SOURCE == "schedule"
-  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
-  - when: manual
+    when: on_success
+  - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/
+    when: on_success
+  - if: $CI_PIPELINE_SOURCE == "web"  # allow manual reruns from web UI
+    when: manual
+  - when: on_success  # default for branches/MRs
```
59-60: Avoid unconstrained `pip install ".[all,dev-test]"` on vendor images. This can clobber preinstalled CUDA/TRT stacks.
Safer patterns:
- Install example requirements first, then extras with constraints:

```diff
-    - pip install ".[all,dev-test]"
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -r "$req_file" || exit 1; done
+    - pip install -c "$PIP_CONSTRAINT" ".[dev-test]"
```

- If extras are tooling-only: `pip install --no-deps ".[dev-test]"`.

Please confirm which path you prefer; I can prep a constraints file.
Also applies to: 93-95
3-4: Unconstrained upgrades increase breakage risk in vendor images. Setting PIP_CONSTRAINT to empty combined with later installs of ".[all,dev-test]" can upgrade torch/transformers inside TRT-LLM/TensorRT images and cause CUDA/TRT ABI mismatches. Prefer pinned installs.
Option A (recommended): restore constraints per image baseline.

```diff
 variables:
-  PIP_CONSTRAINT: ""  # Disable pip constraint for upgrading packages
+  PIP_CONSTRAINT: "tests/constraints/trtllm-1.1.0rc2.txt"
```

Option B: when extras are tooling-only, use --no-deps and install per-example requirements first.
I can help derive a constraints file from your current images.
🧹 Nitpick comments (5)
examples/onnx_ptq/README.md (1)
27-27: Docs link change LGTM; minor cross-link nit. Consider also mentioning the specific NGC images (TensorRT-LLM and TensorRT) here to mirror the Linux install guide's recommendations for ONNX/TensorRT workflows. Optional only.
docs/source/getting_started/_installation_for_Linux.rst (3)
33-35: Good shift to TRT-LLM image; add a concrete tag example. Add one explicit, known-good tag example (e.g., release:1.1.0rc2.post2) alongside the placeholder to reduce guesswork.
Also applies to: 37-41
46-46: Grammar nit: “examples's” → “example’s”.
79-81: Wording clarity. Tighten phrasing to: "You may still use other ModelOpt features locally (e.g., quantize a Hugging Face model) and use a docker image for deployment."
.gitlab/tests.yml (1)
35-43: Fix TensorRT env exports: remove /usr/include and add aarch64 alt. Mirror the docs fix here to avoid polluting LD_LIBRARY_PATH and to support aarch64 runners.

```diff
-  - export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+  - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+  - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+  # For aarch64 runners:
+  # - export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+  # - export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .github/workflows/example_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- CHANGELOG.rst (1 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
- examples/llm_autodeploy/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- examples/onnx_ptq/docker/Dockerfile (0 hunks)
- examples/onnx_ptq/docker/build.sh (0 hunks)
💤 Files with no reviewable changes (2)
- examples/onnx_ptq/docker/Dockerfile
- examples/onnx_ptq/docker/build.sh
✅ Files skipped from review due to trivial changes (2)
- examples/llm_autodeploy/README.md
- CHANGELOG.rst
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
39-44
: Fix LD_LIBRARY_PATH content and provide arch‑aware, container‑aware exports.
- LD_LIBRARY_PATH must not include /usr/include.
- Snippet is x86_64‑only; please include aarch64.
- In official NGC containers these paths are usually set; call out that the exports are primarily for bare‑metal or tarball installs.
Apply:
```diff
-You would also need to setup appropriate environment variables for the TensorRT binaries as follows:
+If you installed TensorRT from a tarball on bare‑metal, set the environment variables for the binaries/libraries as follows.
+In official NGC containers, these are typically preconfigured and this step can be skipped.
@@
-    export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-    export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    # x86_64
+    if [ "$(uname -m)" = "x86_64" ]; then
+        export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+        export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+    fi
+    # aarch64
+    if [ "$(uname -m)" = "aarch64" ]; then
+        export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+        export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
+    fi
```
🧹 Nitpick comments (6)
docs/source/getting_started/_installation_for_Linux.rst (6)
46-46
: Grammar: fix possessive. Change “examples's” to “example's” (or “examples'” if plural is intended).
```diff
-You may need to install additional dependencies from the respective examples's `requirements.txt` file.
+You may need to install additional dependencies from the respective example's ``requirements.txt`` file.
```
48-48
: Add NGC login tip for pulling containers. Many users hit auth errors on nvcr.io. Add a one‑liner to login.
```diff
 **Alternative NVIDIA docker images**
+
+.. tip::
+    Authenticate to NGC before pulling images: ``docker login nvcr.io`` (requires an NVIDIA API key).
```
54-55
: Avoid unqualified performance claim. “provides superior performance” is broad and context‑dependent. Rephrase to reflect scope.
```diff
-which provides superior performance to the PyTorch container.
+which is optimized for TensorRT inference workflows and deployment.
```
79-80
: Tighten phrasing and possessives. Minor clarity/grammar improvements.
```diff
-issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt's
-features locally for example, quantizing a HuggingFace model and then use a docker image for deployment.
+issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt
+features locally—for example, quantizing a HuggingFace model—then use a docker image for deployment.
```
88-91
: Update text to reflect deprecation of the custom ModelOpt docker image. This still refers to “build and use ModelOpt's docker image,” which was deprecated in this PR.
```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
+If you use the recommended NVIDIA containers above, you can skip this step when the image already contains ModelOpt and
+the optional dependencies you need.
```
10-26
: Unpin system requirements; update TensorRT‑LLM and clarify PyTorch minimum. Verified: NGC TensorRT‑LLM latest tag is 1.1.0rc3 (modified Sep 4, 2025); NVIDIA TensorRT production branch containers are 25.03; NVIDIA PyTorch container builds vary (examples: 25.01 → PyTorch 2.6.x, 25.05/25.06 → 2.8); ModelOpt lists PyTorch (optional) >= 2.2 (not >= 2.6).
- Edit docs/source/getting_started/_installation_for_Linux.rst (lines 10–26): replace hard pins with “tested with” language and link an authoritative compatibility matrix / NGC container tags.
- Replace "TensorRT‑LLM (Optional) | 1.1.0rc2.post2" with either "tested with 1.1.x (NGC 1.1.0rc3)" or update the entry to the exact NGC tag 1.1.0rc3.
- Replace "PyTorch | >=2.6" with wording that distinguishes minimum vs tested versions, e.g. "PyTorch | >=2.2 (tested with 2.6–2.8 in NVIDIA containers)".
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- README.md (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: wait-checks / wait
- GitHub Check: linux
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (1)
docs/source/getting_started/_installation_for_Linux.rst (1)
33-38
: Add compatibility note about Model Optimizer in TRT‑LLM images. Preinstalled nvidia-modelopt in TRT‑LLM images may lag; if upgrading with pip, ensure the upgraded ModelOpt is ABI/API-compatible with the image's TRT‑LLM/TensorRT versions. Reference: TRT‑LLM v0.21.0 bundles TensorRT 10.11, PyTorch 2.7.1, CUDA 12.9.0; PyPI nvidia-modelopt==0.35.0 (uploaded Sep 4, 2025).
Apply this minimal addition after Line 37:
```diff
+.. note::
+    TRT‑LLM release images may contain an older ``nvidia-modelopt``. If you upgrade with ``pip``,
+    ensure the upgraded ModelOpt version is compatible with the TRT‑LLM/TensorRT versions in the image.
+    See the compatibility notes in the release docs.
```
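To see what a given image actually ships before upgrading, a quick check along these lines works inside the container (a sketch; package names are the published PyPI ones, and `tensorrt_llm.__version__` is assumed available in the release images):

```shell
# Print the preinstalled versions of the relevant packages
pip show nvidia-modelopt tensorrt | grep -E '^(Name|Version):'
python -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
```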
Signed-off-by: Keval Morabia <[email protected]>
d39f268 to 3392955
Actionable comments posted: 0
♻️ Duplicate comments (3)
examples/onnx_ptq/README.md (1)
27-30
: Recommended base TensorRT image now listed; resolves prior ask. This addresses the earlier request to name a base docker. Consider adding a short note that MXFP8/NVFP4 evaluation requires TensorRT >= 10.11 to help users pick a compatible tag.
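Whether a given container tag satisfies that bound can be verified from inside the image (a sketch; the `tensorrt` Python bindings are assumed preinstalled in the TensorRT containers):

```shell
python -c "import tensorrt; print(tensorrt.__version__)"
```

The suggested README addition: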
```diff
-Please use the TensorRT docker image (e.g., `nvcr.io/nvidia/tensorrt:25.08-py3`) or visit our installation docs...
+Please use the TensorRT docker image (e.g., `nvcr.io/nvidia/tensorrt:25.08-py3`) or visit our installation docs...
+Note: For MXFP8 or NVFP4 evaluation in this README, choose a TensorRT container tag that includes TensorRT >= 10.11.
```
docs/source/getting_started/_installation_for_Linux.rst (2)
33-38
: Good callout to upgrade ModelOpt in NGC images. This addresses prior feedback that release images can ship older ModelOpt.
43-46
: Fix LD_LIBRARY_PATH and add aarch64 variant.
- Don’t include /usr/include in LD_LIBRARY_PATH.
- Provide aarch64 paths (previous review asked for this).
```diff
-export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
-export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+# x86_64
+export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+# aarch64 (SBSA)
+# export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+# export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
🧹 Nitpick comments (6)
examples/pruning/README.md (3)
26-26
: Add an explicit “pip -U nvidia-modelopt” note for NeMo container users. Containers may ship an older ModelOpt; mirror other docs by advising to upgrade inside the container.
````diff
-For Minitron pruning for Megatron-LM / NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.07`) which has all the dependencies installed.
+For Minitron pruning for Megatron-LM / NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.07`) which has all the dependencies installed. After launching the container, upgrade Model Optimizer:
+```bash
+pip install -U nvidia-modelopt
+```
````
30-30
: Typo: “requisred” → “required”.
```diff
-For GradNAS pruning for Hugging Face BERT / GPT-J, no additional dependencies are requisred.
+For GradNAS pruning for Hugging Face BERT / GPT-J, no additional dependencies are required.
```
115-116
: Grammar fixes in FastNAS section.
```diff
-also how to profiling the model to understand the search space of possible pruning options and demonstrates
-the usage saving and restoring pruned models.
+also how to profile the model to understand the search space of possible pruning options and demonstrates
+the usage of saving and restoring pruned models.
```
examples/speculative_decoding/README.md (1)
46-47
: Avoid root requirement and system‑wide Git LFS install. Many users won't have root in shared images. Prefer user‑level install; drop “--system”.
```diff
-apt-get update && apt-get install -y git-lfs
-git lfs install --system
+# If you have root:
+sudo apt-get update && sudo apt-get install -y git-lfs
+# User-level install (recommended):
+git lfs install
```
docs/source/getting_started/_installation_for_Linux.rst (2)
88-90
: Remove reference to building ModelOpt's custom docker image. This PR deprecates the custom Dockerfile; align wording to recommended NGC images.
```diff
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all
-optional dependencies pre-installed.
-If you use other suggested docker images, ModelOpt is pre-installed with some of the below optional dependencies.
+If you use the recommended NVIDIA docker images (TensorRT‑LLM, PyTorch, or NeMo), ModelOpt may already be pre‑installed
+with some optional dependencies. Make sure to upgrade to the latest version of ModelOpt (with the optional dependencies
+you need) using pip as shown below.
```
21-24
: Version pin: update TensorRT-LLM to 1.1.0rc3 or loosen to >=1.1.0rc3. docs/source/getting_started/_installation_for_Linux.rst (lines 21–24) currently pins "TensorRT-LLM (Optional) 1.1.0rc2.post2"; latest NGC TensorRT-LLM is 1.1.0rc3 and the recommended NGC base image tag is 25.03. Either change the pin to "1.1.0rc3" or use ">=1.1.0rc3" and document the NGC tag 25.03.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- README.md (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
- examples/diffusers/README.md (2 hunks)
- examples/llm_distill/README.md (1 hunks)
- examples/llm_ptq/README.md (1 hunks)
- examples/llm_qat/README.md (1 hunks)
- examples/onnx_ptq/README.md (1 hunks)
- examples/pruning/README.md (1 hunks)
- examples/speculative_decoding/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- README.md
- examples/llm_qat/README.md
- examples/diffusers/README.md
- examples/llm_ptq/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (5)
examples/llm_distill/README.md (2)
24-31
: Docker prereqs look good. Clear guidance for PyTorch/NeMo images plus link to install docs.
32-38
: Install command update LGTM. The switch to pip install -U nvidia-modelopt[hf] is appropriate.
examples/onnx_ptq/README.md (1)
36-36
: Install command update LGTM.examples/speculative_decoding/README.md (2)
28-33
: Docker prereqs + upgrade note LGTM.
39-40
: Local install command LGTM.
aa730a1 to d22bdab
Actionable comments posted: 1
♻️ Duplicate comments (4)
.gitlab/tests.yml (3)
80-96
: Avoid unconstrained".[all,dev-test]"
on TRT base; add constraints and keep Apex uninstall.Repeat of prior concern; add constraints and guard path appends.
- - pip install ".[all,dev-test]" + - pip install -c ${PIP_CONSTRAINT:-tests/constraints/trt-25.08.txt} ".[dev-test]" + # Install example deps under constraints to minimize upgrades + - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -c ${PIP_CONSTRAINT:-tests/constraints/trt-25.08.txt} -r "$req_file" || exit 1; done
8-10
: Rules make pipelines manual by default; fix ordering/default.This was flagged earlier. The unconditional
- when: manual
forces branch/MR jobs to manual.Apply:
.tests-default: rules: - - if: $CI_PIPELINE_SOURCE == "schedule" - - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/ - - when: manual + - if: $CI_PIPELINE_SOURCE == "schedule" + when: on_success + - if: $CI_COMMIT_TAG =~ /^\d+\.\d+\.\d+$/ + when: on_success + - if: $CI_PIPELINE_SOURCE == "web" + when: manual + - when: on_success
1-4
: Reconsider globally disabling constraints.
PIP_CONSTRAINT: ""
encourages unconstrained upgrades in CI, reducing reproducibility and risking CUDA/TRT breakage on vendor images. Prefer per‑job opt‑out with justification, or switch to an image‑aligned constraints file.-.tests-default: - variables: - PIP_CONSTRAINT: "" # Disable pip constraint for upgrading packages +.tests-default: + variables: + PIP_CONSTRAINT: "tests/constraints/trtllm-25.06.txt" # default; override per-job if neededdocs/source/getting_started/_installation_for_Linux.rst (1)
39-45
: Fix LD_LIBRARY_PATH contents and provide aarch64 variant.
- Remove
/usr/include
(headers, not libs).- Provide ARM64 paths or mark snippet x86_64‑only. This was raised previously.
Suggested replacement:
- export PIP_CONSTRAINT="" - export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" - export PATH="${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" + export PIP_CONSTRAINT="" + # x86_64 + export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" + export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}" + # aarch64 (ARM64/SBSA) + # export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib:${LD_LIBRARY_PATH}" + # export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
🧹 Nitpick comments (7)
.github/workflows/gpu_tests.yml (1)
73-76
: Remove /usr/include from LD_LIBRARY_PATH and guard paths.
/usr/include
is a headers directory, not a library path. Also guard appends with -d to avoid polluting envs when TRT isn't present. Apply:
```diff
-echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" >> $GITHUB_ENV
-echo "PATH=${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" >> $GITHUB_ENV
+{ test -d /usr/lib/x86_64-linux-gnu && echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib/x86_64-linux-gnu" >> "$GITHUB_ENV"; } || true
+{ test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/lib && echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" >> "$GITHUB_ENV"; } || true
+{ test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/bin && echo "PATH=${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" >> "$GITHUB_ENV"; } || true
```
47-47
: Minor grammar tweak.“examples's” → “example’s” or “examples’”.
55-57
: Qualify “superior performance” claim or add context.State the scope (TensorRT inference workloads) and, if possible, link to a reference/benchmark; otherwise soften to “typically better suited”.
80-81
: Wording + brand name.
- “issues, we recommend using a docker image” → “if issues arise, prefer an NVIDIA container”.
- “HuggingFace” → “Hugging Face”.
89-90
: Remove stale reference to building ModelOpt docker image.Custom ModelOpt Dockerfile/images are deprecated in this PR; adjust this sentence accordingly.
-If you build and use ModelOpt's docker image, you can skip this step as the image already contains ModelOpt and all -optional dependencies pre-installed. +If you use the recommended NVIDIA containers, some may include ModelOpt pre‑installed; otherwise install via pip below..gitlab/tests.yml (2)
35-43
: Fix LD_LIBRARY_PATH (drop /usr/include) and add aarch64 note; guard paths.Same concern as GitHub workflow and docs; also make exports conditional.
-# Add libcudnn*.so and libnv*.so to path -- export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" +# Add common library paths (x86_64); guard if present +- test -d /usr/lib/x86_64-linux-gnu && export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}" +- test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/lib && export LD_LIBRARY_PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" +# aarch64 example (uncomment if using ARM runners) +# test -d /usr/lib/aarch64-linux-gnu && export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:${LD_LIBRARY_PATH}" -# Add trtexec to path -- export PATH="${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin" +# Add trtexec to PATH if present +test -d /usr/local/tensorrt/targets/x86_64-linux-gnu/bin && export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
59-59
: Use constraints to avoid clobbering vendor stacks.Unconstrained
".[hf,dev-test]"
can upgrade torch/transformers. Prefer constraints or--no-deps
if extras are tooling‑only.- - pip install ".[hf,dev-test]" + - pip install -c ${PIP_CONSTRAINT:-tests/constraints/trtllm-25.06.txt} ".[hf,dev-test]" +# or, if safe: +# - pip install --no-deps ".[dev-test]" +# - pip install -c ${PIP_CONSTRAINT} -r examples/$EXAMPLE/requirements.txt
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/example_tests.yml (1 hunks)
- .github/workflows/gpu_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/example_tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
- GitHub Check: wait-checks / wait
🔇 Additional comments (8)
docs/source/getting_started/_installation_for_Linux.rst (2)
33-36
: Good call on recommending TRT‑LLM release images. Consider adding a concrete version example (pin) and a brief note on matching the image to your CUDA/driver.
49-49
: Alt images section reads well. Clear options and links.
.gitlab/tests.yml (6)
28-28
: Good split: multi‑GPU default anchor. Makes downstream jobs cleaner.
44-49
: LGTM: tox‑current‑env for GPU tests. Reuses image stack and avoids creating extra envs.
52-54
: Example job factoring looks good.
57-57
: Matrix reduction is reasonable.
63-70
: Explicitly marking flaky QAT as allow_failure is fine. Consider adding an issue link for traceability.
71-79
: Pin looks good; verify alignment with “System requirements” table. Image tag
tensorrt-llm/release:1.1.0rc2.post2
should match doc requirements; otherwise CI/doc drift.
Signed-off-by: Keval Morabia <[email protected]>
d22bdab to 54b6bcf
Actionable comments posted: 0
♻️ Duplicate comments (4)
docs/source/getting_started/_installation_for_Linux.rst (4)
33-35
: Call out that TRT-LLM images can ship older ModelOpt; show check/upgrade path. Add a short note so users verify and (carefully) upgrade the preinstalled ModelOpt in TRT‑LLM images.
Apply this diff:
```diff
 To use Model Optimizer with full dependencies (e.g. TensorRT/TensorRT-LLM deployment), we recommend using the
 `TensorRT-LLM docker image <https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags>`_,
 e.g., ``nvcr.io/nvidia/tensorrt-llm/release:<version>``.
+
+.. note::
+    These images may include an older ``nvidia-modelopt`` than PyPI. Check the version with::
+
+        pip show nvidia-modelopt | sed -n 's/^Version: //p'
+
+    If you upgrade inside the container, follow the constraints guidance below to avoid breaking core CUDA/TRT/PyTorch packages.
```
Run to confirm we mention this elsewhere:
```bash
#!/bin/bash
rg -nC2 -i 'constraints|compatibility matrix|modelopt.*pre-?installed' docs/
```
49-56
: Clarify preinstalled versions; soften performance claim for TRT container. Note that preinstalled ModelOpt can lag, and avoid the absolute “superior performance” wording.
Apply this diff:
```diff
 For PyTorch, you can also use `NVIDIA NGC PyTorch container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags>`_
 and for NVIDIA NeMo framework, you can use the `NeMo container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags>`_.
 Both of these containers come with Model Optimizer pre-installed. Make sure to update the Model Optimizer to the latest version if not already.
-For ONNX / TensorRT use cases, you can also use the `TensorRT container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_
-which provides superior performance to the PyTorch container.
+.. note::
+    The preinstalled Model Optimizer version in these containers may be older than the latest release. Verify the version and
+    upgrade using a constraints file as described above.
+
+For ONNX / TensorRT use cases, you can also use the `TensorRT container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_,
+which is optimized for TensorRT workloads and is recommended for TRT/ONNX deployment scenarios.
```
37-37
: Avoid unconstrained pip upgrades inside vendor images; recommend constraints. Unqualified -U risks breaking CUDA/TRT/PyTorch. Add a warning and a constrained install example.
Apply this diff:
```diff
-Make sure to upgrade Model Optimizer to the latest version using ``pip`` as described in the next section.
+.. warning::
+    When upgrading inside NVIDIA containers, use a constraints file aligned to the image stack to avoid upgrading
+    core libraries (``torch``, ``transformers``, CUDA/TRT bindings).
+    Example::
+
+        pip install -U -c /path/to/constraints.txt "nvidia-modelopt[all]"
+
+    See the compatibility/constraints documentation for your chosen image.
```
41-46
: Fix env vars: remove /usr/include from LD_LIBRARY_PATH and provide aarch64 variant. “/usr/include” does not belong on LD_LIBRARY_PATH, and we need aarch64 paths per requirements.
Apply this diff:
```diff
-.. code-block:: shell
-
-    export PIP_CONSTRAINT=""
-    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib"
-    export PATH="${PATH}:/usr/local/tensorrt/targets/x86_64-linux-gnu/bin"
+.. tabs::
+
+    .. tab:: x86_64
+
+        .. code-block:: shell
+
+            export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+            export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+
+    .. tab:: aarch64 (SBSA)
+
+        .. code-block:: shell
+
+            export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu:/usr/local/tensorrt/targets/aarch64-linux-gnu/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+            export PATH="/usr/local/tensorrt/targets/aarch64-linux-gnu/bin:${PATH}"
```
🧹 Nitpick comments (4)
examples/diffusers/cache_diffusion/requirements.txt (2)
1-1
: Add a lower bound (or drop the pin in container flows) to avoid mismatched CUDA bindings.
cuda-python<13
protects against a future major, but pip may pick an older 11.x in non-container envs, or fight with preinstalled bindings in TRT-LLM/TRT/PyTorch images. Prefer>=12,<13
(or the minimum you test, e.g.,>=12.2,<13
). If examples are run only inside NVIDIA base images, consider omittingcuda-python
here to avoid downgrades.Apply:
-cuda-python<13 +cuda-python>=12,<13
2-5
: Centralize shared pins to prevent drift across examples.OpenCV, PEFT, Polygraphy, and SentencePiece pins recur in diffusers examples. Move common versions into a top-level constraints.txt and reference via
-c constraints.txt
from per-example requirements to keep them in sync.Would you like a follow-up PR patch that introduces constraints.txt and rewires examples to use it?
docs/source/getting_started/_installation_for_Linux.rst (2)
47-47
: Grammar + add actionable instruction.Tweak possessive and show how to install extras.
Apply this diff:
- You may need to install additional dependencies from the respective examples's `requirements.txt` file. + You may need to install additional dependencies from the respective examples' ``requirements.txt`` file, e.g.:: + + pip install -r examples/<example>/requirements.txt
80-81
: Tighten wording (clarity + brand).Minor phrasing edits; “Hugging Face” spelling.
Apply this diff:
- issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt's - features locally for example, quantizing a HuggingFace model and then use a docker image for deployment. + issues, we recommend using a Docker image for a seamless experience. You may still choose to use other ModelOpt + features locally—for example, quantizing a Hugging Face model—and then use a Docker image for deployment.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- .github/workflows/example_tests.yml (1 hunks)
- .github/workflows/gpu_tests.yml (1 hunks)
- .gitlab/tests.yml (2 hunks)
- docs/source/getting_started/_installation_for_Linux.rst (2 hunks)
- examples/diffusers/cache_diffusion/requirements.txt (1 hunks)
- examples/diffusers/quantization/requirements.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/diffusers/quantization/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (3)
- .github/workflows/example_tests.yml
- .github/workflows/gpu_tests.yml
- .gitlab/tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (2)
examples/diffusers/cache_diffusion/requirements.txt (1)
1-1
: Confirmed: cuda-python constraint consistent across examples.
Both examples/diffusers/cache_diffusion/requirements.txt and examples/diffusers/quantization/requirements.txt use cuda-python<13.
docs/source/getting_started/_installation_for_Linux.rst (1)
33-47
: Remove export of an empty PIP_CONSTRAINT; add constraints guidance next to the pip install step. Replace the exported empty env var and add this tip in docs/source/getting_started/_installation_for_Linux.rst (around lines 33–47):
.. tip::
    If you are inside an NVIDIA container, install with a constraints file to preserve the container's stack, e.g.::

        pip install -U -c /opt/nvidia/constraints/<stack>/constraints.txt "nvidia-modelopt[all]"

No constraints files were found in the repo; confirm whether the repo ships constraints and update the example path or remove the tip accordingly.
What does this PR do?
Testing
Summary by CodeRabbit
Documentation
Bug Fixes
Chores / CI
Other