
Commit fabbe02

Deprecate ModelOpt custom docker and directly use TRT-LLM docker
Signed-off-by: Keval Morabia <[email protected]>
1 parent 4c36abe

14 files changed: +46, -112 lines

.github/CODEOWNERS

Lines changed: 0 additions & 1 deletion
@@ -30,7 +30,6 @@ modelopt/torch/trace @NVIDIA/modelopt-torch-nas-prune-codeowners
 modelopt/torch/utils @NVIDIA/modelopt-torch-utils-codeowners
 
 # Examples
-/docker @NVIDIA/modelopt-docker-codeowners
 /README.md @NVIDIA/modelopt-examples-codeowners
 /examples @NVIDIA/modelopt-examples-codeowners
 /examples/chained_optimizations @NVIDIA/modelopt-torch-nas-prune-codeowners

.gitlab/tests.yml

Lines changed: 18 additions & 11 deletions
@@ -1,11 +1,11 @@
-# NOTE: Make sure this file is consistent with .github/workflows/{unit,gpu}_tests.yml
+# NOTE: Make sure this file is consistent with .github/workflows/{unit,gpu,example}_tests.yml
 .tests-default:
+  variables:
+    PIP_CONSTRAINT: "" # Disable pip constraint for upgrading packages
   stage: tests
   rules:
     - if: $CI_PIPELINE_SOURCE == "schedule"
-      when: always
-    - if: $CI_PIPELINE_SOURCE != "schedule"
-      when: manual
+    - when: manual
 
 ##### Unit Tests #####
 unit:
@@ -24,14 +24,13 @@ unit:
     - tox -e py3$PYTHON-torch$TORCH-tf_$TRANSFORMERS-unit
 
 ##### GPU Tests #####
-gpu:
+multi-gpu:
   extends: .tests-default
-  timeout: 60m
+  timeout: 90m
   image: nvcr.io/nvidia/pytorch:25.06-py3
   variables:
     GIT_DEPTH: 1000 # For correct version for tests/gpu/torch/quantization/plugins/test_megatron.py
-    LD_LIBRARY_PATH: "/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}" # Add libcudnn*.so and libnv*.so to path.
-    PIP_CONSTRAINT: "" # Disable pip constraint for upgrading packages
+    LD_LIBRARY_PATH: "/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}" # Add libcudnn*.so and libnv*.so to path
   tags: [docker, linux, 2-gpu]
   script:
     # Use pre-installed packages without a new venv with tox-current-env
@@ -42,15 +41,23 @@ gpu:
 example:
   extends: .tests-default
   stage: tests
-  timeout: 45m
-  image: gitlab-master.nvidia.com:5005/omniml/modelopt/modelopt_examples:latest
+  timeout: 30m
+  image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
   variables:
+    LD_LIBRARY_PATH: "/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" # Add libcudnn*.so and libnv*.so to path
+    PATH: "/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}" # Add trtexec to path
     TEST_TYPE: pytest
+    ALLOWED_FAILURES: ${EXAMPLE_ALLOWED_FAILURES:llm_qat} # comma separated list of examples to allow failure
   tags: [docker, linux, 2-gpu, sm<89]
   parallel:
     matrix:
       - EXAMPLE: [diffusers, llm_distill, llm_qat, llm_sparsity, onnx_ptq, speculative_decoding]
-  allow_failure: true # Allow to continue next stages even if job is canceled (e.g. during release)
+  rules:
+    - if: $CI_PIPELINE_SOURCE == "schedule" && $ALLOWED_FAILURES =~ /\b${EXAMPLE}\b/
+      allow_failure: true
+    - if: $CI_PIPELINE_SOURCE == "schedule"
+      allow_failure: false
+    - when: manual
   before_script:
     - pip install ".[all,dev-test]"
   script:
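The key change in the new `example` job is the failure handling: on scheduled pipelines, a failure is tolerated only if the job's `EXAMPLE` appears in the comma-separated `ALLOWED_FAILURES` list, matched with a word-boundary regex. A minimal shell sketch of that matching logic (the variable values here are illustrative, not taken from the CI config):

```bash
# Illustrative values; in CI these come from ALLOWED_FAILURES and the matrix EXAMPLE
ALLOWED_FAILURES="llm_qat"
EXAMPLE="llm_qat"

# \b word boundaries avoid partial matches, e.g. "llm" would not match inside "llm_qat"
if echo "$ALLOWED_FAILURES" | grep -qE "\b${EXAMPLE}\b"; then
    echo "scheduled-pipeline failure allowed for $EXAMPLE"
else
    echo "scheduled-pipeline failure fails the pipeline for $EXAMPLE"
fi
```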

CHANGELOG.rst

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ Model Optimizer Changelog (Linux)
 
 **Deprecations**
 
+- Deprecated ModelOpt's custom docker image. Please use the TensorRT-LLM docker image directly or refer to the [installation guide](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more details.
 - Deprecated ``quantize_mode`` argument in ``examples/onnx_ptq/evaluate.py`` to support strongly typing. Use ``engine_precision`` instead.
 - Deprecated TRT-LLM's TRT backend in ``examples/llm_ptq`` and ``examples/vlm_ptq``. Tasks ``build`` and ``benchmark`` support are removed and replaced with ``quant``. For performance evaluation, please use ``trtllm-bench`` directly.
 - ``--export_fmt`` flag in ``examples/llm_ptq`` is removed. By default we export to the unified Hugging Face checkpoint format.

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ pip install -e ".[dev]"
 ```
 
 If you are working on features that require dependencies like TensorRT-LLM or Megatron-Core, consider using a docker container to simplify the setup process.
-See [docker README](./README.md#installation--docker) for more details.
+Visit our [installation docs](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more information.
 
 ## 🧹 Code linting and formatting
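A containerized dev setup along those lines might look like this (a sketch: the image tag below is the one pinned in this commit's CI config and the mount path is arbitrary; adjust both to your environment):

```bash
# Launch the TensorRT-LLM container with your local clone mounted (tag/path are examples)
docker run --gpus all -it --rm --shm-size 20g \
    -v "$(pwd)":/workspace/TensorRT-Model-Optimizer \
    nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2 bash

# Inside the container: editable install with dev dependencies
cd /workspace/TensorRT-Model-Optimizer
pip install -e ".[dev]"
```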

README.md

Lines changed: 5 additions & 3 deletions
@@ -61,10 +61,10 @@ Model Optimizer is also integrated with [NVIDIA NeMo](https://github.com/NVIDIA-
 To install stable release packages for Model Optimizer with `pip` from [PyPI](https://pypi.org/project/nvidia-modelopt/):
 
 ```bash
-pip install nvidia-modelopt[all]
+pip install -U nvidia-modelopt[all]
 ```
 
-To install from source in editable mode with all development dependencies or to test the latest changes, run:
+To install from source in editable mode with all development dependencies or to use the latest features, run:
 
 ```bash
 # Clone the Model Optimizer repository
@@ -74,7 +74,9 @@ cd TensorRT-Model-Optimizer
 pip install -e .[dev]
 ```
 
-Visit our [installation guide](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more fine-grained control on installed dependencies or view our pre-made [dockerfiles](docker/README.md) for more information.
+You can also directly use the [TensorRT-LLM docker images](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags)
+(e.g., `nvcr.io/nvidia/tensorrt-llm/release:<version>`),
+which have Model Optimizer pre-installed. Visit our [installation guide](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more fine-grained control on installed dependencies or for alternative docker images and environment variables to setup.
 
 ## Techniques
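Since this commit drops the custom image, a quick way to sanity-check one of those TensorRT-LLM containers is (a sketch; the tag is the one used by this commit's CI, substitute the release you need):

```bash
# Start the container (example tag, pinned in this commit's CI config)
docker run --gpus all -it --rm nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2 bash

# Inside the container: Model Optimizer comes pre-installed; verify, then upgrade if desired
python -c "import modelopt; print(modelopt.__version__)"
pip install -U nvidia-modelopt[all]
```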

docker/Dockerfile

Lines changed: 0 additions & 27 deletions
This file was deleted.

docker/README.md

Lines changed: 0 additions & 16 deletions
This file was deleted.

docker/build.sh

Lines changed: 0 additions & 19 deletions
This file was deleted.

docs/source/getting_started/_installation_for_Linux.rst

Lines changed: 11 additions & 21 deletions
@@ -30,39 +30,29 @@ Environment setup
 
 .. tab:: Docker image (Recommended)
 
-    **Using ModelOpt's docker image**
+    To use Model Optimizer with full dependencies (e.g. TensorRT/TensorRT-LLM deployment), we recommend using the
+    `TensorRT-LLM docker image <https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags>`_,
+    e.g., ``nvcr.io/nvidia/tensorrt-llm/release:<version>``.
 
-    To use Model Optimizer with full dependencies (e.g. TensorRT/TensorRT-LLM deployment), we recommend using our provided docker image
-    which is based on the `TensorRT-LLM <https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags>`_
-    docker image with additional dependencies installed.
+    You may upgrade the Model Optimizer to the latest version if not already as described in the next section.
 
-    After installing the `NVIDIA Container Toolkit <https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html>`_,
-    please run the following commands to build the Model Optimizer docker container which has all the base
-    dependencies pre-installed. You may need to install additional dependencies from the examples's `requirements.txt` file.
+    You would also need to setup appropriate environment variables for the TensorRT binaries as follows:
 
     .. code-block:: shell
 
-        # Clone the ModelOpt repository
-        git clone [email protected]:NVIDIA/TensorRT-Model-Optimizer.git
-        cd TensorRT-Model-Optimizer
+        export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+        export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
 
-        # Build the docker (will be tagged `docker.io/library/modelopt_examples:latest`)
-        # You may customize `docker/Dockerfile` to include or exclude certain dependencies you may or may not need.
-        bash docker/build.sh
+    You may need to install additional dependencies from the respective examples's `requirements.txt` file.
 
-        # Run the docker image
-        docker run --gpus all -it --shm-size 20g --rm docker.io/library/modelopt_examples:latest bash
-
-        # Check installation (inside the docker container)
-        python -c "import modelopt; print(modelopt.__version__)"
-
-    **Using alternative NVIDIA docker images**
+    **Alternative NVIDIA docker images**
 
     For PyTorch, you can also use `NVIDIA NGC PyTorch container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags>`_
     and for NVIDIA NeMo framework, you can use the `NeMo container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags>`_.
     Both of these containers come with Model Optimizer pre-installed. Make sure to update the Model Optimizer to the latest version if not already.
 
-    For ONNX PTQ, you can use the optimized docker image from [onnx_ptq Dockerfile](https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/onnx_ptq/docker).
+    For ONNX PTQ, you can use the docker image from `onnx_ptq Dockerfile <https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/onnx_ptq/docker>`_
+    which includes the latest publicly available TensorRT version, providing access to cutting-edge features and superior performance.
 
 .. tab:: Local environment (PIP / Conda)
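To confirm those exports took effect inside the container, something like the following should suffice (a sketch; assumes TensorRT is installed under `/usr/local/tensorrt`, as in the TensorRT-LLM images):

```bash
# trtexec should now resolve from the TensorRT target directory added to PATH
command -v trtexec

# The TensorRT Python bindings should also import cleanly
python -c "import tensorrt; print(tensorrt.__version__)"
```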

examples/diffusers/README.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ Each subsection (cache_diffusion, quantization, etc.) have their own `requiremen
 
 You can find the latest TensorRT [here](https://developer.nvidia.com/tensorrt/download).
 
-Visit our [installation guide](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) or view our pre-made [dockerfiles](../../docker/Dockerfile) for more information.
+Visit our [installation docs](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more information.
 
 ## Getting Started
