
Commit 7832b15

Deprecate onnx ptq docker image as well
Signed-off-by: Keval Morabia <[email protected]>
Parent: 07c158f

8 files changed: +36 −203 lines

.github/workflows/example_tests.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -74,7 +74,7 @@ jobs:
       - uses: nv-gha-runners/setup-proxy-cache@main
       - name: Setup environment variables
         run: |
-          echo "LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
+          echo "LD_LIBRARY_PATH=/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
           echo "PATH=/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}" >> $GITHUB_ENV
       - name: Run example tests
         run: |
```
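As a quick sanity check that the updated environment resolves inside the CI container (a sketch; the paths are taken from the diff above and the commands are illustrative, not part of the commit):

```bash
# Print the library search path configured above
echo "$LD_LIBRARY_PATH"
# TensorRT libraries should be present at the directory added to LD_LIBRARY_PATH
ls /usr/local/tensorrt/targets/x86_64-linux-gnu/lib | head
# trtexec should now be discoverable via the updated PATH
which trtexec
```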

.gitlab/tests.yml

Lines changed: 27 additions & 16 deletions

```diff
@@ -25,7 +25,7 @@ unit:
     - tox -e py3$PYTHON-torch$TORCH-tf_$TRANSFORMERS-unit

 ##### GPU Tests #####
-.gpu-tests-default:
+.multi-gpu-tests-default:
   extends: .tests-default
   timeout: 60m
   image: nvcr.io/nvidia/pytorch:25.06-py3
@@ -34,54 +34,65 @@ unit:
   tags: [docker, linux, 2-gpu]
   before_script:
     # Add libcudnn*.so and libnv*.so to path
-    - export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    - export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
     # Add trtexec to path
-    - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:$PATH"
+    - export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
     # Install git-lfs for Daring-Anteater dataset
     - apt-get update && apt-get install -y git-lfs
     - git lfs install --system

 multi-gpu:
-  extends: .gpu-tests-default
+  extends: .multi-gpu-tests-default
   script:
     # Use pre-installed packages without a new venv with tox-current-env
     - pip install tox-current-env
     - tox -e py312-cuda12-gpu --current-env

 ##### Example Tests #####
-example:
-  extends: .gpu-tests-default
+example-torch:
+  extends: .multi-gpu-tests-default
   timeout: 30m
-  variables:
-    TEST_TYPE: pytest
   parallel:
     matrix:
-      - EXAMPLE: [diffusers, llm_distill, llm_sparsity, onnx_ptq, speculative_decoding]
+      - EXAMPLE: [llm_distill, llm_sparsity, speculative_decoding]
   script:
     - pip install ".[all,dev-test]"
-    # Uninstall apex since T5 Int8 (PixArt) + Apex is not supported as per https://github.com/huggingface/transformers/issues/21391
-    - if [ "$EXAMPLE" = "diffusers" ]; then pip uninstall -y apex; fi
     - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -r "$req_file" || exit 1; done
-    - if [ "$TEST_TYPE" = "pytest" ]; then pytest -s tests/examples/$EXAMPLE; else bash tests/examples/test_$EXAMPLE.sh; fi
+    - pytest -s tests/examples/$EXAMPLE

 # TODO: Fix llm_qat test hang in GitLab CI
 example-failing:
-  extends: example
+  extends: example-torch
   allow_failure: true
   parallel:
     matrix:
       - EXAMPLE: [llm_qat]

-example-ada:
-  extends: example
+example-trtllm:
+  extends: example-torch
   timeout: 60m
   image: nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2
   tags: [docker, linux, 2-gpu, sm>=89]
   parallel:
     matrix:
-      - EXAMPLE: [llm_eval, llm_ptq, vlm_ptq, llm_autodeploy]
+      - EXAMPLE: [llm_autodeploy, llm_eval, llm_ptq, vlm_ptq]
+
+example-onnx:
+  extends: example-torch
+  image: nvcr.io/nvidia/tensorrt:25.08-py3
+  tags: [docker, linux, 2-gpu, sm>=89]
+  parallel:
+    matrix:
+      - EXAMPLE: [diffusers, onnx_ptq]
+        TEST_TYPE: pytest
       - EXAMPLE: [onnx_ptq]
         TEST_TYPE: bash
+  script:
+    # Uninstall apex since T5 Int8 (PixArt) + Apex is not supported as per https://github.com/huggingface/transformers/issues/21391
+    - if [ "$EXAMPLE" = "diffusers" ]; then pip uninstall -y apex; fi
+    - pip install ".[all,dev-test]"
+    - find examples/$EXAMPLE -name "requirements.txt" | while read req_file; do pip install -r "$req_file" || exit 1; done
+    - if [ "$TEST_TYPE" = "pytest" ]; then pytest -s tests/examples/$EXAMPLE; else bash tests/examples/test_$EXAMPLE.sh; fi

 ##### Megatron / NeMo Integration Tests #####
 megatron-nemo-integration:
```
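The new `example-onnx` job can be approximated locally; a minimal sketch, assuming you are at the repo root inside a suitable container and pick one matrix entry (the `EXAMPLE` and `TEST_TYPE` values below are illustrative):

```bash
#!/usr/bin/env bash
set -e

# Mirror the CI script for a single matrix entry
EXAMPLE=onnx_ptq
TEST_TYPE=pytest

pip install ".[all,dev-test]"
# Install the example's extra requirements, as the CI job does
find "examples/$EXAMPLE" -name "requirements.txt" | while read -r req_file; do
  pip install -r "$req_file" || exit 1
done

if [ "$TEST_TYPE" = "pytest" ]; then
  pytest -s "tests/examples/$EXAMPLE"
else
  bash "tests/examples/test_$EXAMPLE.sh"
fi
```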

CHANGELOG.rst

Lines changed: 1 addition & 1 deletion

```diff
@@ -6,7 +6,7 @@ Model Optimizer Changelog (Linux)

 **Deprecations**

-- Deprecated ModelOpt's custom docker image. Please use the TensorRT-LLM docker image directly or refer to the `installation guide <https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html>`_ for more details.
+- Deprecated ModelOpt's custom docker images. Please use the PyTorch, TensorRT-LLM or TensorRT docker image directly or refer to the `installation guide <https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html>`_ for more details.
 - Deprecated ``quantize_mode`` argument in ``examples/onnx_ptq/evaluate.py`` to support strongly typing. Use ``engine_precision`` instead.
 - Deprecated TRT-LLM's TRT backend in ``examples/llm_ptq`` and ``examples/vlm_ptq``. Tasks ``build`` and ``benchmark`` support are removed and replaced with ``quant``. For performance evaluation, please use ``trtllm-bench`` directly.
 - ``--export_fmt`` flag in ``examples/llm_ptq`` is removed. By default we export to the unified Hugging Face checkpoint format.
```

docs/source/getting_started/_installation_for_Linux.rst

Lines changed: 5 additions & 6 deletions

```diff
@@ -40,7 +40,7 @@ Environment setup

 .. code-block:: shell

-    export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+    export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
     export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"

 You may need to install additional dependencies from the respective examples's `requirements.txt` file.
@@ -51,8 +51,8 @@ Environment setup
 and for NVIDIA NeMo framework, you can use the `NeMo container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags>`_.
 Both of these containers come with Model Optimizer pre-installed. Make sure to update the Model Optimizer to the latest version if not already.

-For ONNX PTQ, you can use the docker image from `onnx_ptq Dockerfile <https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/onnx_ptq/docker>`_
-which includes the latest publicly available TensorRT version, providing access to cutting-edge features and superior performance.
+For ONNX / TensorRT use cases, you can also use the `TensorRT container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_
+which provides superior performance to the PyTorch container.

 .. tab:: Local environment (PIP / Conda)

@@ -76,9 +76,8 @@ Environment setup

 If you wish to use ModelOpt in conjunction with other NVIDIA libraries (e.g. TensorRT, TensorRT-LLM, NeMo, Triton, etc.),
 please make sure to check the ease of installation of these libraries in a local environment. If you face any
-issues, we recommend using a docker image for a seamless experience. For example, `TensorRT-LLM documentation <https://nvidia.github.io/TensorRT-LLM/>`_.
-requires installing in a docker image. You may still choose to use other ModelOpt's features locally for example,
-quantizing a HuggingFace model and then use a docker image for deployment.
+issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt's
+features locally for example, quantizing a HuggingFace model and then use a docker image for deployment.

 Install Model Optimizer
 =======================
```
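Since the installation guide now points users at the NGC TensorRT container, here is a minimal launch sketch (the `25.08-py3` tag is borrowed from the CI config in this commit; any recent tag from the NGC catalog should work):

```bash
# Pull and start the NGC TensorRT container with GPU access
docker pull nvcr.io/nvidia/tensorrt:25.08-py3
docker run --gpus all -it --shm-size=2g \
    -v "$(pwd)":/workspace/TensorRT-Model-Optimizer \
    nvcr.io/nvidia/tensorrt:25.08-py3
```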

examples/llm_autodeploy/README.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -8,7 +8,7 @@ This guide demonstrates how to deploy mixed-precision models using ModelOpt's Au

 ## Prerequisites

-AutoDeploy is currently available on the main branch of TRT-LLM. Follow the [docker setup instructions](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation/build-from-source-linux.md#option-1-build-tensorrt-llm-in-one-step) to get started.
+AutoDeploy is available in TensorRT-LLM docker images. Please refer to our [Installation Guide](../../README.md#installation) for more details.

 ### 1. Quantize and Deploy Model
```
examples/onnx_ptq/README.md

Lines changed: 1 addition & 13 deletions

````diff
@@ -24,19 +24,7 @@ Model Optimizer enables highly performant quantization formats including NVFP4,

 ### Docker

-Build from this [Dockerfile](./docker/Dockerfile) which includes the latest publicly available TensorRT version, providing access to cutting-edge features and superior performance.
-
-Build the Docker image (will be tagged `docker.io/library/onnx_ptq_examples:latest`)
-
-```bash
-./docker/build.sh
-```
-
-Run the docker image
-
-```bash
-docker run --user 0:0 -it --gpus all --shm-size=2g -v /path/to/ImageNet/dataset:/workspace/imagenet docker.io/library/onnx_ptq_examples:latest
-```
+Please refer to our [Installation Guide](../../README.md#installation) for recommended docker images.

 ### Local Installation
````
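For readers who previously used the removed `docker run` invocation, a rough equivalent on the NGC TensorRT container (flags carried over from the deleted command; the image tag and ImageNet path are placeholders):

```bash
# Approximate replacement for the deleted onnx_ptq_examples image
docker run --user 0:0 -it --gpus all --shm-size=2g \
    -v /path/to/ImageNet/dataset:/workspace/imagenet \
    nvcr.io/nvidia/tensorrt:25.08-py3
```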

examples/onnx_ptq/docker/Dockerfile

Lines changed: 0 additions & 34 deletions
This file was deleted.

examples/onnx_ptq/docker/build.sh

Lines changed: 0 additions & 131 deletions
This file was deleted.
