CHANGELOG.rst: 1 addition, 1 deletion
@@ -6,7 +6,7 @@ Model Optimizer Changelog (Linux)
 
 **Deprecations**
 
-- Deprecated ModelOpt's custom docker image. Please use the TensorRT-LLM docker image directly or refer to the `installation guide <https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html>`_ for more details.
+- Deprecated ModelOpt's custom docker images. Please use the PyTorch, TensorRT-LLM or TensorRT docker image directly or refer to the `installation guide <https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html>`_ for more details.
 - Deprecated ``quantize_mode`` argument in ``examples/onnx_ptq/evaluate.py`` to support strongly typing. Use ``engine_precision`` instead.
 - Deprecated TRT-LLM's TRT backend in ``examples/llm_ptq`` and ``examples/vlm_ptq``. Tasks ``build`` and ``benchmark`` support are removed and replaced with ``quant``. For performance evaluation, please use ``trtllm-bench`` directly.
 - ``--export_fmt`` flag in ``examples/llm_ptq`` is removed. By default we export to the unified Hugging Face checkpoint format. You may need to install additional dependencies from the respective examples's `requirements.txt` file.
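As a minimal sketch of the new guidance above (pulling a vendor image directly from NGC instead of a ModelOpt-specific one); the tags shown are placeholders, so check the NGC catalog for the current releases:

```bash
# Pull a vendor container directly instead of a ModelOpt-specific image.
# Tags are placeholders; pick current ones from the NGC catalog.
docker pull nvcr.io/nvidia/tensorrt-llm/release:1.0.0   # LLM deployment workflows
docker pull nvcr.io/nvidia/pytorch:25.08-py3            # general PyTorch workflows
docker pull nvcr.io/nvidia/tensorrt:25.08-py3           # ONNX / TensorRT workflows
```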
@@ -51,8 +51,8 @@ Environment setup
 and for NVIDIA NeMo framework, you can use the `NeMo container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags>`_.
 Both of these containers come with Model Optimizer pre-installed. Make sure to update the Model Optimizer to the latest version if not already.
 
-For ONNX PTQ, you can use the docker image from `onnx_ptq Dockerfile<https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/onnx_ptq/docker>`_
-which includes the latest publicly available TensorRT version, providing access to cutting-edge features and superior performance.
+For ONNX / TensorRT use cases, you can also use the `TensorRT container<https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_
+which provides superior performance to the PyTorch container.
 
 .. tab:: Local environment (PIP / Conda)
 
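To complement the TensorRT container change in the hunk above, a minimal launch sketch; the image tag and mounted path are placeholders, not part of the original docs:

```bash
# Launch the NGC TensorRT container with GPU access for ONNX / TensorRT workflows.
# Tag and mount path are placeholders; adjust to your setup.
docker run --rm -it --gpus all \
  -v "$(pwd)":/workspace \
  nvcr.io/nvidia/tensorrt:25.08-py3
```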
@@ -76,9 +76,8 @@ Environment setup
 
 If you wish to use ModelOpt in conjunction with other NVIDIA libraries (e.g. TensorRT, TensorRT-LLM, NeMo, Triton, etc.),
 please make sure to check the ease of installation of these libraries in a local environment. If you face any
-issues, we recommend using a docker image for a seamless experience. For example, `TensorRT-LLM documentation <https://nvidia.github.io/TensorRT-LLM/>`_.
-requires installing in a docker image. You may still choose to use other ModelOpt's features locally for example,
-quantizing a HuggingFace model and then use a docker image for deployment.
+issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt's
+features locally for example, quantizing a HuggingFace model and then use a docker image for deployment.
examples/llm_autodeploy/README.md: 1 addition, 1 deletion
@@ -8,7 +8,7 @@ This guide demonstrates how to deploy mixed-precision models using ModelOpt's Au
 
 ## Prerequisites
 
-AutoDeploy is currently available on the main branch of TRT-LLM. Follow the [docker setup instructions](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation/build-from-source-linux.md#option-1-build-tensorrt-llm-in-one-step) to get started.
+AutoDeploy is available in TensorRT-LLM docker images. Please refer to our [Installation Guide](../../README.md#installation) for more details.
examples/onnx_ptq/README.md: 1 addition, 13 deletions
@@ -24,19 +24,7 @@ Model Optimizer enables highly performant quantization formats including NVFP4,
 
 ### Docker
 
-Build from this [Dockerfile](./docker/Dockerfile) which includes the latest publicly available TensorRT version, providing access to cutting-edge features and superior performance.
-
-Build the Docker image (will be tagged `docker.io/library/onnx_ptq_examples:latest`)
-
-```bash
-./docker/build.sh
-```
-
-Run the docker image
-
-```bash
-docker run --user 0:0 -it --gpus all --shm-size=2g -v /path/to/ImageNet/dataset:/workspace/imagenet docker.io/library/onnx_ptq_examples:latest
-```
+Please refer to our [Installation Guide](../../README.md#installation) for recommended docker images.
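The removed instructions mounted a local ImageNet copy into the example container; the same run flags carry over to whichever image the Installation Guide recommends. A sketch, with the image name and tag as placeholders:

```bash
# Same GPU, shared-memory, and ImageNet volume flags as the removed command,
# pointed at a recommended NGC image (placeholder name and tag).
docker run --user 0:0 -it --gpus all --shm-size=2g \
  -v /path/to/ImageNet/dataset:/workspace/imagenet \
  nvcr.io/nvidia/tensorrt:25.08-py3
```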