
Commit be09664

Minor fixes
Signed-off-by: Keval Morabia <[email protected]>
1 parent 1885e81 commit be09664

File tree

2 files changed: +7, -7 lines

CHANGELOG.rst

Lines changed: 6 additions & 6 deletions
@@ -5,16 +5,17 @@ Model Optimizer Changelog (Linux)
 ^^^^^^^^^^^^^^^^^

 **Deprecations**
+
 - Deprecated ``quantize_mode`` argument in ``examples/onnx_ptq/evaluate.py`` to support strongly typing. Use ``engine_precision`` instead.
-- TRT-LLM's TRT backend in ``examples/llm_ptq`` and ``examples/vlm_ptq``. Tasks ``build`` and ``benchmark`` support are removed and replaced with ``quant``. For performance evaluation, please use ``trtllm-bench`` directly.
+- Deprecated TRT-LLM's TRT backend in ``examples/llm_ptq`` and ``examples/vlm_ptq``. Tasks ``build`` and ``benchmark`` support are removed and replaced with ``quant``. For performance evaluation, please use ``trtllm-bench`` directly.
 - ``--export_fmt`` flag in ``examples/llm_ptq`` is removed. By default we export to the unified Hugging Face checkpoint format.
-- ``int8_sq`` quantization format is deprecated from the ``examples/vlm_ptq`` respect to the TensorRT-LLM's torch backend switch. Please refer to the previous releases if this quantization format is needed.
-- ``examples/vlm_eval`` as it depends on the deprecated TRT-LLM's TRT backend.
-
-**Bug Fixes**
+- ``int8_sq`` quantization format is deprecated from the ``examples/vlm_ptq`` with respect to the TensorRT-LLM's torch backend switch. Please refer to the previous releases if this quantization format is needed.
+- Deprecated ``examples/vlm_eval`` as it depends on the deprecated TRT-LLM's TRT backend.

 **New Features**
+
 - ``high_precision_dtype`` default to fp16 in ONNX quantization, i.e. quantized output model weights are now FP16 by default.
+- Upgrade TensorRT-LLM dependency to 1.1.0rc2.

 0.35 (2025-09-04)
 ^^^^^^^^^^^^^^^^^

@@ -27,7 +28,6 @@ Model Optimizer Changelog (Linux)
 **Bug Fixes**

 - Fix attention head ranking logic for pruning Megatron Core GPT models.
-- Upgrade TensorRT-LLM dependency to 1.1.0rc2.

 **New Features**

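The ``high_precision_dtype`` entry above changes the dtype used for the tensors that ONNX PTQ leaves unquantized. A minimal, hypothetical Python sketch of passing this option is shown below; the ``quantize`` entry point and the exact parameter names are assumptions inferred from the changelog wording and may differ in the installed ModelOpt version.

# Hypothetical sketch: ONNX post-training quantization with ModelOpt,
# keeping unquantized weights in FP16 (the new default per the changelog).
# The import path and parameter names are assumptions, not a confirmed API.
from modelopt.onnx.quantization import quantize

quantize(
    onnx_path="model.onnx",          # input ONNX model to quantize
    quantize_mode="int8",            # low-precision mode for quantized ops
    high_precision_dtype="fp16",     # unquantized weights stay in FP16 (new default)
    output_path="model.quant.onnx",  # where the quantized model is written
)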
pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ disable_error_code = ["attr-defined"]
 # Default additional options
 # Show a short test summary info for all except passed tests with -ra flag
 # print execution time for 20 slowest tests and generate coverage reports
-#addopts = "-ra --cov-report=term-missing --cov-report=html --cov-report=xml:coverage.xml --cov-config=pyproject.toml --durations=20 --strict-markers"
+addopts = "-ra --cov-report=term-missing --cov-report=html --cov-report=xml:coverage.xml --cov-config=pyproject.toml --durations=20 --strict-markers"
 pythonpath = ["tests/"]
 markers = ["manual: Only run when --run-manual is given"]

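With ``addopts`` now active by default, every pytest run includes ``--strict-markers``, which requires all markers, including the ``manual`` marker registered above, to be declared. Such a marker is typically paired with the ``--run-manual`` flag in a ``conftest.py``; the sketch below shows the standard pytest pattern for this and is an assumption, not necessarily this repository's actual conftest.

# Hypothetical conftest.py sketch for the "manual" marker / --run-manual flag.
import pytest


def pytest_addoption(parser):
    # Register the --run-manual command-line flag.
    parser.addoption(
        "--run-manual",
        action="store_true",
        default=False,
        help="also run tests marked as manual",
    )


def pytest_collection_modifyitems(config, items):
    # Skip tests marked @pytest.mark.manual unless --run-manual was given.
    if config.getoption("--run-manual"):
        return
    skip_manual = pytest.mark.skip(reason="need --run-manual option to run")
    for item in items:
        if "manual" in item.keywords:
            item.add_marker(skip_manual)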