
Commit 2e7fd46

Code quality improvements - typos, formatting, etc.

Signed-off-by: Keval Morabia <[email protected]>
1 parent c391942

136 files changed (+331 / -324 lines)

.github/ISSUE_TEMPLATE/1_bug_report.md

Lines changed: 0 additions & 4 deletions
@@ -9,15 +9,12 @@ assignees: ''
 ## Describe the bug
 <!-- Description of what the bug is, its impact (blocker, should have, nice to have) and any stack traces or error messages. -->
 
-
 ### Steps/Code to reproduce bug
 <!-- Please list *minimal* steps or code snippet for us to be able to reproduce the bug. -->
 <!-- A helpful guide on how to craft a minimal bug report: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->
 
-
 ### Expected behavior
 
-
 ## System information
 
 - Container used (if applicable): ?
@@ -37,7 +34,6 @@ assignees: ''
 - TensorRT: ?
 - Any other details that may help: ?
 
-
 <details>
 <summary><b>Click to expand: Python script to automatically collect system information</b></summary>
.github/ISSUE_TEMPLATE/2_feature_request.md

Lines changed: 0 additions & 3 deletions
@@ -9,13 +9,10 @@ assignees: ''
 ### Detailed description of the requested feature
 <!-- Description of the feature being requested. Also provide any relevant information on what the feature will be used for -->
 
-
 ### Timeline
 <!-- What time-frame do you need this feature by and what is the impact (blocker, should have, nice to have) of not having the feature -->
 
-
 ### Describe alternatives you've considered
 
-
 ### Target hardware/use case
 <!-- Target hardware/use case this feature will be used for -->

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 0 additions & 2 deletions
@@ -14,7 +14,6 @@
 ## Testing
 <!-- Mention how you have tested your change, if applicable. -->
 
-
 ## Before your PR is "*Ready for review*"
 <!-- If you haven't finished some of the above items you can still open a `Draft` PR. -->
 
@@ -24,6 +23,5 @@
 - **Did you add or update any necessary documentation?**: Yes/No
 - **Did you update [Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No <!--- Only for new features, API changes, critical bug fixes or bw breaking changes. -->
 
-
 ## Additional Information
 <!-- E.g. related issue. -->

.markdownlint-cli2.yaml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+config:
+  MD013: false # line-length
+  MD024: false # no-duplicate-heading
+  MD028: false # no-blanks-blockquote
+  MD033: false # no-inline-html
+  MD041: false # first-line-heading
+  MD059: false # descriptive-link-text
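Since the commit also wires markdownlint-cli2 into pre-commit below, a config like this is discovered automatically when the linter is run by hand from the repository root. A minimal sketch of such a run; the glob and exclusion patterns are illustrative, not part of the commit:

```bash
# markdownlint-cli2 picks up .markdownlint-cli2.yaml on its own;
# lint every Markdown file in the repo, skipping node_modules.
npm install -g markdownlint-cli2
markdownlint-cli2 "**/*.md" "#node_modules"
```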

.pre-commit-config.yaml

Lines changed: 22 additions & 23 deletions
@@ -1,7 +1,7 @@
 # NOTE: Make sure to update version in dev requirements (setup.py) as well!
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v5.0.0
+    rev: v6.0.0
     hooks:
       - id: check-added-large-files
         args: [--maxkb=500, --enforce-all]
@@ -15,35 +15,24 @@ repos:
       - id: check-merge-conflict
      - id: check-symlinks
      - id: check-toml
-      - id: check-yaml
-        args: [--allow-multiple-documents]
-      - id: debug-statements
-      - id: end-of-file-fixer
       - id: mixed-line-ending
         args: [--fix=lf]
       - id: requirements-txt-fixer
-      - id: trailing-whitespace
-
-  - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.22
-    hooks:
-      - id: mdformat
-        exclude: ^.github/
 
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.11.9
+    rev: v0.12.11
     hooks:
-      - id: ruff
+      - id: ruff-check
         args: [--fix, --exit-non-zero-on-fix]
       - id: ruff-format
 
   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.15.0
+    rev: v1.17.1
     hooks:
       - id: mypy
 
   - repo: https://github.com/pre-commit/mirrors-clang-format
-    rev: v20.1.0
+    rev: v21.1.0
     hooks:
       - id: clang-format
         types_or: [c++, c, c#, cuda, java, javascript, objective-c, proto] # no json!
@@ -131,23 +120,33 @@ repos:
           - --allow-past-years
         types_or: [shell]
 
-  - repo: https://github.com/keith/pre-commit-buildifier
-    rev: 8.0.3
-    hooks:
-      - id: buildifier
-      - id: buildifier-lint
-
   - repo: https://github.com/PyCQA/bandit
     rev: 1.7.9
     hooks:
       - id: bandit
         args: ["-c", "pyproject.toml", "-q"]
         additional_dependencies: ["bandit[toml]"]
 
+  - repo: https://github.com/DavidAnson/markdownlint-cli2
+    rev: v0.18.1
+    hooks:
+      - id: markdownlint-cli2
+        args: ["--fix"]
+
+  ##### Manual hooks (Expect many false positives)
+  # These hooks are only run with `pre-commit run --all-files --hook-stage manual <hook_id>`
+
+  # Spell checker
+  - repo: https://github.com/crate-ci/typos
+    rev: v1.35.8
+    hooks:
+      - id: typos
+        stages: [manual]
+
   # Link checker
   - repo: https://github.com/lycheeverse/lychee.git
     rev: v0.15.1
     hooks:
       - id: lychee
         args: ["--no-progress", "--exclude-loopback"]
-        stages: [manual] # Only run with `pre-commit run --all-files --hook-stage manual lychee`
+        stages: [manual]
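The new comments in the config spell out how the manual-stage hooks are invoked; for reference, the full commands would look like this (hook ids taken from the config above):

```bash
# Standard hooks run on every commit, or on demand across the repo:
pre-commit run --all-files

# Manual-stage hooks (typos, lychee) must be requested explicitly by id:
pre-commit run --all-files --hook-stage manual typos
pre-commit run --all-files --hook-stage manual lychee
```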

CHANGELOG.rst

Lines changed: 4 additions & 4 deletions
@@ -18,7 +18,7 @@ Model Optimizer Changelog (Linux)
 - Add support for QAT with HuggingFace + DeepSpeed. See ``examples/gpt_oss`` for an example.
 - Add support for QAT with LoRA. The LoRA adapters can be folded into the base model after QAT and deployed just like a regular PTQ model. See ``examples/gpt_oss`` for an example.
 - ModelOpt provides convenient trainers such as :class:`QATTrainer`, :class:`QADTrainer`, :class:`KDTrainer`, :class:`QATSFTTrainer` which inherit from Huggingface trainers.
-  ModelOpt trainers can be used as drop in replacement of the correspoding Huggingface trainer. See usage examples in ``examples/gpt_oss``, ``examples/llm_qat`` or ``examples/llm_distill``.
+  ModelOpt trainers can be used as a drop-in replacement of the corresponding Huggingface trainer. See usage examples in ``examples/gpt_oss``, ``examples/llm_qat`` or ``examples/llm_distill``.
 - (Experimental) Add quantization support for custom TensorRT ops in ONNX models.
 - Add support for Minifinetuning (MFT; https://arxiv.org/abs/2506.15702) self-corrective distillation, which enables training on small datasets with severely mitigated catastrophic forgetting.
 - Add tree decoding support for Megatron Eagle models.
@@ -55,8 +55,8 @@ Model Optimizer Changelog (Linux)
 
 - NeMo and Megatron-LM distributed checkpoints (``torch-dist``) stored with a legacy version can no longer be loaded. The remedy is to load the legacy distributed checkpoint with 0.29, store a ``torch`` checkpoint, and resume with 0.31 to convert to the new format. The following changes only apply to storing and resuming distributed checkpoints.
   - ``quantizer_state`` of :class:`TensorQuantizer <modelopt.torch.quantization.nn.modules.TensorQuantizer>` is now stored in ``extra_state`` of :class:`QuantModule <modelopt.torch.quantization.nn.module.QuantModule>` where it used to be stored in the sharded ``modelopt_state``.
-  - The dtype and shape of ``amax`` and ``pre_quant_scale`` stored in the distributed checkpoint are now retored. Some dtype and shape are previously changed to make all decoder layers to have homogeneous structure in the checkpoint.
-  - Togather with megatron.core-0.13, quantized model will store and resume distributed checkpoint in a heterogenous format.
+  - The dtype and shape of ``amax`` and ``pre_quant_scale`` stored in the distributed checkpoint are now restored. Some dtypes and shapes were previously changed to give all decoder layers a homogeneous structure in the checkpoint.
+  - Together with megatron.core-0.13, quantized models will store and resume distributed checkpoints in a heterogeneous format.
 - auto_quantize API now accepts a list of quantization config dicts as the list of quantization choices.
   - This API previously accepted a list of strings of quantization format names. It was therefore limited to pre-defined quantization formats unless worked around with hacks.
   - With this change, users can now easily use their own custom quantization formats for auto_quantize.
@@ -146,7 +146,7 @@ Model Optimizer Changelog (Linux)
 **New Features**
 
 - Support fast hadamard transform in :class:`TensorQuantizer <modelopt.torch.quantization.nn.modules.TensorQuantizer>`.
-  It can be used for rotation based quantization methods, e.g. QuaRot. Users need to install the package `fast_hadamard_transfrom <https://github.com/Dao-AILab/fast-hadamard-transform>`_ to use this feature.
+  It can be used for rotation-based quantization methods, e.g. QuaRot. Users need to install the package `fast_hadamard_transform <https://github.com/Dao-AILab/fast-hadamard-transform>`_ to use this feature.
 - Add affine quantization support for the KV cache, resolving the low accuracy issue in models such as Qwen2.5 and Phi-3/3.5.
 - Add FSDP2 support. FSDP2 can now be used for QAT.
 - Add `LiveCodeBench <https://livecodebench.github.io/>`_ and `Simple Evals <https://github.com/openai/simple-evals>`_ to the ``llm_eval`` examples.
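For the fast hadamard transform entry above, the changelog links the Dao-AILab repository. One plausible way to install it, assuming a build straight from that repo rather than a published wheel (a CUDA toolchain is required):

```bash
# Assumed install path: build fast_hadamard_transform directly from the
# GitHub repo linked in the changelog entry.
pip install git+https://github.com/Dao-AILab/fast-hadamard-transform.git
```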

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
@@ -105,13 +105,13 @@ git push origin <branch> --force-with-lease
 
 This will append the following to your commit message:
 
-```
+```text
 Signed-off-by: Your Name <[email protected]>
 ```
 
 - Full text of the Developer Certificate of Origin (DCO):
 
-```
+```text
 Developer Certificate of Origin
 Version 1.1
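For context on the snippet above: the Signed-off-by trailer is what git itself appends when committing with the sign-off flag:

```bash
# -s / --signoff appends "Signed-off-by: Your Name <email>" from your git config.
git commit -s -m "Fix typo in docs"

# Add a missing sign-off to the latest commit without changing its message.
git commit --amend -s --no-edit
```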

README.md

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ Visit our [installation guide](https://nvidia.github.io/TensorRT-Model-Optimizer
 Model Optimizer is now open source! We welcome any feedback, feature requests and PRs.
 Please read our [Contributing](./CONTRIBUTING.md) guidelines for details on how to contribute to this project.
 
-### Top Contributers
+### Top Contributors
 
 [![Contributors](https://contrib.rocks/image?repo=NVIDIA/TensorRT-Model-Optimizer)](https://github.com/NVIDIA/TensorRT-Model-Optimizer/graphs/contributors)

docker/README.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# ModelOpt Docker
+
+This folder contains the Dockerfile for the ModelOpt docker image.
+
+## Building the Docker Image
+
+To build the docker image, run the following command from the root of the repository:
+
+```bash
+bash docker/build.sh
+```
+
+The docker image will be built and tagged as `docker.io/library/modelopt_examples:latest`.
+
+> [!NOTE]
+> For ONNX PTQ, use the optimized docker image from [onnx_ptq Dockerfile](../examples/onnx_ptq/docker/) instead of this one.
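After the build, the image can be launched for interactive use. A sketch of a typical run command; the GPU flag and volume mount are assumptions, not part of the new README:

```bash
# Start the freshly built image with GPU access and the repo mounted.
docker run --gpus all -it --rm \
  -v "$(pwd)":/workspace/TensorRT-Model-Optimizer \
  modelopt_examples:latest
```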

docs/source/deployment/3_unified_hf.rst

Lines changed: 1 addition & 1 deletion
@@ -164,7 +164,7 @@ Deployment with Selected Inference Frameworks
 
    .. tab:: SGLang
 
-      Follow the `SGLang installation instructions <https://docs.sglang.ai/start/install.html>`_.
+      Follow the `SGLang installation instructions <https://docs.sglang.ai/get_started/install.html>`_.
 
       Currently we support fp8 quantized models (without fp8 kv cache) for SGLang deployment; you need to use the main branch of SGLang (since Jan 6, 2025) and build it from source.
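Since the page notes that fp8 deployment needs SGLang's main branch built from source, a from-source install might look like this (the editable-install path follows SGLang's documented layout; verify against the linked instructions):

```bash
# Build SGLang from the main branch, as the deployment note requires.
git clone https://github.com/sgl-project/sglang.git
cd sglang
pip install -e "python[all]"
```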
