Commit 2bca299: Update windows readme

Parent: 9affa87

4 files changed (+17, -11 lines)

windows/Benchmark.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,6 +1,6 @@
 # TensorRT Model Optimizer - Windows: Benchmark Reference
 
-This document provides a summary of the performance and accuracy measurements of [TensorRT Model Optimizer - Windows](https://github.com/NVIDIA/TensorRT-Model-Optimizer) for several popular models. The benchmark results in the following tables serve as reference points and **should not be viewed as the maximum performance** achievable by Model Optimizer - Windows.
+This document provides a summary of the performance and accuracy measurements of [TensorRT Model Optimizer - Windows](./README.md) for several popular models. The benchmark results in the following tables serve as reference points and **should not be viewed as the maximum performance** achievable by Model Optimizer - Windows.
 
 ### 1 Performance And Accuracy Comparison: ONNX INT4 vs ONNX FP16 Models
 
```
windows/README.md

Lines changed: 5 additions & 5 deletions

````diff
@@ -6,7 +6,7 @@
 
 [![Documentation](https://img.shields.io/badge/Documentation-latest-brightgreen.svg?style=flat)](https://nvidia.github.io/TensorRT-Model-Optimizer/)
 [![version](https://img.shields.io/pypi/v/nvidia-modelopt?label=Release)](https://pypi.org/project/nvidia-modelopt/)
-[![license](https://img.shields.io/badge/License-MIT-blue)](./LICENSE)
+[![license](https://img.shields.io/badge/License-MIT-blue)](../LICENSE)
 
 [Examples](#examples) |
 [Benchmark Results](#benchmark-results)
@@ -42,10 +42,10 @@ ModelOpt-Windows can be installed either as a standalone toolkit or through Micr
 
 ### Standalone Toolkit Installation (with CUDA 12.x)
 
-To install ModelOpt-Windows as a standalone toolkit with CUDA 12.x support, run the following commands:
+To install ModelOpt-Windows as a standalone toolkit on CUDA 12.x systems, run the following commands:
 
 ```bash
-pip install nvidia-modelopt[onnx]~=0.19.0 --extra-index-url https://pypi.nvidia.com
+pip install nvidia-modelopt[onnx] --extra-index-url https://pypi.nvidia.com
 pip install cupy-cuda12x
 ```
 
@@ -71,12 +71,12 @@ For more details, please refer to the [detailed quantization guide](https://nvid
 
 ## Examples
 
-- [PTQ for LLMs](./onnx_ptq/README.md) covers how to use Post-training quantization (PTQ) and deployment with DirectML
+- [PTQ for LLMs](./onnx_ptq/README.md) covers how to use ONNX Post-Training Quantization (PTQ) and deployment with DirectML
 - [MMLU Benchmark](./accuracy_benchmark/README.md) provides an example script for MMLU benchmark and demonstrates how to run it with various popular backends like DirectML, TensorRT-LLM\* and model formats like ONNX and PyTorch\*.
 
 ## Support Matrix
 
-Please refer to [feature support matrix](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/windows/_feature_support_matrix.html) for a full list of supported features.
+Please refer to [support matrix](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/windows/_feature_support_matrix.html) for a full list of supported features and models.
 
 ## Benchmark Results
 
````
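A note on the installation change above: removing the `~=0.19.0` pin lets pip resolve the latest published `nvidia-modelopt[onnx]` release from the NVIDIA index. The following is a minimal post-install sanity check, not part of the documented workflow; it assumes `modelopt` exposes the conventional `__version__` attribute, while the CuPy call is a documented API:

```python
# Hedged post-install check for the standalone ModelOpt-Windows toolkit.
# Assumes the two pip commands from the diff above completed successfully.
import modelopt  # top-level module installed by the nvidia-modelopt wheel
import cupy      # provided by cupy-cuda12x; requires a CUDA 12.x driver

# __version__ is a common packaging convention, assumed rather than guaranteed here.
print("ModelOpt version:", getattr(modelopt, "__version__", "unknown"))

# Confirms CuPy can reach the CUDA runtime and see at least one GPU.
print("CUDA devices visible:", cupy.cuda.runtime.getDeviceCount())
```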
windows/accuracy_benchmark/README.md

Lines changed: 0 additions & 5 deletions

```diff
@@ -6,11 +6,6 @@
 - [MMLU (Massive Multitask Language Understanding)](#mmlu-massive-multitask-language-understanding)
 - [Setup](#setup)
 - [Evaluation Methods](#evaluation-methods)
-  - [1. Evaluate with ORT-DML using GenAI APIs](#1-evaluate-with-ort-dml-using-genai-apis)
-  - [2. Evaluate with ORT-DML, CUDA, and CPU using Native ORT Path](#2-evaluate-with-ort-dml-cuda-and-cpu-using-native-ort-path)
-  - [3. Evaluate the PyTorch Model of HF Weights](#3-evaluate-the-pytorch-model-of-hf-weights)
-  - [4. Evaluate the TensorRT-LLM](#4-evaluate-the-tensorrt-llm)
-  - [5. Evaluate the PyTorch Model Quantized with AutoAWQ](#5-evaluate-the-pytorch-model-quantized-with-autoawq)
 
 ## Overview
 
```
windows/onnx_ptq/README.md

Lines changed: 11 additions & 0 deletions

```diff
@@ -77,6 +77,17 @@ Refer to the following example scripts and tutorials for deployment:
 1. [ORT GenAI examples](https://github.com/microsoft/onnxruntime-genai/tree/main/examples/python)
 1. [ONNX Runtime documentation](https://onnxruntime.ai/docs/api/python/)
 
+### Model Support Matrix
+
+Model | int4_awq
+--- | ---
+Llama3.1-8B-Instruct | Yes
+Phi3.5-mini-Instruct | Yes
+Mistral-7B-Instruct-v0.3 | Yes
+Llama3.2-3B-Instruct | Yes
+Gemma-2b-it | Yes
+Nemotron Mini 4B Instruct | Yes
+
 ### Troubleshoot
 
 1. **Configure Directories**
```
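The support matrix added above lists models validated for `int4_awq` quantization. As a companion to the ORT GenAI examples linked in the diff, here is a hedged sketch of running such a quantized ONNX model through `onnxruntime-genai`; the model directory is hypothetical, and the generation loop follows the upstream Python examples, whose exact API varies between releases:

```python
# Hedged sketch: generate text from an INT4 AWQ ONNX model with onnxruntime-genai.
# The model path is hypothetical; API calls mirror the upstream examples and
# may differ in newer releases of the package.
import onnxruntime_genai as og

model = og.Model("models/llama3.1-8b-instruct-int4-awq")  # hypothetical output dir
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=128)
params.input_ids = tokenizer.encode("What is post-training quantization?")

# Token-by-token decode loop, as in the ORT GenAI Python examples.
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```

On Windows, the DirectML build of the package (`onnxruntime-genai-directml`) is typically installed so the same script runs on the DirectML execution provider without code changes.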