This repository demonstrates the optimization of the Qwen2.5-Coder-7B-Instruct model using post-training quantization (PTQ) techniques.
- OpenVINO for Intel® GPU/NPU
- This process uses OpenVINO-specific passes such as `OpenVINOOptimumConversion`, `OpenVINOIoUpdate`, and `OpenVINOEncapsulation`
- ModelBuilder for NVIDIA TensorRT for RTX GPUs
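The OpenVINO passes named above are typically wired together in a workflow configuration. The fragment below is a minimal sketch of what such a pass pipeline could look like in an Olive-style JSON config; the pass names come from this README, but the surrounding keys (`passes`, the step labels) are illustrative assumptions, not taken from this repository:

```json
{
  "passes": {
    "optimum_convert": { "type": "OpenVINOOptimumConversion" },
    "io_update": { "type": "OpenVINOIoUpdate" },
    "encapsulate": { "type": "OpenVINOEncapsulation" }
  }
}
```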
This workflow performs quantization with Optimum Intel®, running the following optimization pipeline:
- HuggingFace Model -> Quantized OpenVINO model -> Quantized encapsulated ONNX OpenVINO IR model
To try the optimized model, execute the provided inference_sample.ipynb notebook.