Folder contents:

.gitignore
README.md
_copy.json.config
inference_model.json
inference_sample.ipynb
info.yml
model_project.config
phi4_ov_config.json
phi4_ov_config.json.config
phi4_ov_npu_config.json
phi4_ov_npu_config.json.config
phi4_qnn.json
phi4_qnn.json.config
phi4_vitis_ai_config.json
phi4_vitis_ai_config.json.config
requirements.txt
winml.py

Phi-4-mini-instruct Quantization

This folder contains a sample use case of Olive to optimize a Phi-4-mini-instruct model using OpenVINO tools.

Intel® GPU: Phi 4 Mini Instruct Dynamic Shape Model
Intel® NPU: Phi 4 Mini Instruct Dynamic Shape Model

Quantization Workflows

This workflow quantizes the model with Optimum Intel®. It runs the following optimization pipeline:

HuggingFace Model -> Quantized OpenVINO model -> Quantized encapsulated ONNX OpenVINO IR model

Phi 4 Mini Instruct Dynamic Shape Model

The flow in the following config file executes the above workflow, producing a dynamic shape model.
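To check which steps a given config actually runs, the passes can be listed in order. The sketch below assumes the config stores its steps in a top-level "passes" mapping where each entry has a "type" field, as Olive workflow configs generally do. The helper names `list_passes` and `describe_config` are illustrative and not part of Olive.

```python
import json
from pathlib import Path


def list_passes(config):
    """Return (pass_name, pass_type) pairs in the order they appear in the config.

    Assumes a top-level "passes" mapping; a config without one yields an empty list.
    """
    passes = config.get("passes", {})
    return [(name, spec.get("type", "<unknown>")) for name, spec in passes.items()]


def describe_config(config_path):
    """Read a config JSON file and render its passes as a one-line pipeline."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return " -> ".join(f"{name} ({pass_type})" for name, pass_type in list_passes(config))
```

For example, `print(describe_config("phi4_ov_config.json"))` would print the passes in that file as a single arrow-separated line, which can be compared against the pipeline described above.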

How to run

Setup

Install the necessary Python packages:

python -m pip install olive-ai[openvino]

Run Olive config

The optimization techniques to run are specified in the relevant config JSON file.

Optimize the model using the following command:

olive run --config <config_file.json>

Example:

olive run --config phi4_ov_config.json

Or run it from Python code:

from olive import run
workflow_output = run("<config_file.json>")

After running the above command, the model candidates and corresponding config will be saved in the output directory.
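When calling Olive from a script, it can help to confirm that the config file exists and parses as JSON before starting a long optimization run. The following is a minimal sketch that assumes `from olive import run` accepts a config path, as shown above. The helper names `load_config` and `run_olive_config` are hypothetical and not part of Olive.

```python
import json
from pathlib import Path


def load_config(config_path):
    """Parse the config so a wrong path or a JSON syntax error fails fast."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Olive config not found: {path}")
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def run_olive_config(config_path):
    """Hypothetical wrapper: validate the config file, then hand it to Olive."""
    load_config(config_path)
    # Imported lazily so the validation above works even without Olive installed.
    from olive import run
    return run(str(config_path))
```

A call such as `workflow_output = run_olive_config("phi4_ov_config.json")` then behaves like the two-line snippet above, with an earlier and clearer error if the file is missing or malformed.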

(Optional) Run Console-Based Chat Interface

To run ONNX OpenVINO IR Encapsulated GenAI models, set up the latest ONNX Runtime GenAI with ONNX Runtime OpenVINO EP support.

The sample chat app is model-chat.py in the onnxruntime-genai GitHub repository.

Once setup is complete, run the sample with a command of the following form:

python model-chat.py -e follow_config -v -g -m models/<model_folder>/model/

Example:

python model-chat.py -e follow_config -v -g -m models/Phi-4-mini-instruct/model/
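The command above can also be assembled from a script, for example to launch the chat app for different model folders. This sketch only reuses the flags shown above and assumes that model-chat.py has been copied into the working directory and that models live under models/. The function names `build_chat_command` and `launch_chat` are illustrative.

```python
import subprocess
import sys
from pathlib import Path


def build_chat_command(model_folder, script="model-chat.py", models_root="models"):
    """Build the argument list for the sample chat command shown above."""
    model_dir = Path(models_root) / model_folder / "model"
    return [
        sys.executable, script,
        "-e", "follow_config",
        "-v", "-g",
        "-m", model_dir.as_posix() + "/",
    ]


def launch_chat(model_folder):
    """Run the chat app in the current console; raises if it exits with an error."""
    subprocess.run(build_chat_command(model_folder), check=True)
```

Calling `launch_chat("Phi-4-mini-instruct")` corresponds to the example command above, using the current Python interpreter instead of whichever `python` is first on the PATH.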
