Creating Model-specific assets for QNN EP after fine-tuning #1518

tc-wolf · 2025-05-30T14:35:08Z


In https://onnxruntime.ai/docs/genai/howto/build-models-for-snapdragon.html#add-other-assets there is a reference to model-specific assets (tokenizer, quantizer/dequantizer/position processor ONNX models). These are provided for Phi-3.5-mini-instruct and Llama 3.2 3B Instruct, but how are these created?

If I wanted to deploy a fused fine-tuned model based on Llama 3.2 3B (or even a non-fine-tuned Llama 3.2 1B), I'd have to recreate these, and I can't find any information on how to do so.


Replies: 0 comments
