The error you’re seeing comes from a mismatch between the ONNX model outputs and what the OpenVINO Execution Provider (EP) can parse. A few important points:

1. Model format support

  • The Intel.ML.OnnxRuntime.OpenVino NuGet package does not support all ONNX models out of the box.

  • The LLaMA model you’re trying to run (llmware/tiny-llama-chat-onnx) is quantized and uses operators that are not yet fully supported by the OpenVINO EP. That’s why you get:

    [OpenVINO-EP] Output names mismatch between OpenVINO and ONNX
    
  • Best compatibility is with standard FP32 or FP16 ONNX models. INT8 is partially supported (mainly for CNNs), but INT4/Q4-quantized models are not (see the sketch below).
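
If you do have a plain FP32 or FP16 export, the OpenVINO EP session setup is a one-liner on top of the normal ONNX Runtime C# API. A minimal sketch, assuming a hypothetical FP16 export named tinyllama_fp16.onnx (the "GPU" device string is version-dependent; older package versions expect strings like "GPU_FP16"):

    using System;
    using Microsoft.ML.OnnxRuntime;

    // Placeholder path to a standard FP32/FP16 export (not the INT4 model).
    var modelPath = "tinyllama_fp16.onnx";

    using var options = new SessionOptions();
    // Route supported nodes to OpenVINO; anything it cannot handle
    // falls back to the default CPU execution provider.
    options.AppendExecutionProvider_OpenVINO("GPU");   // or "CPU"

    using var session = new InferenceSession(modelPath, options);

    // Sanity check: these are the output names the ONNX graph declares,
    // i.e. the names the "Output names mismatch" check compares against.
    foreach (var name in session.OutputMetadata.Keys)
        Console.WriteLine(name);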

2. GPU vs CPU on UHD 630

  • Intel UHD G…
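
For the GPU-vs-CPU choice itself, it comes down to the device string passed when appending the OpenVINO EP. A minimal sketch (provider and device names assume a recent ONNX Runtime build and may differ per version):

    using System;
    using System.Linq;
    using Microsoft.ML.OnnxRuntime;

    // Confirm the OpenVINO EP is actually compiled into this build,
    // then target either the UHD 630 iGPU ("GPU") or the host CPU ("CPU").
    var providers = OrtEnv.Instance().GetAvailableProviders();
    Console.WriteLine(string.Join(", ", providers));

    using var options = new SessionOptions();
    if (providers.Contains("OpenVINOExecutionProvider"))
        options.AppendExecutionProvider_OpenVINO("GPU");   // or "CPU"

    using var session = new InferenceSession("tinyllama_fp16.onnx", options);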
