The error you’re seeing comes from a mismatch between the ONNX model outputs and what the OpenVINO Execution Provider (EP) can parse. A few important points:

1. Model format support

  • The Intel.ML.OnnxRuntime.OpenVino NuGet package does not support all ONNX models out of the box.

  • The LLaMA model you’re trying to run (llmware/tiny-llama-chat-onnx) is quantized and uses operators that are not yet fully supported by the OpenVINO EP. That’s why you get:

    [OpenVINO-EP] Output names mismatch between OpenVINO and ONNX
    
  • Best compatibility is with standard FP32 or FP16 ONNX models. INT8 is partially supported (mainly for CNNs), but INT4/Q4-quantized models are not (see the sketch below).
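
If you do have a plain FP32 or FP16 export, the OpenVINO EP session setup is a one-liner on top of the normal ONNX Runtime C# API. A minimal sketch, assuming a hypothetical FP16 export named tinyllama_fp16.onnx (the "GPU" device string is version-dependent; older package versions expect strings like "GPU_FP16"):

    using System;
    using Microsoft.ML.OnnxRuntime;

    // Placeholder path to a standard FP32/FP16 export (not the INT4 model).
    var modelPath = "tinyllama_fp16.onnx";

    using var options = new SessionOptions();
    // Route supported nodes to OpenVINO; anything it cannot handle
    // falls back to the default CPU execution provider.
    options.AppendExecutionProvider_OpenVINO("GPU");   // or "CPU"

    using var session = new InferenceSession(modelPath, options);

    // Sanity check: these are the output names the ONNX graph declares,
    // i.e. the names the "Output names mismatch" check compares against.
    foreach (var name in session.OutputMetadata.Keys)
        Console.WriteLine(name);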

2. GPU vs CPU on UHD 630

  • Intel UHD G…
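
For the GPU-vs-CPU choice itself, it comes down to the device string passed when appending the OpenVINO EP. A minimal sketch (provider and device names assume a recent ONNX Runtime build and may differ per version):

    using System;
    using System.Linq;
    using Microsoft.ML.OnnxRuntime;

    // Confirm the OpenVINO EP is actually compiled into this build,
    // then target either the UHD 630 iGPU ("GPU") or the host CPU ("CPU").
    var providers = OrtEnv.Instance().GetAvailableProviders();
    Console.WriteLine(string.Join(", ", providers));

    using var options = new SessionOptions();
    if (providers.Contains("OpenVINOExecutionProvider"))
        options.AppendExecutionProvider_OpenVINO("GPU");   // or "CPU"

    using var session = new InferenceSession("tinyllama_fp16.onnx", options);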
