❌ device Hetero:GPU.0,GPU.1 issue on 2x A770 (Windows 11, 8331) #3162

@savvadesogle

Hello.
I have a problem using the device HETERO:GPU.0,GPU.1 together with MODEL_DISTRIBUTION_POLICY="PIPELINE_PARALLEL".
I would appreciate your help.

Windows 11
Driver 8331
2x A770 (16 GB)

It does not work with https://huggingface.co/savvadesogle/T-pro-it-2.1-int4-ov, which is based on the Qwen/Qwen3-32B architecture.
I specifically chose a model that doesn't fit into a single GPU.

❌ TEST-HETERO - NOT WORKING (no error is shown on 2025.4.1; on 2026-dev the error is shown)

import openvino_genai as ov_genai
model_path = "C:\\llm\\models\\ov\\T-pro-it-2.1-int4-ov"
device = "HETERO:GPU.0,GPU.1"
print('Device selected:', device)
pipe = ov_genai.LLMPipeline(
    model_path, 
    device, 
    MODEL_DISTRIBUTION_POLICY="PIPELINE_PARALLEL"
)
print('Pipeline initialized')
print(pipe.generate("How to make a tea?", max_new_tokens=100))
print('END')
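
Before constructing the pipeline, it can help to confirm that OpenVINO actually enumerates both GPUs under the names used in the HETERO string. The following is a minimal sketch, assuming the `openvino` Python package is installed; the helper names `hetero_device`, `missing_devices`, and `main` are hypothetical and exist only for illustration.

```python
def hetero_device(devices):
    """Join device names into a HETERO string, e.g. 'HETERO:GPU.0,GPU.1'."""
    return "HETERO:" + ",".join(devices)


def missing_devices(requested, available):
    """Return the requested device names that the runtime did not report."""
    return [d for d in requested if d not in available]


def main():
    requested = ["GPU.0", "GPU.1"]
    try:
        import openvino as ov
    except ImportError:
        print("openvino is not installed in this environment")
        return
    core = ov.Core()
    available = core.available_devices
    # Print each device the runtime reports, with its full name.
    for name in available:
        print(name, "->", core.get_property(name, "FULL_DEVICE_NAME"))
    missing = missing_devices(requested, available)
    if missing:
        print("Not reported by OpenVINO:", missing)
    else:
        print("Device string:", hetero_device(requested))


if __name__ == "__main__":
    main()
```

If either GPU is listed under "Not reported by OpenVINO", the HETERO string would need to be adjusted to the names the runtime does report.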

✅ TEST-SINGLE-GPU - WORKING

import openvino_genai as ov_genai
model_path = "C:\\llm\\models\\ov\\T-pro-it-2.1-int4-ov"
device = "GPU"
print('Device selected:', device)
pipe = ov_genai.LLMPipeline(
    model_path, 
    device
)
print('Pipeline initialized')
print(pipe.generate("How to make a tea?", max_new_tokens=100))
print('END')
metrics={'load_time (s)': 82.02, 'ttft (s)': 0.54, 'tpot (ms)': 161.09088, 'prefill_throughput (tokens/s)': 940.13, 'decode_throughput (tokens/s)': 6.20768, 'decode_duration (s)': 13.27085, 'input_token': 512, 'new_token': 80, 'total_token': 592, 'stream': False}
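
As a quick sanity check on the metrics line above, time per output token (tpot) and decode throughput are expected to be reciprocals of each other. A small sketch using the reported values; the variable names are illustrative only.

```python
tpot_ms = 161.09088      # 'tpot (ms)' from the metrics line
reported_tps = 6.20768   # 'decode_throughput (tokens/s)' from the metrics line

# Tokens per second implied by the time per output token.
derived_tps = 1000.0 / tpot_ms

print(f"derived {derived_tps:.5f} tokens/s, reported {reported_tps} tokens/s")
consistent = abs(derived_tps - reported_tps) < 1e-4
print("consistent:", consistent)
```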

PIP LIST (no error is shown in the console; the process just exits)

(genai) c:\openvino\genai\openvino.genai>uv pip list
Using Python 3.12.12 environment at: C:\Users\uuk\miniconda3\envs\genai
Package             Version
------------------- ----------
numpy               2.3.5
openvino            2025.4.1
openvino-genai      2025.4.1.0
openvino-telemetry  2025.2.0
openvino-tokenizers 2025.4.1.0
packaging           25.0
pip                 25.3
setuptools          80.9.0
wheel               0.45.1

PIP LIST (the error is shown in the console, then the process exits)

(genai) c:\openvino\genai\openvino.genai>uv pip list
Using Python 3.12.12 environment at: C:\Users\uuk\miniconda3\envs\genai
Package             Version
------------------- ----------------------
numpy               2.3.5
openvino            2026.0.0.dev20260102
openvino-genai      2026.0.0.0.dev20260102
openvino-telemetry  2025.2.0
openvino-tokenizers 2026.0.0.0.dev20260102
packaging           25.0
pip                 25.3
setuptools          80.9.0
wheel               0.45.1
(genai) c:\openvino\genai\openvino.genai>python test-hetero.py 
Device selected: HETERO:GPU.0,GPU.1
Traceback (most recent call last):
  File "c:\openvino\genai\openvino.genai\test-hetero.py", line 5, in <module>
    pipe = ov_genai.LLMPipeline(
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src\inference\src\cpp\core.cpp:119:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\plugins\hetero\src\compiled_model.cpp:36:
Standard exception from compilation library: Exception from src\inference\src\dev\plugin.cpp:53:
Check 'false' failed at src\plugins\intel_gpu\src\plugin\program_builder.cpp:163:
[GPU] ProgramBuilder build failed!
Exception from src\plugins\intel_gpu\src\runtime\ocl\ocl_common.hpp:40:
[GPU] clEnqueueNDRangeKernel, error code: -52 CL_INVALID_KERNEL_ARGS
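
When triaging a chained error like the one above, the most useful part is usually the innermost line. The sketch below pulls the OpenCL error code and name out of the message text. `opencl_error` is a hypothetical helper, and the regular expression assumes the `error code: <number> <NAME>` wording seen in this traceback.

```python
import re


def opencl_error(message):
    """Return (code, name) for the first OpenCL error in the text, or None."""
    match = re.search(r"error code:\s*(-?\d+)\s+(CL_\w+)", message)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# The last lines of the traceback from this report, used as sample input.
sample = (
    "[GPU] ProgramBuilder build failed!\n"
    "Exception from src\\plugins\\intel_gpu\\src\\runtime\\ocl\\ocl_common.hpp:40:\n"
    "[GPU] clEnqueueNDRangeKernel, error code: -52 CL_INVALID_KERNEL_ARGS"
)

print(opencl_error(sample))
```

Since the traceback shows a `RuntimeError`, the same helper could be applied to `str(err)` inside an `except RuntimeError as err:` block around the `LLMPipeline` call.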
