Model Server does not support the GPT-OSS 20B model.
Since September, I haven't been able to run gpt-oss-20b: the GPU plugin always fails with the same error, regardless of which version I install, on both Windows and Linux.
Windows 11
Driver 8331
A770 (x2)
I followed the instructions from here:
https://github.com/openvinotoolkit/model_server/tree/main/demos/code_local_assistant
(ovms) C:\openvino\model_server>docker run -d --rm -p 9000:9000 -p 8080:8080 ^
More? -v "C:\openvino\model_server\models:/models:rw" ^
More? --name ovms-server ^
More? openvino/model_server:weekly ^
More? --port 9000 ^
More? --rest_port 8080 ^
More? --config_path /models/config_all.json
714c6524842e0acdca69957dfa19fa5153de099aae1dfe915fbd60302c998df8
(ovms) C:\openvino\model_server>docker logs ovms-server
[2025-12-29 18:10:12.156][1][serving][info][server.cpp:88] OpenVINO Model Server 2026.0.0.9944c3235
[2025-12-29 18:10:12.156][1][serving][info][server.cpp:89] OpenVINO backend 2026.0.0.0.dev20251222
[2025-12-29 18:10:12.156][1][serving][info][pythoninterpretermodule.cpp:37] PythonInterpreterModule starting
[2025-12-29 18:10:12.317][1][serving][info][pythoninterpretermodule.cpp:50] PythonInterpreterModule started
[2025-12-29 18:10:12.656][1][modelmanager][info][modelmanager.cpp:156] Available devices for Open VINO: CPU
[2025-12-29 18:10:12.657][1][serving][info][capimodule.cpp:40] C-APIModule starting
[2025-12-29 18:10:12.657][1][serving][info][capimodule.cpp:42] C-APIModule started
[2025-12-29 18:10:12.658][1][serving][info][grpcservermodule.cpp:110] GRPCServerModule starting
[2025-12-29 18:10:12.658][1][serving][info][grpcservermodule.cpp:137] Binding gRPC server to address: 0.0.0.0:9000
[2025-12-29 18:10:12.663][1][serving][info][grpcservermodule.cpp:192] GRPCServerModule started
[2025-12-29 18:10:12.663][1][serving][info][grpcservermodule.cpp:193] Started gRPC server on port 9000
[2025-12-29 18:10:12.663][1][serving][info][httpservermodule.cpp:35] HTTPServerModule starting
[2025-12-29 18:10:12.663][1][serving][info][httpservermodule.cpp:39] Will start 16 REST workers
[2025-12-29 18:10:12.666][53][serving][info][drogon_http_server.cpp:137] Binding REST server to address: 0.0.0.0:8080
[2025-12-29 18:10:12.716][1][serving][info][drogon_http_server.cpp:164] REST server listening on port 8080 with 16 unary threads and 16 streaming threads
[2025-12-29 18:10:12.716][1][serving][info][http_server.cpp:248] API key not provided via --api_key_file or API_KEY environment variable. Authentication will be disabled.
[2025-12-29 18:10:12.717][1][serving][info][httpservermodule.cpp:52] HTTPServerModule started
[2025-12-29 18:10:12.717][1][serving][info][httpservermodule.cpp:53] Started REST server at 0.0.0.0:8080
[2025-12-29 18:10:12.717][1][serving][info][servablemanagermodule.cpp:51] ServableManagerModule starting
[2025-12-29 18:10:12.738][1][serving][info][mediapipegraphdefinition.cpp:423] MediapipeGraphDefinition initializing graph nodes
[2025-12-29 18:10:12.741][1][modelmanager][info][servable_initializer.cpp:420] Initializing Language Model Continuous Batching servable
[2025-12-29 18:10:14.442][1][serving][error][servable_initializer.cpp:214] Error during llm node initialization for models_path: /models/openai/gpt-oss-20b/./ exception: Exception from src/inference/src/cpp/core.cpp:116:
Exception from src/inference/src/dev/plugin.cpp:111:
Check '!m_device_map.empty()' failed at src/plugins/intel_gpu/src/plugin/plugin.cpp:516:
[GPU] Can't get PERFORMANCE_HINT property as no supported devices found or an error happened during devices query.
[GPU] Please check OpenVINO documentation for GPU drivers setup guide.
[2025-12-29 18:10:14.442][1][modelmanager][error][servable_initializer.cpp:425] Error during LLM node resources initialization: The LLM Node resource initialization failed
[2025-12-29 18:10:14.442][1][serving][error][mediapipegraphdefinition.cpp:474] Failed to process LLM node graph openai/gpt-oss-20b
[2025-12-29 18:10:14.442][1][modelmanager][info][pipelinedefinitionstatus.hpp:59] Mediapipe: openai/gpt-oss-20b state changed to: LOADING_PRECONDITION_FAILED after handling: ValidationFailedEvent:
[2025-12-29 18:10:14.442][104][modelmanager][info][modelmanager.cpp:1200] Started model manager thread
[2025-12-29 18:10:14.442][1][serving][info][servablemanagermodule.cpp:55] ServableManagerModule started
[2025-12-29 18:10:14.442][105][modelmanager][info][modelmanager.cpp:1219] Started cleaner thread
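One thing worth noting from the log: the server only reports `Available devices for Open VINO: CPU`, and the `docker run` invocation above does not pass any GPU device into the container. A sketch of the same command with the host GPU exposed, based on how the OVMS GPU docs mount the Linux render device (the `/dev/dri` path and `--group-add` trick are Linux-specific and illustrative; Windows Docker Desktop / WSL2 GPU passthrough for Intel GPUs may need a different setup):

```shell
# Same invocation as above, but exposing the host GPU to the container.
# /dev/dri is the Linux DRM device node for the Arc A770 cards;
# --group-add grants the container the render group so the GPU plugin
# can enumerate the devices instead of falling back to CPU only.
docker run -d --rm -p 9000:9000 -p 8080:8080 \
  --device /dev/dri \
  --group-add="$(stat -c '%g' /dev/dri/render* | head -n 1)" \
  -v "$PWD/models:/models:rw" \
  --name ovms-server \
  openvino/model_server:weekly \
  --port 9000 \
  --rest_port 8080 \
  --config_path /models/config_all.json
```

With the device mounted, the `Available devices` line in the startup log should list `GPU` (or `GPU.0`/`GPU.1` for the two A770s) alongside `CPU`; if it still shows only CPU, the problem is likely the host driver or the container runtime rather than the model.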
(ovms) C:\openvino\model_server>uv pip list
Using Python 3.12.12 environment at: C:\Users\uuk\miniconda3\envs\ovms
Package Version
------------------------- ----------------------
accelerate 1.11.0
attrs 25.4.0
diffusers 0.35.2
einops 0.8.1
filelock 3.20.1
fsspec 2025.12.0
huggingface-hub 0.36.0
joblib 1.5.3
jsonschema 4.25.1
jsonschema-specifications 2025.9.1
markdown-it-py 4.0.0
mdurl 0.1.2
ml-dtypes 0.5.4
mpmath 1.3.0
networkx 3.4.2
ninja 1.13.0
nncf 3.0.0.dev0+b8243e97
numpy 2.2.6
onnx 1.20.0
openvino 2026.0.0.dev20251222
openvino-telemetry 2025.2.0
openvino-tokenizers 2026.0.0.0.dev20251222
optimum 2.1.0.dev0
optimum-intel 1.27.0.dev0+25fcb63
optimum-onnx 0.1.0.dev0
pandas 2.3.3
pillow 12.0.0
pip 25.3
pydot 3.0.4
pyparsing 3.3.1
pyyaml 6.0.3
referencing 0.37.0
regex 2025.11.3
rich 14.2.0
rpds-py 0.30.0
safetensors 0.7.0
scikit-learn 1.8.0
scipy 1.17.0rc1
sentence-transformers 5.2.0
sentencepiece 0.2.1
setuptools 80.9.0
sympy 1.14.0
tabulate 0.9.0
threadpoolctl 3.6.0
timm 1.0.22
tokenizers 0.21.4
torch 2.9.1+cpu
torchvision 0.24.1+cpu
tqdm 4.67.1
transformers 4.55.4
wheel 0.45.1
CONDA INFO
(ovms) C:\openvino\model_server>conda info
active environment : ovms
active env location : C:\Users\uuk\miniconda3\envs\ovms
shell level : 1
user config file : C:\Users\uuk\.condarc
populated config files : C:\Users\uuk\miniconda3\.condarc
C:\Users\uuk\.condarc
conda version : 25.1.1
conda-build version : not installed
python version : 3.12.9.final.0
solver : libmamba (default)
virtual packages : __archspec=1=haswell
__conda=25.1.1=0
__win=10.0.22631=0
base environment : C:\Users\uuk\miniconda3 (writable)
conda av data dir : C:\Users\uuk\miniconda3\etc\conda
conda av metadata url : None
channel URLs : https://conda.anaconda.org/conda-forge/win-64
https://conda.anaconda.org/conda-forge/noarch
https://software.repos.intel.com/python/conda/win-64
https://software.repos.intel.com/python/conda/noarch
https://repo.anaconda.com/pkgs/main/win-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/win-64
https://repo.anaconda.com/pkgs/r/noarch
https://repo.anaconda.com/pkgs/msys2/win-64
https://repo.anaconda.com/pkgs/msys2/noarch
package cache : C:\Users\uuk\miniconda3\pkgs
C:\Users\uuk\.conda\pkgs
C:\Users\uuk\AppData\Local\conda\conda\pkgs
envs directories : C:\Users\uuk\miniconda3\envs
C:\Users\uuk\.conda\envs
C:\Users\uuk\AppData\Local\conda\conda\envs
platform : win-64
user-agent : conda/25.1.1 requests/2.32.5 CPython/3.12.9 Windows/11 Windows/10.0.22631 solver/libmamba conda-libmamba-solver/25.1.1 libmambapy/2.0.5 aau/0.5.0 c/. s/. e/.
administrator : False
netrc file : None
offline mode : False