demos/code_local_assistant: 1 file changed, 5 additions, 4 deletions

@@ -33,7 +33,7 @@ Since we do not want to wait for the code to appear, we need to use smaller mode
 Code completion works in non-streaming, unary mode. Do not use instruct model, there is no chat involved in the process.
 
 Export `Qwen/Qwen2.5-Coder-1.5B`:
-```baconsolesh
+```console
 python export_model.py text_generation --source_model Qwen/Qwen2.5-Coder-1.5B --weight-format int4 --config_file_path models/config_all.json --model_repository_path models --target_device NPU --overwrite_models
 ```
 
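The context line above notes that code completion runs in unary (non-streaming) mode. As a minimal sketch of what such a request could look like against the server deployed later in this demo, the following builds an OpenAI-style completion payload with streaming disabled; the endpoint path `/v3/completions`, the port, and the use of the exported model name as the servable name are assumptions, not confirmed by this diff.

```python
import json
import urllib.request


def build_completion_payload(model, prompt, max_tokens=32):
    """Build an OpenAI-style unary completion request body (streaming off)."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stream": False,  # unary mode: the whole completion arrives in one response
    }


def post_completion(base_url, payload):
    # Assumed endpoint path; adjust to the server's actual completions route.
    req = urllib.request.Request(
        base_url + "/v3/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_completion_payload("Qwen/Qwen2.5-Coder-1.5B", "def fib(n):")
print(payload["stream"])  # False: no streaming for code completion
```

With the server from the deployment section running, `post_completion("http://localhost:8000", payload)` would return the completion in a single response body.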
@@ -54,7 +54,7 @@ Examine that workspace is set up properly `models/config_all.json`:
 }
 ```
 
-```console
+```bash
 tree models
 models
 ├── codellama
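The hunk above checks the workspace by eye with `tree` and the generated `models/config_all.json`. The same check can be scripted; the sketch below assumes the config follows the usual OpenVINO Model Server shape, with LLM graphs under `mediapipe_config_list` and classic models under `model_config_list` (an assumption about this demo's generated file).

```python
# Hypothetical helper: list the servable names declared in an OVMS config dict.
def servable_names(cfg: dict) -> list:
    names = [m["config"]["name"] for m in cfg.get("model_config_list", [])]
    names += [g["name"] for g in cfg.get("mediapipe_config_list", [])]
    return names


# Illustrative config fragment, not the actual contents of config_all.json.
example = {"mediapipe_config_list": [{"name": "Qwen/Qwen2.5-Coder-1.5B"}]}
print(servable_names(example))  # ['Qwen/Qwen2.5-Coder-1.5B']
```

Loading the real file with `json.load(open("models/config_all.json"))` and passing it through `servable_names` would confirm both models are registered before starting the server.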
@@ -100,13 +100,14 @@ Run OpenVINO Model Server with both models loaded at the same time:
 ### Windows: deploying on bare metal
 Please refer to OpenVINO Model Server installation first: [link](../../docs/deploying_server_baremetal.md)
 
-```console
+```bat
 ovms --rest_port 8000 --config_path ./models/config_all.json
 ```
 
 ### Linux: via Docker
 ```bash
-docker run -d --rm -v $(pwd)/:/workspace/ -p 8000:8000 openvino/model_server:2025.1 --rest_port 8000 --config_path /workspace/models/config_all.json
+docker run -d --rm --device /dev/accel --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) -u $(id -u):$(id -g) \
+-p 8000:8000 -v $(pwd)/:/workspace/ openvino/model_server:2025.1 --rest_port 8000 --config_path /workspace/models/config_all.json
 ```
 
 ## Set Up Visual Studio Code