
Commit a80bb77

Merge pull request #2537 from nikhil-arm/vllm_int4_fix
[fix]: fix env issues with vLLM int4 acceleration LP
2 parents d0db28e + 1b34e40 commit a80bb77

2 files changed: +5, -4 lines

content/learning-paths/servers-and-cloud-computing/vllm-acceleration/1-overview-and-build.md

Lines changed: 4 additions & 3 deletions
@@ -42,7 +42,8 @@ Install the minimum system package used by vLLM on Arm:
 
 ```bash
 sudo apt-get update -y
-sudo apt-get install -y libnuma-dev
+sudo apt-get install -y build-essential cmake libnuma-dev
+sudo apt install python3.12-venv python3.12-dev
 ```
 
 Optional performance helper you can install now or later:
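
Not part of the patch itself: a quick way to confirm that the packages this hunk adds are actually installed is a dpkg status query, sketched below on the assumption of an Ubuntu/Debian host (matching the apt commands above).

```bash
# Check the build tools and headers the revised install line pulls in
dpkg -s build-essential cmake libnuma-dev python3.12-venv python3.12-dev \
  | grep -E '^(Package|Status):'
```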
@@ -60,9 +61,9 @@ On aarch64, vLLM’s CPU backend automatically builds with Arm Compute Library v
 Create and activate a virtual environment:
 
 ```bash
-python3 -m venv vllm_env
+python3.12 -m venv vllm_env
 source vllm_env/bin/activate
-python -m pip install --upgrade pip
+python3 -m pip install --upgrade pip
 ```
 
 Clone vLLM and install build requirements:
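
Also outside the diff: once the environment is created with python3.12 as above, a short sanity check (standard venv commands; the expected paths are assumptions based on the `vllm_env` name used in the doc) confirms that the interpreter and pip both resolve inside the venv.

```bash
source vllm_env/bin/activate
python3 --version          # expect Python 3.12.x
python3 -m pip --version   # reported location should be inside vllm_env
which python3              # should point at vllm_env/bin/python3
```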

content/learning-paths/servers-and-cloud-computing/vllm-acceleration/2-quantize-model.md

Lines changed: 1 addition & 1 deletion
@@ -135,7 +135,7 @@ This script creates a Arm KleidiAI 4‑bit quantized copy of the vLLM model and
 
 ```bash
 # DeepSeek example
-python quantize_vllm_models.py deepseek-ai/DeepSeek-V2-Lite \
+python3 quantize_vllm_models.py deepseek-ai/DeepSeek-V2-Lite \
   --scheme channelwise --method mse
 ```
 
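The switch from `python` to `python3` presumably matters because a bare `python` binary is not guaranteed to exist on the host (on Ubuntu it only appears if `python-is-python3` is installed), while `python3` resolves both inside and outside the venv. A minimal sketch of that check plus the updated invocation, assuming the `vllm_env` environment from the first file:

```bash
# Outside the venv, `python` may be absent while `python3` is present
which python python3 || true

# Inside the activated venv the updated command runs against Python 3.12
source vllm_env/bin/activate
python3 quantize_vllm_models.py deepseek-ai/DeepSeek-V2-Lite \
  --scheme channelwise --method mse
```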
