Commit 80b9f8e

Merge pull request #2432 from chrismoroney/cmoroney-vllm-on-arm-last-reviewed-10-2025
Build and Run vLLM on Arm Servers LP - update dtype
2 parents b962a17 + bea94d7 commit 80b9f8e

File tree

2 files changed: +5 −2 lines changed


content/learning-paths/servers-and-cloud-computing/vllm/vllm-run.md

Lines changed: 4 additions & 1 deletion

```diff
@@ -41,11 +41,14 @@ prompts = [
     "Write a hello world program in Rust",
 ]
 
+# Modify model here
+MODEL = "Qwen/Qwen2.5-0.5B-Instruct"
+
 # Create a sampling params object.
 sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)
 
 # Create an LLM.
-llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct", dtype="bfloat16")
+llm = LLM(model=MODEL, dtype="bfloat16")
 
 # Generate texts from the prompts. The output is a list of RequestOutput objects
 # that contain the prompt, generated text, and other information.
```
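The diff above lifts the hardcoded model name into a single `MODEL` constant so readers only edit one line to swap models. A minimal sketch of taking that idea one step further, so the model can be changed without editing the script at all; the `VLLM_MODEL` environment variable here is a hypothetical name for illustration, not something defined by the commit or by vLLM:

```python
import os

# Default model from the commit's MODEL constant.
DEFAULT_MODEL = "Qwen/Qwen2.5-0.5B-Instruct"

def resolve_model(env=None):
    """Return the model name, preferring an environment override.

    Checks the (hypothetical) VLLM_MODEL environment variable and
    falls back to DEFAULT_MODEL when it is unset.
    """
    env = os.environ if env is None else env
    return env.get("VLLM_MODEL", DEFAULT_MODEL)

# The resolved name would then be passed straight through, e.g.:
#   llm = LLM(model=resolve_model(), dtype="bfloat16")
```

With this pattern, running `VLLM_MODEL=some/other-model python script.py` picks a different model while the unmodified script keeps the commit's default.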

content/learning-paths/servers-and-cloud-computing/vllm/vllm-setup.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -8,7 +8,7 @@ layout: learningpathall
 
 ## Before you begin
 
-To follow the instructions for this Learning Path, you will need an Arm server running Ubuntu 24.04 LTS with at least 8 cores, 16GB of RAM, and 50GB of disk storage.
+To follow the instructions for this Learning Path, you will need an Arm server running Ubuntu 24.04 LTS with at least 8 cores, 16GB of RAM, and 50GB of disk storage. The instructions have been tested on an AWS Graviton3 m7g.2xlarge instance.
 
 ## What is vLLM?
 
```

0 commit comments