
Commit d6fbdd5

add support for pipeline-parallel-size in vLLM example (#2370)

Signed-off-by: Andrew Sy Kim <[email protected]>
1 parent 3e68606 commit d6fbdd5

File tree

2 files changed: +2 −1 lines changed

ray-operator/config/samples/vllm/ray-service.vllm.yaml

Lines changed: 1 addition & 0 deletions

@@ -20,6 +20,7 @@ spec:
       env_vars:
         MODEL_ID: "meta-llama/Meta-Llama-3-8B-Instruct"
         TENSOR_PARALLELISM: "2"
+        PIPELINE_PARALLELISM: "1"
   rayClusterConfig:
     headGroupSpec:
       rayStartParams:
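As a side note on sizing (an observation about vLLM parallelism, not part of the commit): each vLLM replica needs tensor_parallel_size × pipeline_parallel_size GPUs, so the Ray cluster's worker group must supply at least that many. A minimal sketch:

```python
# Sketch, not from the commit: GPUs required per vLLM replica is the
# product of the two parallelism degrees set via the env_vars above.
def gpus_per_replica(tensor_parallel_size: int, pipeline_parallel_size: int) -> int:
    return tensor_parallel_size * pipeline_parallel_size

# The sample's values: TENSOR_PARALLELISM="2", PIPELINE_PARALLELISM="1"
print(gpus_per_replica(2, 1))  # -> 2 GPUs per replica
```

With the new PIPELINE_PARALLELISM knob left at "1", the sample's GPU footprint is unchanged; raising it multiplies the requirement accordingly.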

ray-operator/config/samples/vllm/serve.py

Lines changed: 1 addition & 1 deletion

@@ -122,4 +122,4 @@ def build_app(cli_args: Dict[str, str]) -> serve.Application:


 model = build_app(
-    {"model": os.environ['MODEL_ID'], "tensor-parallel-size": os.environ['TENSOR_PARALLELISM']})
+    {"model": os.environ['MODEL_ID'], "tensor-parallel-size": os.environ['TENSOR_PARALLELISM'], "pipeline-parallel-size": os.environ['PIPELINE_PARALLELISM']})
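A small sketch of the pattern the diff uses: environment variables set under env_vars in the RayService YAML are read with os.environ and folded into the CLI-style dict handed to build_app. The helper name below is hypothetical; the env var names and values mirror the diff.

```python
import os

# Hypothetical helper (not in the commit) showing how the env_vars from
# the YAML sample map to the dict passed to build_app in serve.py.
def cli_args_from_env(env: dict) -> dict:
    return {
        "model": env["MODEL_ID"],
        "tensor-parallel-size": env["TENSOR_PARALLELISM"],
        "pipeline-parallel-size": env["PIPELINE_PARALLELISM"],
    }

# Simulating the sample's env_vars instead of reading os.environ directly,
# so the sketch runs standalone.
args = cli_args_from_env({
    "MODEL_ID": "meta-llama/Meta-Llama-3-8B-Instruct",
    "TENSOR_PARALLELISM": "2",
    "PIPELINE_PARALLELISM": "1",
})
print(args["pipeline-parallel-size"])  # -> "1"
```

Note that os.environ values are strings, so the parallelism degrees arrive as "2" and "1" and are parsed downstream.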

0 commit comments