Commit 518b118

Fix vLLM Gemma, add vLLM extra, fix getting throughput (#36451)
1 parent a3d42ae commit 518b118

8 files changed: 17 additions & 27 deletions

.github/workflows/load-tests-pipeline-options/beam_Inference_Python_Benchmarks_Dataflow_Pytorch_Sentiment_Streaming_DistilBert_Base_Uncased.txt

Lines changed: 2 additions & 1 deletion
@@ -31,5 +31,6 @@
  --device=CPU
  --input_file=gs://apache-beam-ml/testing/inputs/sentences_50k.txt
  --runner=DataflowRunner
+ --dataflow_service_options=worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver
  --model_path=distilbert-base-uncased-finetuned-sst-2-english
- --model_state_dict_path=gs://apache-beam-ml/models/huggingface.sentiment.distilbert-base-uncased.pth
+ --model_state_dict_path=gs://apache-beam-ml/models/huggingface.sentiment.distilbert-base-uncased.pth

.github/workflows/load-tests-pipeline-options/beam_Inference_Python_Benchmarks_Dataflow_VLLM_Gemma_Batch.txt

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@
  --input=gs://apache-beam-ml/testing/inputs/sentences_50k.txt
  --machine_type=n1-standard-8
  --worker_zone=us-central1-b
- --disk_size_gb=50
+ --disk_size_gb=200
  --input_options={}
  --num_workers=8
  --max_num_workers=25
@@ -33,4 +33,4 @@
  --influx_measurement=gemma_vllm_batch
  --model_gcs_path=gs://apache-beam-ml/models/gemma-2b-it
  --dataflow_service_options=worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver
- --experiments=use_runner_v2
+ --experiments=use_runner_v2
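
The `worker_accelerator` value in the options above packs several fields into one semicolon-separated string. As a rough illustration of how that string breaks down, here is a minimal sketch using a hypothetical helper, `parse_worker_accelerator`, which is not part of Beam or Dataflow (the service parses this option itself):

```python
def parse_worker_accelerator(option: str) -> dict:
    """Split a worker_accelerator service option into its fields.

    Hypothetical helper for illustration only; it is not a Beam or
    Dataflow API.
    """
    prefix = "worker_accelerator="
    if not option.startswith(prefix):
        raise ValueError(f"expected {prefix!r} prefix, got {option!r}")
    fields = {}
    for part in option[len(prefix):].split(";"):
        key, sep, value = part.partition(":")
        # Bare flags such as install-nvidia-driver carry no value.
        fields[key] = value if sep else True
    return fields


accelerator = parse_worker_accelerator(
    "worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver")
```

Here `accelerator` holds the GPU type, the count (kept as a string), and the driver-install flag as separate entries.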

.github/workflows/refresh_looker_metrics.yml

Lines changed: 0 additions & 6 deletions
@@ -19,19 +19,13 @@ name: Refresh Looker Performance Metrics

  on:
  workflow_dispatch:
- inputs:
- READ_ONLY:
- description: 'Run in read-only mode'
- required: false
- default: 'true'

  env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  LOOKERSDK_BASE_URL: ${{ secrets.LOOKERSDK_BASE_URL }}
  LOOKERSDK_CLIENT_ID: ${{ secrets.LOOKERSDK_CLIENT_ID }}
  LOOKERSDK_CLIENT_SECRET: ${{ secrets.LOOKERSDK_CLIENT_SECRET }}
  GCS_BUCKET: 'public_looker_explores_us_a3853f40'
- READ_ONLY: ${{ inputs.READ_ONLY }}

  jobs:
  refresh_looker_metrics:

sdks/python/apache_beam/ml/inference/test_resources/vllm.dockerfile

Lines changed: 4 additions & 10 deletions
@@ -46,23 +46,17 @@ RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3 && \
  python3 -m pip install --upgrade pip setuptools wheel

  # 4) Copy the Beam SDK harness (for Dataflow workers)
- COPY --from=gcr.io/apache-beam-testing/beam-sdk/beam_python3.10_sdk:2.68.0.dev \
+ COPY --from=gcr.io/apache-beam-testing/beam-sdk/beam_python3.10_sdk:latest \
  /opt/apache/beam /opt/apache/beam

  # 5) Make sure the harness is discovered first
  ENV PYTHONPATH=/opt/apache/beam:$PYTHONPATH

  # 6) Install the Beam dev SDK from the local source package.
  # This .tar.gz file will be created by GitHub Actions workflow
- # and copied into the build context.
+ # and copied into the build context. This will include vLLM dependencies
  COPY ./sdks/python/build/apache-beam.tar.gz /tmp/beam.tar.gz
- RUN python3 -m pip install --no-cache-dir "/tmp/beam.tar.gz[gcp]"
-
- # 7) Install vLLM, and other dependencies
- RUN python3 -m pip install --no-cache-dir \
- openai>=1.52.2 \
- vllm>=0.6.3 \
- triton>=3.1.0
+ RUN python3 -m pip install --no-cache-dir "/tmp/beam.tar.gz[gcp,vllm]"

  # 8) Use the Beam boot script as entrypoint
- ENTRYPOINT ["/opt/apache/beam/boot"]
+ ENTRYPOINT ["/opt/apache/beam/boot"]

sdks/python/apache_beam/ml/inference/vllm_tests_requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@ torchvision>=0.8.2
  pillow>=8.0.0
  transformers>=4.18.0
  google-cloud-monitoring>=2.27.0
- openai>=1.52.2
+ openai>=1.52.2

sdks/python/apache_beam/testing/benchmarks/inference/vllm_gemma_benchmarks.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ def __init__(self):
  self.metrics_namespace = "BeamML_vLLM"
  super().__init__(
  metrics_namespace=self.metrics_namespace,
- pcollection="WriteBQ.out0",
+ pcollection="FormatForBQ.out0",
  )

  def test(self):

sdks/python/setup.py

Lines changed: 2 additions & 1 deletion
@@ -616,7 +616,8 @@ def get_portability_package_data():
  ],
  'xgboost': ['xgboost>=1.6.0,<2.1.3', 'datatable==1.0.0'],
  'tensorflow-hub': ['tensorflow-hub>=0.14.0,<0.16.0'],
- 'milvus': milvus_dependency
+ 'milvus': milvus_dependency,
+ 'vllm': ['openai==1.107.1', 'vllm==0.10.1.1', 'triton==3.3.1']
  },
  zip_safe=False,
  # PyPI package information.
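
The new `vllm` extra replaces the lower bounds (`>=`) that the Dockerfile used to install directly with exact pins (`==`). A minimal sketch of reading those pins back out of an extras mapping follows; `EXTRAS` and `pinned_versions` are hypothetical names for illustration, with the pin values copied from the hunk above:

```python
# Pins copied from the 'vllm' extra added in setup.py above.
EXTRAS = {
    "vllm": ["openai==1.107.1", "vllm==0.10.1.1", "triton==3.3.1"],
}


def pinned_versions(extra: str) -> dict:
    """Map each package in an extra to its exact pinned version.

    Hypothetical helper for illustration; raises if a requirement
    is not an exact '==' pin.
    """
    pins = {}
    for requirement in EXTRAS[extra]:
        name, sep, version = requirement.partition("==")
        if not sep:
            raise ValueError(f"{requirement!r} is not an exact pin")
        pins[name] = version
    return pins


vllm_pins = pinned_versions("vllm")
```

With the extra in place, installing the package with `[gcp,vllm]`, as the Dockerfile change does, should pull in these three pinned packages alongside the GCP dependencies.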

website/www/site/data/performance.yaml

Lines changed: 5 additions & 5 deletions
@@ -238,15 +238,15 @@ looks:
  write:
  folder: 86
  cost:
- - id: tJWFWW3cnF2CWpmK2zZdXGvWmtNnJgrC
+ - id: J5TtpRykjwPs4W6S88FnJ28Tr8sSHpqN
  title: RunTime and EstimatedCost
  date:
- - id: J5TtpRykjwPs4W6S88FnJ28Tr8sSHpqN
+ - id: tJWFWW3cnF2CWpmK2zZdXGvWmtNnJgrC
  title: AvgThroughputBytesPerSec by Date
  - id: Jf6qGqN25Zf787DpkNDX5CBpGRvCGMXp
  title: AvgThroughputElementsPerSec by Date
  version:
- - id: dKyJy5ZKhkBdSTXRY3wZR6fXzptSs2qm
- title: AvgThroughputBytesPerSec by Version
  - id: Qwxm27qY4fqT4CxXsFfKm2g3734TFJNN
- title: AvgThroughputElementsPerSec by Version
+ title: AvgThroughputBytesPerSec by Version
+ - id: dKyJy5ZKhkBdSTXRY3wZR6fXzptSs2qm
+ title: AvgThroughputElementsPerSec by Version
