
Commit 86d0147

Merge branch 'main' into remove-pyhumps

2 parents: cd0d445 + 1261fe8
31 files changed: +1309 −987 lines
Lines changed: 27 additions & 0 deletions

```diff
@@ -0,0 +1,27 @@
+---
+name: Bug report
+about: Create a report to help us improve
+labels: bug
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Environment**
+Include all relevant environment information:
+1. OS [e.g. Ubuntu 20.04]:
+2. Python version [e.g. 3.12.2]:
+
+**To Reproduce**
+Exact steps to reproduce the behavior:
+
+
+**Errors**
+If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.
+
+**Additional context**
+Add any other context about the problem here. Also include any relevant files.
```

.github/ISSUE_TEMPLATE/doc-edit.md

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+---
+name: Doc edit
+about: Propose changes to project documentation
+labels: documentation
+
+---
+
+**What is the URL, file, or UI containing the proposed doc change**
+Where does one find the original content or where would this change go?
+
+**What is the current content or situation in question**
+Copy/paste the source content or describe the gap.
+
+**What is the proposed change**
+Add new content.
+
+**Additional context**
+Add any other context about the change here. Also include any relevant files or URLs.
```
Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+labels: enhancement
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
```

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 38 additions & 0 deletions

```diff
@@ -0,0 +1,38 @@
+## Summary
+
+<!--
+Include a short paragraph of the changes introduced in this PR.
+If this PR requires additional context or rationale, explain why
+the changes are necessary.
+-->
+
+## Details
+
+<!--
+Provide a detailed list of all changes introduced in this pull request.
+-->
+- [ ]
+
+## Test Plan
+
+<!--
+List the steps needed to test this PR.
+-->
+-
+
+## Related Issues
+
+<!--
+Link any relevant issues that this PR addresses.
+-->
+- Resolves #
+
+---
+
+- [ ] "I certify that all code in this PR is my own, except as noted below."
+
+## Use of AI
+
+- [ ] Includes AI-assisted code completion
+- [ ] Includes code generated by an AI application
+- [ ] Includes AI-generated tests (NOTE: AI-written tests should have a docstring that includes `## WRITTEN BY AI ##`)
```

.github/workflows/development.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -124,7 +124,7 @@ jobs:
       - name: Install dependencies
         run: pip install tox
       - name: Run unit tests
-        run: tox -e test-unit -- -m "smoke or sanity"
+        run: tox -e test-unit

   ui-unit-tests:
     permissions:
```
.github/workflows/main.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -125,7 +125,7 @@ jobs:
       - name: Install dependencies
         run: pip install tox
       - name: Run unit tests
-        run: tox -e test-unit -- -m "smoke or sanity"
+        run: tox -e test-unit

   ui-unit-tests:
     permissions:
```
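Both workflow edits drop the `-m "smoke or sanity"` marker filter so CI runs the full unit-test suite. To reproduce the updated CI step locally, the same commands shown in the workflow apply unchanged:

```bash
# Install the test runner, then run the full unit-test suite,
# exactly as the updated CI step does
pip install tox
tox -e test-unit
```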

README.md

Lines changed: 28 additions & 9 deletions

````diff
@@ -52,6 +52,25 @@ pip install git+https://github.com/vllm-project/guidellm.git

 For detailed installation instructions and requirements, see the [Installation Guide](https://github.com/vllm-project/guidellm/blob/main/docs/install.md).

+### With Podman / Docker
+
+Alternatively, we publish container images at [ghcr.io/vllm-project/guidellm](https://github.com/vllm-project/guidellm/pkgs/container/guidellm). Running a container is (by default) equivalent to `guidellm benchmark run`:
+
+```bash
+podman run \
+  --rm -it \
+  -v "./results:/results:rw" \
+  -e GUIDELLM_TARGET=http://localhost:8000 \
+  -e GUIDELLM_RATE_TYPE=sweep \
+  -e GUIDELLM_MAX_SECONDS=30 \
+  -e GUIDELLM_DATA="prompt_tokens=256,output_tokens=128" \
+  ghcr.io/vllm-project/guidellm:latest
+```
+
+> [!TIP] CLI options can also be specified as ENV variables (e.g. `--rate-type sweep` -> `GUIDELLM_RATE_TYPE=sweep`). If both are specified, the CLI option overrides the ENV variable.
+
+Replace `latest` with `stable` for the newest tagged release, or pin a specific release if desired.
+
 ### Quick Start

 #### 1. Start an OpenAI Compatible Server (vLLM)
````
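The `GUIDELLM_*` variables in the container example map one-to-one onto CLI flags, so the same benchmark can be expressed as a direct invocation when guidellm is installed locally. A minimal sketch, with the flag values copied from the container example above:

```bash
# Local equivalent of the containerized benchmark above; assumes guidellm
# is installed and a server is reachable at the target URL
guidellm benchmark run \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128"
```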
````diff
@@ -159,11 +178,19 @@ GuideLLM UI is a companion frontend for visualizing the results of a GuideLLM be

 ### 🛠 Generating an HTML report with a benchmark run

+For either pathway below, you'll need to set the output path to `benchmarks.html` for your run:
+
+```bash
+--output-path=benchmarks.html
+```
+
+Alternatively, load a saved run using the `from-file` command and also set the output to `benchmarks.html`.
+
 1. Use the Hosted Build (Recommended for Most Users)

 This is preconfigured. The latest stable version of the hosted UI (https://blog.vllm.ai/guidellm/ui/latest) will be used to build the local html file.

-Open benchmarks.html in your browser and you're done—no setup required.
+Execute your run, then open benchmarks.html in your browser and you're done—no further setup required.

 2. Build and Serve the UI Locally (For Development) This option is useful if:

@@ -187,14 +214,6 @@ export GUIDELLM__ENV=local

 Then you can execute your run.

-Set the output to benchmarks.html for your run:
-
-```bash
---output-path=benchmarks.html
-```
-
-Alternatively load a saved run using the from-file command and also set the output to benchmarks.html
-
 ## Resources

 ### Documentation
````
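As a concrete sketch of the `from-file` pathway the added README text describes (the saved-run path `benchmarks.json` is hypothetical; substitute whatever file your run produced):

```bash
# Re-render a previously saved benchmark run as a standalone HTML report
guidellm benchmark from-file benchmarks.json --output-path=benchmarks.html
```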

deploy/Containerfile

Lines changed: 16 additions & 22 deletions

```diff
@@ -1,26 +1,26 @@
-ARG PYTHON=3.13
+ARG BASE_IMAGE=docker.io/python:3.13-slim

 # Use a multi-stage build to create a lightweight production image
-FROM docker.io/python:${PYTHON}-slim as builder
+FROM $BASE_IMAGE as builder
+
+# Ensure files are installed as root
+USER root

 # Copy repository files
-COPY / /src
+COPY / /opt/app-root/src

 # Create a venv and install guidellm
-RUN python3 -m venv /opt/guidellm \
-    && /opt/guidellm/bin/pip install --no-cache-dir /src
-
-# Copy entrypoint script into the venv bin directory
-RUN install -m0755 /src/deploy/entrypoint.sh /opt/guidellm/bin/entrypoint.sh
+RUN python3 -m venv /opt/app-root/guidellm \
+    && /opt/app-root/guidellm/bin/pip install --no-cache-dir /opt/app-root/src

 # Prod image
-FROM docker.io/python:${PYTHON}-slim
+FROM $BASE_IMAGE

 # Copy the virtual environment from the builder stage
-COPY --from=builder /opt/guidellm /opt/app-root/guidellm
+COPY --from=builder /opt/app-root/guidellm /opt/app-root/guidellm

 # Add guidellm bin to PATH
-ENV PATH="/opt/guidellm/bin:$PATH"
+ENV PATH="/opt/app-root/guidellm/bin:$PATH"

 # Create a non-root user
 RUN useradd -md /results guidellm
@@ -35,14 +35,8 @@ WORKDIR /results
 LABEL org.opencontainers.image.source="https://github.com/vllm-project/guidellm" \
       org.opencontainers.image.description="GuideLLM Performance Benchmarking Container"

-# Set the environment variable for the benchmark script
-# TODO: Replace with scenario environment variables
-ENV GUIDELLM_TARGET="http://localhost:8000" \
-    GUIDELLM_MODEL="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16" \
-    GUIDELLM_RATE_TYPE="sweep" \
-    GUIDELLM_DATA="prompt_tokens=256,output_tokens=128" \
-    GUIDELLM_MAX_REQUESTS="100" \
-    GUIDELLM_MAX_SECONDS="" \
-    GUIDELLM_OUTPUT_PATH="/results/results.json"
-
-ENTRYPOINT [ "/opt/guidellm/bin/entrypoint.sh" ]
+# Argument defaults can be set with GUIDELLM_<ARG>
+ENV GUIDELLM_OUTPUT_PATH="/results/benchmarks.json"
+
+ENTRYPOINT [ "/opt/app-root/guidellm/bin/guidellm" ]
+CMD [ "benchmark", "run" ]
```
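With the entrypoint now the `guidellm` binary itself and `benchmark run` as the default CMD, any subcommand can be invoked by appending arguments after the image name, which replace the CMD. A minimal sketch (the target URL and options are illustrative, not image defaults):

```bash
# Override the default "benchmark run" CMD with explicit options;
# everything after the image name is passed to the guidellm entrypoint
podman run --rm -it \
  -v "./results:/results:rw" \
  ghcr.io/vllm-project/guidellm:latest \
  benchmark run --target "http://localhost:8000" --rate-type sweep --max-seconds 30
```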

deploy/entrypoint.sh

Lines changed: 0 additions & 43 deletions
This file was deleted.

docs/guides/cli.md

Lines changed: 36 additions & 1 deletion

```diff
@@ -1 +1,36 @@
-# Coming Soon
+# CLI Reference
+
+This page provides a reference for the `guidellm` command-line interface. For more advanced configuration, including environment variables and `.env` files, see the [Configuration Guide](./configuration.md).
+
+## `guidellm benchmark run`
+
+This command is the primary entrypoint for running benchmarks. It has many options that can be specified on the command line or in a scenario file.
+
+### Scenario Configuration
+
+| Option | Description |
+| --- | --- |
+| `--scenario <PATH or NAME>` | The name of a builtin scenario or path to a scenario configuration file. Options specified on the command line will override the scenario file. |
+
+### Target and Backend Configuration
+
+These options configure how `guidellm` connects to the system under test.
+
+| Option | Description |
+| --- | --- |
+| `--target <URL>` | **Required.** The endpoint of the target system, e.g., `http://localhost:8080`. Can also be set with the `GUIDELLM__OPENAI__BASE_URL` environment variable. |
+| `--backend-type <TYPE>` | The type of backend to use. Defaults to `openai_http`. |
+| `--backend-args <JSON>` | A JSON string for backend-specific arguments. For example: `--backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}'` to pass custom headers and disable certificate verification. |
+| `--model <NAME>` | The ID of the model to benchmark within the backend. |
+
+### Data and Request Configuration
+
+These options define the data to be used for benchmarking and how requests will be generated.
+
+| Option | Description |
+| --- | --- |
+| `--data <SOURCE>` | The data source. This can be a HuggingFace dataset ID, a path to a local data file, or a synthetic data configuration. See the [Data Formats Guide](./data_formats.md) for more details. |
+| `--rate-type <TYPE>` | The type of request generation strategy to use (e.g., `constant`, `poisson`, `sweep`). |
+| `--rate <NUMBER>` | The rate of requests per second for `constant` or `poisson` strategies, or the number of steps for a `sweep`. |
+| `--max-requests <NUMBER>` | The maximum number of requests to run for each benchmark. |
+| `--max-seconds <NUMBER>` | The maximum number of seconds to run each benchmark for. |
```
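Taken together, a hypothetical invocation combining the options documented in the new reference might look like this (the target URL, model ID, and limits are illustrative placeholders, not defaults):

```bash
# Sweep benchmark against a local OpenAI-compatible server;
# model name and limit values below are placeholders
guidellm benchmark run \
  --target "http://localhost:8000" \
  --model "my-model" \
  --data "prompt_tokens=256,output_tokens=128" \
  --rate-type sweep \
  --max-seconds 60 \
  --max-requests 100
```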
