
Commit c77d076

Merge branch 'main' into update-ui-readme

2 parents ac8bff8 + 71f1e3c

File tree: 17 files changed, +520 and -89 lines

.github/workflows/development.yml (1 addition, 1 deletion)

```diff
@@ -124,7 +124,7 @@ jobs:
       - name: Install dependencies
         run: pip install tox
       - name: Run unit tests
-        run: tox -e test-unit -- -m "smoke or sanity"
+        run: tox -e test-unit
 
   ui-unit-tests:
     permissions:
```

.github/workflows/main.yml (1 addition, 1 deletion)

```diff
@@ -125,7 +125,7 @@ jobs:
       - name: Install dependencies
         run: pip install tox
       - name: Run unit tests
-        run: tox -e test-unit -- -m "smoke or sanity"
+        run: tox -e test-unit
 
   ui-unit-tests:
     permissions:
```
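
Both workflow changes drop the `-m "smoke or sanity"` marker filter, so CI now runs the full unit-test suite rather than only marker-selected tests. The same environment can be reproduced locally; a minimal sketch, assuming a checkout of the repository with its tox configuration:

```bash
# Install the runner and execute the same env the workflows invoke
pip install tox
tox -e test-unit

# The previous CI behavior (marker-filtered subset) is still available
# by forwarding pytest arguments through tox:
tox -e test-unit -- -m "smoke or sanity"
```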

README.md (19 additions, 0 deletions)

````diff
@@ -52,6 +52,25 @@ pip install git+https://github.com/vllm-project/guidellm.git
 
 For detailed installation instructions and requirements, see the [Installation Guide](https://github.com/vllm-project/guidellm/blob/main/docs/install.md).
 
+### With Podman / Docker
+
+Alternatively, we publish container images at [ghcr.io/vllm-project/guidellm](https://github.com/vllm-project/guidellm/pkgs/container/guidellm). Running a container is (by default) equivalent to `guidellm benchmark run`:
+
+```bash
+podman run \
+  --rm -it \
+  -v "./results:/results:rw" \
+  -e GUIDELLM_TARGET=http://localhost:8000 \
+  -e GUIDELLM_RATE_TYPE=sweep \
+  -e GUIDELLM_MAX_SECONDS=30 \
+  -e GUIDELLM_DATA="prompt_tokens=256,output_tokens=128" \
+  ghcr.io/vllm-project/guidellm:latest
+```
+
+> [!TIP] CLI options can also be specified as ENV variables (e.g., `--rate-type sweep` -> `GUIDELLM_RATE_TYPE=sweep`). If both are specified, the CLI option overrides the ENV.
+
+Replace `latest` with `stable` for the newest tagged release or set a specific release if desired.
+
 ### Quick Start
 
 #### 1. Start an OpenAI Compatible Server (vLLM)
````
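
To make the tag guidance at the end of the addition concrete, pinning looks like this (a sketch; the specific version tag is a placeholder, so check the registry page above for real tags):

```bash
# Track the newest tagged release instead of the latest build
podman pull ghcr.io/vllm-project/guidellm:stable

# Or pin an exact release (version shown is hypothetical)
podman pull ghcr.io/vllm-project/guidellm:v0.3.0
```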

deploy/Containerfile (16 additions, 22 deletions)

```diff
@@ -1,26 +1,26 @@
-ARG PYTHON=3.13
+ARG BASE_IMAGE=docker.io/python:3.13-slim
 
 # Use a multi-stage build to create a lightweight production image
-FROM docker.io/python:${PYTHON}-slim as builder
+FROM $BASE_IMAGE as builder
+
+# Ensure files are installed as root
+USER root
 
 # Copy repository files
-COPY / /src
+COPY / /opt/app-root/src
 
 # Create a venv and install guidellm
-RUN python3 -m venv /opt/guidellm \
-    && /opt/guidellm/bin/pip install --no-cache-dir /src
-
-# Copy entrypoint script into the venv bin directory
-RUN install -m0755 /src/deploy/entrypoint.sh /opt/guidellm/bin/entrypoint.sh
+RUN python3 -m venv /opt/app-root/guidellm \
+    && /opt/app-root/guidellm/bin/pip install --no-cache-dir /opt/app-root/src
 
 # Prod image
-FROM docker.io/python:${PYTHON}-slim
+FROM $BASE_IMAGE
 
 # Copy the virtual environment from the builder stage
-COPY --from=builder /opt/guidellm /opt/guidellm
+COPY --from=builder /opt/app-root/guidellm /opt/app-root/guidellm
 
 # Add guidellm bin to PATH
-ENV PATH="/opt/guidellm/bin:$PATH"
+ENV PATH="/opt/app-root/guidellm/bin:$PATH"
 
 # Create a non-root user
 RUN useradd -md /results guidellm
@@ -35,14 +35,8 @@ WORKDIR /results
 LABEL org.opencontainers.image.source="https://github.com/vllm-project/guidellm" \
       org.opencontainers.image.description="GuideLLM Performance Benchmarking Container"
 
-# Set the environment variable for the benchmark script
-# TODO: Replace with scenario environment variables
-ENV GUIDELLM_TARGET="http://localhost:8000" \
-    GUIDELLM_MODEL="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16" \
-    GUIDELLM_RATE_TYPE="sweep" \
-    GUIDELLM_DATA="prompt_tokens=256,output_tokens=128" \
-    GUIDELLM_MAX_REQUESTS="100" \
-    GUIDELLM_MAX_SECONDS="" \
-    GUIDELLM_OUTPUT_PATH="/results/results.json"
-
-ENTRYPOINT [ "/opt/guidellm/bin/entrypoint.sh" ]
+# Argument defaults can be set with GUIDELLM_<ARG>
+ENV GUIDELLM_OUTPUT_PATH="/results/benchmarks.json"
+
+ENTRYPOINT [ "/opt/app-root/guidellm/bin/guidellm" ]
+CMD [ "benchmark", "run" ]
```

deploy/entrypoint.sh (0 additions, 43 deletions)

This file was deleted.

docs/guides/cli.md (36 additions, 1 deletion)

```diff
@@ -1 +1,36 @@
-# Coming Soon
+# CLI Reference
+
+This page provides a reference for the `guidellm` command-line interface. For more advanced configuration, including environment variables and `.env` files, see the [Configuration Guide](./configuration.md).
+
+## `guidellm benchmark run`
+
+This command is the primary entrypoint for running benchmarks. It has many options that can be specified on the command line or in a scenario file.
+
+### Scenario Configuration
+
+| Option | Description |
+| --- | --- |
+| `--scenario <PATH or NAME>` | The name of a builtin scenario or path to a scenario configuration file. Options specified on the command line will override the scenario file. |
+
+### Target and Backend Configuration
+
+These options configure how `guidellm` connects to the system under test.
+
+| Option | Description |
+| --- | --- |
+| `--target <URL>` | **Required.** The endpoint of the target system, e.g., `http://localhost:8080`. Can also be set with the `GUIDELLM__OPENAI__BASE_URL` environment variable. |
+| `--backend-type <TYPE>` | The type of backend to use. Defaults to `openai_http`. |
+| `--backend-args <JSON>` | A JSON string for backend-specific arguments. For example: `--backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}'` to pass custom headers and disable certificate verification. |
+| `--model <NAME>` | The ID of the model to benchmark within the backend. |
+
+### Data and Request Configuration
+
+These options define the data to be used for benchmarking and how requests will be generated.
+
+| Option | Description |
+| --- | --- |
+| `--data <SOURCE>` | The data source. This can be a HuggingFace dataset ID, a path to a local data file, or a synthetic data configuration. See the [Data Formats Guide](./data_formats.md) for more details. |
+| `--rate-type <TYPE>` | The type of request generation strategy to use (e.g., `constant`, `poisson`, `sweep`). |
+| `--rate <NUMBER>` | The rate of requests per second for `constant` or `poisson` strategies, or the number of steps for a `sweep`. |
+| `--max-requests <NUMBER>` | The maximum number of requests to run for each benchmark. |
+| `--max-seconds <NUMBER>` | The maximum number of seconds to run each benchmark for. |
```
docs/guides/configuration.md (59 additions, 1 deletion)

````diff
@@ -1 +1,59 @@
-# Coming Soon
+# Configuration
+
+The `guidellm` application can be configured using command-line arguments, environment variables, or a `.env` file. This page details the file-based and environment variable configuration options.
+
+## Configuration Methods
+
+Settings are loaded with the following priority (highest priority first):
+
+1. Command-line arguments.
+2. Environment variables.
+3. Values in a `.env` file in the directory where the command is run.
+4. Default values.
+
+## Environment Variable Format
+
+All settings can be configured using environment variables. The variables must be prefixed with `GUIDELLM__`, and nested settings are separated by a double underscore `__`.
+
+For example, to set the `api_key` for the `openai` backend, you would use the following environment variable:
+
+```bash
+export GUIDELLM__OPENAI__API_KEY="your-api-key"
+```
+
+### Target and Backend Configuration
+
+You can configure the connection to the target system using environment variables. This is an alternative to using the `--target-*` command-line flags.
+
+| Environment Variable | Description | Example |
+| --- | --- | --- |
+| `GUIDELLM__OPENAI__BASE_URL` | The endpoint of the target system. Equivalent to the `--target` CLI option. | `export GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"` |
+| `GUIDELLM__OPENAI__API_KEY` | The API key to use for bearer token authentication. | `export GUIDELLM__OPENAI__API_KEY="your-secret-api-key"` |
+| `GUIDELLM__OPENAI__BEARER_TOKEN` | The full bearer token to use for authentication. | `export GUIDELLM__OPENAI__BEARER_TOKEN="Bearer your-secret-token"` |
+| `GUIDELLM__OPENAI__HEADERS` | A JSON string representing a dictionary of headers to send to the target. These headers will override any default headers. | `export GUIDELLM__OPENAI__HEADERS='{"Authorization": "Bearer my-token"}'` |
+| `GUIDELLM__OPENAI__ORGANIZATION` | The OpenAI organization to use for requests. | `export GUIDELLM__OPENAI__ORGANIZATION="org-12345"` |
+| `GUIDELLM__OPENAI__PROJECT` | The OpenAI project to use for requests. | `export GUIDELLM__OPENAI__PROJECT="proj-67890"` |
+| `GUIDELLM__OPENAI__VERIFY` | Set to `false` or `0` to disable certificate verification. | `export GUIDELLM__OPENAI__VERIFY=false` |
+| `GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS` | The default maximum number of tokens to request for completions. | `export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=2048` |
+
+### General HTTP Settings
+
+These settings control the behavior of the underlying HTTP client.
+
+| Environment Variable | Description |
+| --- | --- |
+| `GUIDELLM__REQUEST_TIMEOUT` | The timeout in seconds for HTTP requests. Defaults to 300. |
+| `GUIDELLM__REQUEST_HTTP2` | Set to `true` or `1` to enable HTTP/2 support. Defaults to true. |
+| `GUIDELLM__REQUEST_FOLLOW_REDIRECTS` | Set to `true` or `1` to allow the client to follow redirects. Defaults to true. |
+
+### Using a `.env` file
+
+You can also place these variables in a `.env` file in your project's root directory:
+
+```dotenv
+# .env file
+GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"
+GUIDELLM__OPENAI__API_KEY="your-api-key"
+GUIDELLM__OPENAI__HEADERS='{"Authorization": "Bearer my-token"}'
+GUIDELLM__OPENAI__VERIFY=false
+```
````
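
As a concrete illustration of the priority order, a CLI argument should beat an environment variable when both are set; a sketch with placeholder values:

```bash
# An exported variable (or .env entry) sets one target...
export GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"

# ...but the CLI flag has higher priority, so this run targets port 9000
guidellm benchmark run \
  --target http://localhost:9000 \
  --data "prompt_tokens=256,output_tokens=128"
```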

docs/guides/data_formats.md (67 additions, 0 deletions)

````diff
@@ -0,0 +1,67 @@
+# Data Formats
+
+The `--data` argument for the `guidellm benchmark run` command accepts several different formats for specifying the data to be used for benchmarking.
+
+## Local Data Files
+
+You can provide a path to a local data file in one of the following formats:
+
+- **CSV (.csv)**: A comma-separated values file. The loader will attempt to find a column with a common name for the prompt (e.g., `prompt`, `text`, `instruction`).
+- **JSON (.json)**: A JSON file. The structure should be a list of objects, where each object represents a row of data.
+- **JSON Lines (.jsonl)**: A file where each line is a valid JSON object.
+- **Text (.txt)**: A plain text file, where each line is treated as a separate prompt.
+
+If the prompt column cannot be automatically determined, you can specify it using the `--data-args` option:
+
+```bash
+--data-args '{"text_column": "my_custom_prompt_column"}'
+```
+
+## Synthetic Data
+
+You can generate synthetic data on the fly by providing a configuration string or file.
+
+### Configuration Options
+
+| Parameter | Description |
+| --- | --- |
+| `prompt_tokens` | **Required.** The average number of tokens for the generated prompts. |
+| `output_tokens` | **Required.** The average number of tokens for the generated outputs. |
+| `samples` | The total number of samples to generate. Defaults to 1000. |
+| `source` | The source text to use for generating the synthetic data. Defaults to a built-in copy of "Pride and Prejudice". |
+| `prompt_tokens_stdev` | The standard deviation of the tokens generated for prompts. |
+| `prompt_tokens_min` | The minimum number of text tokens generated for prompts. |
+| `prompt_tokens_max` | The maximum number of text tokens generated for prompts. |
+| `output_tokens_stdev` | The standard deviation of the tokens generated for outputs. |
+| `output_tokens_min` | The minimum number of text tokens generated for outputs. |
+| `output_tokens_max` | The maximum number of text tokens generated for outputs. |
+
+### Configuration Formats
+
+You can provide the synthetic data configuration in one of three ways:
+
+1. **Key-Value String:**
+
+   ```bash
+   --data "prompt_tokens=256,output_tokens=128,samples=500"
+   ```
+
+2. **JSON String:**
+
+   ```bash
+   --data '{"prompt_tokens": 256, "output_tokens": 128, "samples": 500}'
+   ```
+
+3. **YAML or Config File:** Create a file (e.g., `my_config.yaml`):
+
+   ```yaml
+   prompt_tokens: 256
+   output_tokens: 128
+   samples: 500
+   ```
+
+   And use it with the `--data` argument:
+
+   ```bash
+   --data my_config.yaml
+   ```
````
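
To tie the local-file options together, here is a minimal JSONL walkthrough (a sketch; the file name and column name are illustrative, and `--data-args` is only needed when the prompt column is not detected automatically):

```bash
# prompts.jsonl: one JSON object per line
cat > prompts.jsonl <<'EOF'
{"prompt": "Summarize the plot of Pride and Prejudice in one sentence."}
{"prompt": "List three uses for a multi-stage container build."}
EOF

# Point --data at the file and name the prompt column explicitly
guidellm benchmark run \
  --target http://localhost:8080 \
  --data prompts.jsonl \
  --data-args '{"text_column": "prompt"}'
```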

src/guidellm/__main__.py (5 additions, 2 deletions)

```diff
@@ -25,6 +25,7 @@
 
 
 @click.group()
+@click.version_option(package_name="guidellm", message="guidellm version: %(version)s")
 def cli():
     pass
 
@@ -51,7 +52,7 @@ def benchmark():
         readable=True,
         file_okay=True,
         dir_okay=False,
-        path_type=Path,  # type: ignore[type-var]
+        path_type=Path,
     ),
     click.Choice(get_builtin_scenarios()),
 ),
@@ -82,7 +83,9 @@ def benchmark():
     default=GenerativeTextScenario.get_default("backend_args"),
     help=(
         "A JSON string containing any arguments to pass to the backend as a "
-        "dict with **kwargs."
+        "dict with **kwargs. Headers can be removed by setting their value to "
+        "null. For example: "
+        """'{"headers": {"Authorization": null, "Custom-Header": "Custom-Value"}}'"""
     ),
 )
 @click.option(
```
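
With `click.version_option` registered on the group, the CLI gains a standard `--version` flag. Based on the message template in the diff, the output should look like the following (the version number is a placeholder resolved from the installed package metadata):

```bash
$ guidellm --version
guidellm version: 0.3.0
```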
