For detailed installation instructions and requirements, see the [Installation Guide](https://github.com/vllm-project/guidellm/blob/main/docs/install.md).
### With Podman / Docker
Alternatively, we publish container images at [ghcr.io/vllm-project/guidellm](https://github.com/vllm-project/guidellm/pkgs/container/guidellm). Running a container is (by default) equivalent to running `guidellm benchmark run`:
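For example, an invocation might look like the following (a hedged sketch; the benchmark flags are illustrative, and `--network host` is only needed so the container can reach a server running on the host):

```bash
# Arguments after the image name are passed through to `guidellm benchmark run`
podman run --rm --network host ghcr.io/vllm-project/guidellm:latest \
  --target http://localhost:8000 \
  --rate-type sweep \
  --max-seconds 60
```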
> [!TIP]
> CLI options can also be specified as environment variables (e.g. `--rate-type sweep` -> `GUIDELLM_RATE_TYPE=sweep`). If both are specified, the CLI option overrides the environment variable.
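Following that mapping, the same run can be sketched with environment variables (`GUIDELLM_TARGET` and `GUIDELLM_MAX_SECONDS` are assumed names, extrapolated from the documented CLI-to-ENV convention):

```bash
# GUIDELLM_RATE_TYPE comes from the tip above; the other variable names
# are assumptions that follow the same CLI -> ENV naming pattern
docker run --rm --network host \
  -e GUIDELLM_TARGET=http://localhost:8000 \
  -e GUIDELLM_RATE_TYPE=sweep \
  -e GUIDELLM_MAX_SECONDS=60 \
  ghcr.io/vllm-project/guidellm:latest
```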
Replace `latest` with `stable` for the newest tagged release or set a specific release if desired.
### Quick Start
#### 1. Start an OpenAI Compatible Server (vLLM)
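For example, vLLM can expose an OpenAI-compatible HTTP server with a single command (the model name below is only an example; substitute the model you want to benchmark):

```bash
# Serves an OpenAI-compatible API on http://localhost:8000 by default
vllm serve "Qwen/Qwen2.5-1.5B-Instruct"
```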
### GuideLLM UI

GuideLLM UI is a companion frontend for visualizing the results of a GuideLLM benchmark run.
### 🛠 Generating an HTML report with a benchmark run
For either pathway below, you'll need to set the output path to `benchmarks.html` for your run:
```bash
--output-path=benchmarks.html
```
Alternatively, load a saved run using the `from-file` command and likewise set the output to `benchmarks.html`.
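A minimal sketch, assuming the saved results file is named `benchmarks.json` and that `from-file` lives under `guidellm benchmark`:

```bash
# Rebuild the HTML report from a previously saved run (file name is an example)
guidellm benchmark from-file benchmarks.json --output-path=benchmarks.html
```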
1. Use the Hosted Build (Recommended for Most Users)
This is preconfigured. The latest stable version of the hosted UI (https://blog.vllm.ai/guidellm/ui/latest) will be used to build the local HTML file.
Execute your run, then open `benchmarks.html` in your browser and you're done; no further setup is required.
2. Build and Serve the UI Locally (For Development)

This option is useful if:
```bash
export GUIDELLM__ENV=local
```
Then you can execute your run.
This page provides a reference for the `guidellm` command-line interface. For more advanced configuration, including environment variables and `.env` files, see the [Configuration Guide](./configuration.md).
## `guidellm benchmark run`
This command is the primary entrypoint for running benchmarks. It has many options that can be specified on the command line or in a scenario file.
| Option | Description |
|---|---|
|`--scenario <PATH or NAME>`| The name of a builtin scenario or the path to a scenario configuration file. Options specified on the command line override the scenario file. |
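For instance, a scenario file can carry the defaults while a single option is overridden on the command line (a sketch; the file name is illustrative):

```bash
# --max-seconds on the CLI takes precedence over the value in the scenario file
guidellm benchmark run --scenario my-scenario.json --max-seconds 120
```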
### Target and Backend Configuration
These options configure how `guidellm` connects to the system under test.
| Option | Description |
|---|---|
|`--target <URL>`|**Required.** The endpoint of the target system, e.g., `http://localhost:8080`. Can also be set with the `GUIDELLM__OPENAI__BASE_URL` environment variable. |
|`--backend-type <TYPE>`| The type of backend to use. Defaults to `openai_http`. |
|`--backend-args <JSON>`| A JSON string of backend-specific arguments. For example, `--backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}'` passes custom headers and disables certificate verification. |
|`--model <NAME>`| The ID of the model to benchmark within the backend. |
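As a sketch, connecting to an authenticated endpoint might look like the following (the token and model name are placeholders; combine with the data options below for a full run):

```bash
# Connection-related flags only; add data and rate options for a complete run
guidellm benchmark run \
  --target http://localhost:8080 \
  --backend-type openai_http \
  --backend-args '{"headers": {"Authorization": "Bearer my-token"}}' \
  --model my-model
```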
### Data and Request Configuration
These options define the data to be used for benchmarking and how requests will be generated.
| Option | Description |
|---|---|
|`--data <SOURCE>`| The data source. This can be a HuggingFace dataset ID, a path to a local data file, or a synthetic data configuration. See the [Data Formats Guide](./data_formats.md) for more details. |
|`--rate-type <TYPE>`| The request generation strategy to use (e.g., `constant`, `poisson`, `sweep`). |
|`--rate <NUMBER>`| The rate of requests per second for the `constant` or `poisson` strategies, or the number of steps for a `sweep`. |
|`--max-requests <NUMBER>`| The maximum number of requests to run for each benchmark. |
|`--max-seconds <NUMBER>`| The maximum number of seconds to run each benchmark. |
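Putting these options together, a complete invocation might look like the following (a hedged sketch; the model name, synthetic data string, and limits are illustrative):

```bash
# Sweep across request rates against a local server, capping each benchmark at 60s
guidellm benchmark run \
  --target http://localhost:8000 \
  --model my-model \
  --data "prompt_tokens=256,output_tokens=128" \
  --rate-type sweep \
  --max-seconds 60
```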