Commit bc49ac8
Add CLI options for backend args (like headers and verify) (#230)
This PR adds the ability to configure custom request headers and control SSL certificate verification when running benchmarks.

* The OpenAIHTTPBackend now supports passing custom headers and a verify flag to disable SSL verification.
* Headers are now merged with the following precedence: CLI arguments (--backend-args), scenario file arguments, environment variables, and then default values.
* Headers can be removed by setting their value to null in the --backend-args JSON string.
* The --backend-args help text has been updated with an example of how to use these new features.
* New documentation has been added for the CLI, configuration options, and supported data formats.
* Unit tests have been added to verify the new header and SSL verification logic, as well as the CLI argument parsing.

This provides a way to benchmark targets that require custom authentication, other headers, or use self-signed SSL certificates.

Signed-off-by: Elijah DeLee <[email protected]>
1 parent 72374ef commit bc49ac8

File tree

10 files changed: +373 -21 lines changed

docs/guides/cli.md

Lines changed: 36 additions & 1 deletion

@@ -1 +1,36 @@
-# Coming Soon
+# CLI Reference
+
+This page provides a reference for the `guidellm` command-line interface. For more advanced configuration, including environment variables and `.env` files, see the [Configuration Guide](./configuration.md).
+
+## `guidellm benchmark run`
+
+This command is the primary entrypoint for running benchmarks. It has many options that can be specified on the command line or in a scenario file.
+
+### Scenario Configuration
+
+| Option                      | Description |
+| --------------------------- | ----------- |
+| `--scenario <PATH or NAME>` | The name of a builtin scenario or path to a scenario configuration file. Options specified on the command line will override the scenario file. |
+
+### Target and Backend Configuration
+
+These options configure how `guidellm` connects to the system under test.
+
+| Option                  | Description |
+| ----------------------- | ----------- |
+| `--target <URL>`        | **Required.** The endpoint of the target system, e.g., `http://localhost:8080`. Can also be set with the `GUIDELLM__OPENAI__BASE_URL` environment variable. |
+| `--backend-type <TYPE>` | The type of backend to use. Defaults to `openai_http`. |
+| `--backend-args <JSON>` | A JSON string for backend-specific arguments. For example: `--backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}'` to pass custom headers and disable certificate verification. |
+| `--model <NAME>`        | The ID of the model to benchmark within the backend. |
+
+### Data and Request Configuration
+
+These options define the data to be used for benchmarking and how requests will be generated.
+
+| Option                    | Description |
+| ------------------------- | ----------- |
+| `--data <SOURCE>`         | The data source. This can be a HuggingFace dataset ID, a path to a local data file, or a synthetic data configuration. See the [Data Formats Guide](./data_formats.md) for more details. |
+| `--rate-type <TYPE>`      | The type of request generation strategy to use (e.g., `constant`, `poisson`, `sweep`). |
+| `--rate <NUMBER>`         | The rate of requests per second for `constant` or `poisson` strategies, or the number of steps for a `sweep`. |
+| `--max-requests <NUMBER>` | The maximum number of requests to run for each benchmark. |
+| `--max-seconds <NUMBER>`  | The maximum number of seconds to run each benchmark for. |
docs/guides/configuration.md

Lines changed: 59 additions & 1 deletion

@@ -1 +1,59 @@
-# Coming Soon
+# Configuration
+
+The `guidellm` application can be configured using command-line arguments, environment variables, or a `.env` file. This page details the file-based and environment variable configuration options.
+
+## Configuration Methods
+
+Settings are loaded with the following priority (highest priority first):
+
+1. Command-line arguments.
+2. Environment variables.
+3. Values in a `.env` file in the directory where the command is run.
+4. Default values.
+
+## Environment Variable Format
+
+All settings can be configured using environment variables. The variables must be prefixed with `GUIDELLM__`, and nested settings are separated by a double underscore `__`.
+
+For example, to set the `api_key` for the `openai` backend, you would use the following environment variable:
+
+```bash
+export GUIDELLM__OPENAI__API_KEY="your-api-key"
+```
+
+### Target and Backend Configuration
+
+You can configure the connection to the target system using environment variables. This is an alternative to using command-line flags such as `--target`.
+
+| Environment Variable                  | Description | Example |
+| ------------------------------------- | ----------- | ------- |
+| `GUIDELLM__OPENAI__BASE_URL`          | The endpoint of the target system. Equivalent to the `--target` CLI option. | `export GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"` |
+| `GUIDELLM__OPENAI__API_KEY`           | The API key to use for bearer token authentication. | `export GUIDELLM__OPENAI__API_KEY="your-secret-api-key"` |
+| `GUIDELLM__OPENAI__BEARER_TOKEN`      | The full bearer token to use for authentication. | `export GUIDELLM__OPENAI__BEARER_TOKEN="Bearer your-secret-token"` |
+| `GUIDELLM__OPENAI__HEADERS`           | A JSON string representing a dictionary of headers to send to the target. These headers will override any default headers. | `export GUIDELLM__OPENAI__HEADERS='{"Authorization": "Bearer my-token"}'` |
+| `GUIDELLM__OPENAI__ORGANIZATION`      | The OpenAI organization to use for requests. | `export GUIDELLM__OPENAI__ORGANIZATION="org-12345"` |
+| `GUIDELLM__OPENAI__PROJECT`           | The OpenAI project to use for requests. | `export GUIDELLM__OPENAI__PROJECT="proj-67890"` |
+| `GUIDELLM__OPENAI__VERIFY`            | Set to `false` or `0` to disable certificate verification. | `export GUIDELLM__OPENAI__VERIFY=false` |
+| `GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS` | The default maximum number of tokens to request for completions. | `export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=2048` |
+
+### General HTTP Settings
+
+These settings control the behavior of the underlying HTTP client.
+
+| Environment Variable                 | Description |
+| ------------------------------------ | ----------- |
+| `GUIDELLM__REQUEST_TIMEOUT`          | The timeout in seconds for HTTP requests. Defaults to 300. |
+| `GUIDELLM__REQUEST_HTTP2`            | Set to `true` or `1` to enable HTTP/2 support. Defaults to true. |
+| `GUIDELLM__REQUEST_FOLLOW_REDIRECTS` | Set to `true` or `1` to allow the client to follow redirects. Defaults to true. |
+
+### Using a `.env` file
+
+You can also place these variables in a `.env` file in your project's root directory:
+
+```dotenv
+# .env file
+GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"
+GUIDELLM__OPENAI__API_KEY="your-api-key"
+GUIDELLM__OPENAI__HEADERS='{"Authorization": "Bearer my-token"}'
+GUIDELLM__OPENAI__VERIFY=false
+```
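The `GUIDELLM__` prefix and `__` delimiter convention above maps flat environment variables onto nested settings. A minimal plain-Python sketch of that mapping (the real project most likely delegates this to its settings library; the function below is illustrative only):

```python
def load_nested_env(environ: dict, prefix: str = "GUIDELLM__", delimiter: str = "__") -> dict:
    """Build a nested settings dict from flat, prefixed environment variables."""
    settings: dict = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # unrelated variables are ignored
        path = key[len(prefix):].lower().split(delimiter)
        node = settings
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return settings


env = {
    "GUIDELLM__OPENAI__BASE_URL": "http://localhost:8080",
    "GUIDELLM__OPENAI__VERIFY": "false",
    "GUIDELLM__REQUEST_TIMEOUT": "300",
    "HOME": "/home/user",  # no prefix, so it is skipped
}
print(load_nested_env(env))
# {'openai': {'base_url': 'http://localhost:8080', 'verify': 'false'}, 'request_timeout': '300'}
```

Note the values stay strings here; type coercion (e.g., `"false"` to `False`, `"300"` to `300`) would be handled by the settings layer.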

docs/guides/data_formats.md

Lines changed: 67 additions & 0 deletions

@@ -0,0 +1,67 @@
+# Data Formats
+
+The `--data` argument for the `guidellm benchmark run` command accepts several different formats for specifying the data to be used for benchmarking.
+
+## Local Data Files
+
+You can provide a path to a local data file in one of the following formats:
+
+- **CSV (.csv)**: A comma-separated values file. The loader will attempt to find a column with a common name for the prompt (e.g., `prompt`, `text`, `instruction`).
+- **JSON (.json)**: A JSON file. The structure should be a list of objects, where each object represents a row of data.
+- **JSON Lines (.jsonl)**: A file where each line is a valid JSON object.
+- **Text (.txt)**: A plain text file, where each line is treated as a separate prompt.
+
+If the prompt column cannot be automatically determined, you can specify it using the `--data-args` option:
+
+```bash
+--data-args '{"text_column": "my_custom_prompt_column"}'
+```
+
+## Synthetic Data
+
+You can generate synthetic data on the fly by providing a configuration string or file.
+
+### Configuration Options
+
+| Parameter             | Description |
+| --------------------- | ----------- |
+| `prompt_tokens`       | **Required.** The average number of tokens for the generated prompts. |
+| `output_tokens`       | **Required.** The average number of tokens for the generated outputs. |
+| `samples`             | The total number of samples to generate. Defaults to 1000. |
+| `source`              | The source text to use for generating the synthetic data. Defaults to a built-in copy of "Pride and Prejudice". |
+| `prompt_tokens_stdev` | The standard deviation of the tokens generated for prompts. |
+| `prompt_tokens_min`   | The minimum number of text tokens generated for prompts. |
+| `prompt_tokens_max`   | The maximum number of text tokens generated for prompts. |
+| `output_tokens_stdev` | The standard deviation of the tokens generated for outputs. |
+| `output_tokens_min`   | The minimum number of text tokens generated for outputs. |
+| `output_tokens_max`   | The maximum number of text tokens generated for outputs. |
+
+### Configuration Formats
+
+You can provide the synthetic data configuration in one of three ways:
+
+1. **Key-Value String:**
+
+   ```bash
+   --data "prompt_tokens=256,output_tokens=128,samples=500"
+   ```
+
+2. **JSON String:**
+
+   ```bash
+   --data '{"prompt_tokens": 256, "output_tokens": 128, "samples": 500}'
+   ```
+
+3. **YAML or Config File:** Create a file (e.g., `my_config.yaml`):
+
+   ```yaml
+   prompt_tokens: 256
+   output_tokens: 128
+   samples: 500
+   ```
+
+   And use it with the `--data` argument:
+
+   ```bash
+   --data my_config.yaml
+   ```
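Parsing the key-value string form shown in the docs comes down to splitting on commas and `=` and coercing numeric values. The parser below is a hypothetical sketch of that idea, not guidellm's actual loader:

```python
def parse_kv_spec(spec: str) -> dict:
    """Parse a 'key=value,key=value' synthetic-data spec into a config dict."""
    config: dict = {}
    for pair in spec.split(","):
        key, _, value = pair.partition("=")
        key, value = key.strip(), value.strip()
        # Integer-looking values become ints; everything else stays a string
        config[key] = int(value) if value.lstrip("-").isdigit() else value
    return config


print(parse_kv_spec("prompt_tokens=256,output_tokens=128,samples=500"))
# {'prompt_tokens': 256, 'output_tokens': 128, 'samples': 500}
```

A value like `source=my_text.txt` would pass through unchanged as a string under this scheme.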

src/guidellm/__main__.py

Lines changed: 3 additions & 1 deletion

@@ -82,7 +82,9 @@ def benchmark():
     default=GenerativeTextScenario.get_default("backend_args"),
     help=(
         "A JSON string containing any arguments to pass to the backend as a "
-        "dict with **kwargs."
+        "dict with **kwargs. Headers can be removed by setting their value to "
+        "null. For example: "
+        """'{"headers": {"Authorization": null, "Custom-Header": "Custom-Value"}}'"""
     ),
 )
 @click.option(
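The null-removal behavior mentioned in the help text relies on JSON `null` parsing to Python `None`, which the backend then filters out. A small standalone illustration of that round trip:

```python
import json

# JSON null becomes Python None; the backend drops None-valued headers,
# so this removes the default Authorization header while adding a custom one.
backend_args = json.loads(
    '{"headers": {"Authorization": null, "Custom-Header": "Custom-Value"}}'
)
active = {k: v for k, v in backend_args["headers"].items() if v is not None}
print(active)  # {'Custom-Header': 'Custom-Value'}
```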

src/guidellm/backend/openai.py

Lines changed: 28 additions & 16 deletions

@@ -95,6 +95,8 @@ def __init__(
         extra_query: Optional[dict] = None,
         extra_body: Optional[dict] = None,
         remove_from_body: Optional[list[str]] = None,
+        headers: Optional[dict] = None,
+        verify: Optional[bool] = None,
     ):
         super().__init__(type_="openai_http")
         self._target = target or settings.openai.base_url
@@ -111,20 +113,40 @@ def __init__(
 
         self._model = model
 
+        # Start with default headers based on other params
+        default_headers: dict[str, str] = {}
         api_key = api_key or settings.openai.api_key
-        self.authorization = (
-            f"Bearer {api_key}" if api_key else settings.openai.bearer_token
-        )
+        bearer_token = settings.openai.bearer_token
+        if api_key:
+            default_headers["Authorization"] = f"Bearer {api_key}"
+        elif bearer_token:
+            default_headers["Authorization"] = bearer_token
 
         self.organization = organization or settings.openai.organization
+        if self.organization:
+            default_headers["OpenAI-Organization"] = self.organization
+
         self.project = project or settings.openai.project
+        if self.project:
+            default_headers["OpenAI-Project"] = self.project
+
+        # User-provided headers from kwargs or settings override defaults
+        merged_headers = default_headers.copy()
+        merged_headers.update(settings.openai.headers or {})
+        if headers:
+            merged_headers.update(headers)
+
+        # Remove headers with None values for backward compatibility and convenience
+        self.headers = {k: v for k, v in merged_headers.items() if v is not None}
+
         self.timeout = timeout if timeout is not None else settings.request_timeout
         self.http2 = http2 if http2 is not None else settings.request_http2
         self.follow_redirects = (
             follow_redirects
             if follow_redirects is not None
             else settings.request_follow_redirects
         )
+        self.verify = verify if verify is not None else settings.openai.verify
         self.max_output_tokens = (
             max_output_tokens
             if max_output_tokens is not None
@@ -161,9 +183,7 @@ def info(self) -> dict[str, Any]:
             "timeout": self.timeout,
             "http2": self.http2,
             "follow_redirects": self.follow_redirects,
-            "authorization": bool(self.authorization),
-            "organization": self.organization,
-            "project": self.project,
+            "headers": self.headers,
             "text_completions_path": TEXT_COMPLETIONS_PATH,
             "chat_completions_path": CHAT_COMPLETIONS_PATH,
         }
@@ -384,6 +404,7 @@ def _get_async_client(self) -> httpx.AsyncClient:
                 http2=self.http2,
                 timeout=self.timeout,
                 follow_redirects=self.follow_redirects,
+                verify=self.verify,
             )
             self._async_client = client
         else:
@@ -395,16 +416,7 @@ def _headers(self) -> dict[str, str]:
         headers = {
             "Content-Type": "application/json",
         }
-
-        if self.authorization:
-            headers["Authorization"] = self.authorization
-
-        if self.organization:
-            headers["OpenAI-Organization"] = self.organization
-
-        if self.project:
-            headers["OpenAI-Project"] = self.project
-
+        headers.update(self.headers)
         return headers
 
     def _params(self, endpoint_type: EndpointType) -> dict[str, str]:
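The merge order in `__init__` above (auth-derived defaults, then settings headers, then constructor headers, then None-stripping) can be exercised in isolation. This standalone helper mirrors that logic for illustration; it is not a function from the codebase:

```python
from typing import Optional


def merge_headers(
    default_headers: dict[str, Optional[str]],
    settings_headers: Optional[dict[str, Optional[str]]] = None,
    arg_headers: Optional[dict[str, Optional[str]]] = None,
) -> dict[str, str]:
    """Later sources win; a None value deletes the header entirely."""
    merged = dict(default_headers)
    merged.update(settings_headers or {})
    merged.update(arg_headers or {})
    return {k: v for k, v in merged.items() if v is not None}


result = merge_headers(
    {"Authorization": "Bearer default-key", "OpenAI-Organization": "org-1"},
    settings_headers={"X-Env": "from-env"},
    arg_headers={"Authorization": None, "X-Custom": "yes"},
)
print(result)  # {'OpenAI-Organization': 'org-1', 'X-Env': 'from-env', 'X-Custom': 'yes'}
```

Because stripping `None` happens once at the end, a `null` from `--backend-args` can cancel an `Authorization` header that would otherwise be synthesized from the API key.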

src/guidellm/config.py

Lines changed: 2 additions & 0 deletions

@@ -81,10 +81,12 @@ class OpenAISettings(BaseModel):
 
     api_key: Optional[str] = None
     bearer_token: Optional[str] = None
+    headers: Optional[dict[str, str]] = None
     organization: Optional[str] = None
     project: Optional[str] = None
     base_url: str = "http://localhost:8000"
     max_output_tokens: int = 16384
+    verify: bool = True
 
 
 class ReportGenerationSettings(BaseModel):
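The new `verify: bool = True` settings field pairs with the backend's `verify: Optional[bool] = None` constructor argument in a common tri-state pattern: `None` means "fall back to settings", while an explicit `False` is honored. A minimal sketch of that resolution (names here are illustrative):

```python
from typing import Optional

DEFAULT_VERIFY = True  # mirrors the `verify: bool = True` settings default


def resolve_verify(verify: Optional[bool]) -> bool:
    # None means "not specified, use the settings default";
    # an explicit False deliberately disables certificate verification.
    return verify if verify is not None else DEFAULT_VERIFY


print(resolve_verify(None), resolve_verify(False))  # True False
```

Using `is not None` rather than a truthiness check is what keeps `verify=False` from being mistaken for "unset".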

tests/unit/backend/test_openai_backend.py

Lines changed: 2 additions & 2 deletions

@@ -11,7 +11,7 @@ def test_openai_http_backend_default_initialization():
     backend = OpenAIHTTPBackend()
     assert backend.target == settings.openai.base_url
     assert backend.model is None
-    assert backend.authorization == settings.openai.bearer_token
+    assert backend.headers.get("Authorization") == settings.openai.bearer_token
     assert backend.organization == settings.openai.organization
     assert backend.project == settings.openai.project
     assert backend.timeout == settings.request_timeout
@@ -37,7 +37,7 @@ def test_openai_http_backend_intialization():
     )
     assert backend.target == "http://test-target"
     assert backend.model == "test-model"
-    assert backend.authorization == "Bearer test-key"
+    assert backend.headers.get("Authorization") == "Bearer test-key"
     assert backend.organization == "test-org"
     assert backend.project == "test-proj"
     assert backend.timeout == 10
