Commit 29054ae

Fixes for docs from reviews and styling/pre-commit fixes
Signed-off-by: Mark Kurtz <[email protected]>
1 parent: c587ee0

File tree: 6 files changed, +24 -14 lines changed

README.md

Lines changed: 7 additions & 5 deletions
````diff
@@ -22,7 +22,7 @@ SLO-aware Benchmarking and Evaluation Platform for Optimizing Real-World LLM Inf
 
 **GuideLLM** is a platform for evaluating how language models perform under real workloads and configurations. It simulates end-to-end interactions with OpenAI-compatible and vLLM-native servers, generates workload patterns that reflect production usage, and produces detailed reports that help teams understand system behavior, resource needs, and operational limits. GuideLLM supports real and synthetic datasets, multimodal inputs, and flexible execution profiles, giving engineering and ML teams a consistent framework for assessing model behavior, tuning deployments, and planning capacity as their systems evolve.
 
-## Why GuideLLM?
+### Why GuideLLM?
 
 GuideLLM gives teams a clear picture of performance, efficiency, and reliability when deploying LLMs in production-like environments.
````

````diff
@@ -144,6 +144,8 @@ The console provides a lightweight summary with high-level statistics for each b
 
 This file is the authoritative record of the entire benchmark session. It includes configuration, metadata, per-benchmark statistics, and sample request entries with individual request timings. Use it for debugging, deeper analysis, or loading into Python with `GenerativeBenchmarksReport`.
 
+Alternatively, a YAML version of this file, with the same content as `benchmarks.json`, can be generated for easier human readability using the `--outputs yaml` argument.
+
 **benchmarks.csv**
 
 This file provides a compact tabular view of each benchmark with the fields most commonly used for reporting—throughput, latency percentiles, token counts, and rate information. It opens cleanly in spreadsheets and BI tools and is well-suited for comparisons across runs.
````
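
A minimal sketch of the `--outputs yaml` flag documented above, assuming placeholder target and synthetic-data values:

```bash
# Sketch: write the full report as YAML (same content as benchmarks.json).
# Target URL and data spec below are placeholders, not values from the diff.
guidellm benchmark \
  --target "http://localhost:8000" \
  --data "prompt_tokens=256,output_tokens=128" \
  --outputs yaml
```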
````diff
@@ -158,7 +160,7 @@ GuideLLM supports a wide range of LLM benchmarking workflows. The examples below
 
 ### Load Patterns
 
-Different applications require different traffic shapes. This example demonstrates rate-based load testing using a constant profile at 10 requests per second, running for 20 seconds with synthetic data of 128 prompt tokens and 256 output tokens.
+Simulating different applications requires different traffic shapes. This example demonstrates rate-based load testing using a constant profile at 10 requests per second, running for 20 seconds with synthetic data of 128 prompt tokens and 256 output tokens.
 
 ```bash
 guidellm benchmark \
````
````diff
@@ -191,6 +193,7 @@ guidellm benchmark \
 - `--data`: Data source specification - accepts HuggingFace dataset IDs (prefix with `hf:`), local file paths (`.json`, `.csv`, `.jsonl`, `.txt`), or synthetic data configs (JSON object or `key=value` pairs like `prompt_tokens=256,output_tokens=128`)
 - `--data-args`: JSON object of arguments for dataset creation - commonly used to specify column mappings like `prompt_column`, `output_tokens_count_column`, or HuggingFace dataset parameters
 - `--data-samples`: Number of samples to use from the dataset - use `-1` (default) for all samples with dynamic generation, or specify a positive integer to limit sample count
+- `--processor`: Tokenizer or processor name used for generating synthetic data - if not provided and required for the dataset, automatically loads from the model; accepts HuggingFace model IDs or local paths
 
 ### Request Types and API Targets
````
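
A sketch combining the data flags documented in this hunk; the dataset ID, column mapping, sample count, and processor name are illustrative placeholders, not values from the diff:

```bash
# Sketch: benchmark against a HuggingFace dataset with a limited sample
# count and an explicit tokenizer; all identifiers below are placeholders.
guidellm benchmark \
  --target "http://localhost:8000" \
  --data "hf:org/dataset-name" \
  --data-args '{"prompt_column": "prompt"}' \
  --data-samples 500 \
  --processor "org/model-name"
```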

````diff
@@ -205,8 +208,7 @@ guidellm benchmark \
 
 **Key parameters:**
 
-- `--request-type`: Specifies the API endpoint format - options include `chat_completions` (chat API format), `completions` (text completion format), and other OpenAI-compatible request types
-- `--processor`: Tokenizer or processor name for token counting - if not provided, automatically loads from the model; accepts HuggingFace model IDs or local paths
+- `--request-type`: Specifies the API endpoint format - options include `chat_completions` (chat API format), `completions` (text completion format), `audio_transcription` (audio transcription), and `audio_translation` (audio translation).
 
 ### Using Scenarios
````
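
A minimal sketch of selecting an endpoint format with `--request-type`, using placeholder target and data values:

```bash
# Sketch: benchmark the chat completions endpoint format documented above.
guidellm benchmark \
  --target "http://localhost:8000" \
  --data "prompt_tokens=256,output_tokens=128" \
  --request-type chat_completions
```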

````diff
@@ -236,7 +238,7 @@ guidellm benchmark \
 
 **Key parameters:**
 
-- `--warmup`: Warm-up specification - values between 0 and 1 represent a percentage of total requests/time, values ≥1 represent absolute request or time counts (interpretation depends on active constraint)
+- `--warmup`: Warm-up specification - values between 0 and 1 represent a percentage of total requests/time, values ≥1 represent absolute request or time units.
 - `--cooldown`: Cool-down specification - same format as warmup, excludes final portion of benchmark from analysis to avoid shutdown effects
 - `--max-seconds`: Maximum duration in seconds for each benchmark before automatic termination
 - `--max-requests`: Maximum number of requests per benchmark before automatic termination
````
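
A sketch of the warm-up and cool-down flags under a time constraint; the values are illustrative (below 1 they are fractions of the run, at or above 1 they are absolute units):

```bash
# Sketch: 120-second run, excluding the first and last 10% from analysis.
# Target and data values are placeholders.
guidellm benchmark \
  --target "http://localhost:8000" \
  --data "prompt_tokens=128,output_tokens=256" \
  --max-seconds 120 \
  --warmup 0.1 \
  --cooldown 0.1
```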

docs/getting-started/analyze.md

Lines changed: 7 additions & 0 deletions
````diff
@@ -75,6 +75,7 @@ For deeper analysis, GuideLLM saves detailed results to multiple files by defaul
 GuideLLM supports multiple output formats that can be customized:
 
 - **JSON**: Complete benchmark data in JSON format with full request samples
+- **YAML**: Complete benchmark data in YAML format with full request samples
 - **CSV**: Summary of key metrics in CSV format suitable for spreadsheets
 - **HTML**: Interactive HTML report with tables and visualizations
 - **Console**: Terminal output displayed during execution
````
````diff
@@ -85,6 +86,12 @@ To specify which formats to generate, use the `--outputs` argument:
 guidellm benchmark --target "http://localhost:8000" --outputs json csv
 ```
 
+The `--outputs` argument additionally accepts full file names to further customize and differentiate outputs:
+
+```bash
+guidellm benchmark --target "http://localhost:8000" --outputs results/benchmarks.json results/summary.csv
+```
+
 To change the output directory, use the `--output-dir` argument:
 
 ```bash
````

docs/getting-started/benchmark.md

Lines changed: 4 additions & 3 deletions
````diff
@@ -19,7 +19,8 @@ To run a benchmark against your local vLLM server with default settings:
 ```bash
 guidellm benchmark \
   --target "http://localhost:8000" \
-  --data "prompt_tokens=256,output_tokens=128"
+  --data "prompt_tokens=256,output_tokens=128" \
+  --max-seconds 60
 ```
 
 This command:
````
````diff
@@ -63,7 +64,7 @@ GuideLLM supports several benchmark profiles and strategies:
 
 ### Data Options
 
-For synthetic data, you can customize:
+For synthetic data, key options include (among others):
 
 - `prompt_tokens`: Average number of tokens for prompts
 - `output_tokens`: Average number of tokens for outputs
````
````diff
@@ -72,7 +73,7 @@ For synthetic data, you can customize:
 For a complete list of options, run:
 
 ```bash
-guidellm benchmark --help
+guidellm benchmark run --help
 ```
 
 ## Working with Real Data
````

docs/guides/outputs.md

Lines changed: 5 additions & 4 deletions
````diff
@@ -55,14 +55,15 @@ GuideLLM supports saving benchmark results to files in various formats, includin
 ### Supported File Formats
 
 1. **JSON**: Contains all benchmark results, including full statistics and request data. This format is ideal for reloading into Python for in-depth analysis.
-2. **CSV**: Provides a summary of the benchmark data, focusing on key metrics and statistics. Note that CSV does not include detailed request-level data.
-3. **HTML**: Interactive HTML report with tables and visualizations of benchmark results.
-4. **Console**: Terminal output displayed during execution (can be disabled).
+2. **YAML**: Contains all benchmark results, including full statistics and request data, in a human-readable YAML format that is easy to work with in various tools.
+3. **CSV**: Provides a summary of the benchmark data, focusing on key metrics and statistics. Note that CSV does not include detailed request-level data.
+4. **HTML**: Interactive HTML report with tables and visualizations of benchmark results.
+5. **Console**: Terminal output displayed during execution (can be disabled).
 
 ### Configuring File Outputs
 
 - **Output Directory**: Use the `--output-dir` argument to specify the directory for saving the results. By default, files are saved in the current directory.
-- **Output Formats**: Use the `--outputs` argument to specify which formats to generate. By default, JSON, CSV, and HTML are generated.
+- **Output Formats**: Use the `--outputs` argument to specify which formats or exact file names (with supported file extensions, e.g. `benchmarks.json`) to generate. By default, JSON, CSV, and HTML are generated.
 - **Sampling**: To limit the size of the output files and number of detailed request samples included, you can configure sampling options using the `--sample-requests` argument.
 
 Example command to save results in specific formats:
````
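
The example command itself falls outside this hunk; a plausible sketch based on the flags documented above (file names and values are placeholders, not recovered from the diff):

```bash
# Sketch: named output files in ./results with a capped number of sampled
# requests; all values below are placeholders.
guidellm benchmark \
  --target "http://localhost:8000" \
  --data "prompt_tokens=256,output_tokens=128" \
  --output-dir results \
  --outputs benchmarks.json summary.csv \
  --sample-requests 20
```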

mkdocs.yml

Lines changed: 0 additions & 1 deletion
````diff
@@ -118,4 +118,3 @@ extra_css:
 extra_javascript:
   - scripts/mathjax.js
   - https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js
-
````

src/guidellm/__main__.py

Lines changed: 1 addition & 1 deletion
````diff
@@ -11,7 +11,7 @@
 Example:
 ::
     # Run a benchmark against a model
-    guidellm benchmark --target http://localhost:8000 --data dataset.json \\
+    guidellm benchmark run --target http://localhost:8000 --data dataset.json \\
         --profile sweep
 
     # Preprocess a dataset
````
