README.md (4 additions & 4 deletions)

````diff
@@ -68,12 +68,12 @@ For information on starting other supported inference servers or platforms, see
 
 #### 2. Run a GuideLLM Benchmark
 
-To run a GuideLLM benchmark, use the `guidellm benchmark` command with the target set to an OpenAI-compatible server. For this example, the target is set to 'http://localhost:8000', assuming that vLLM is active and running on the same server. Otherwise, update it to the appropriate location. By default, GuideLLM automatically determines the model available on the server and uses it. To target a different model, pass the desired name with the `--model` argument. Additionally, the `--rate-type` is set to `sweep`, which automatically runs a range of benchmarks to determine the minimum and maximum rates that the server and model can support. Each benchmark run under the sweep will run for 30 seconds, as set by the `--max-seconds` argument. Finally, `--data` is set to a synthetic dataset with 256 prompt tokens and 128 output tokens per request. For more arguments, supported scenarios, and configurations, jump to the [Configurations Section](#configurations) or run `guidellm benchmark --help`.
+To run a GuideLLM benchmark, use the `guidellm benchmark run` command with the target set to an OpenAI-compatible server. For this example, the target is set to 'http://localhost:8000', assuming that vLLM is active and running on the same server. Otherwise, update it to the appropriate location. By default, GuideLLM automatically determines the model available on the server and uses it. To target a different model, pass the desired name with the `--model` argument. Additionally, the `--rate-type` is set to `sweep`, which automatically runs a range of benchmarks to determine the minimum and maximum rates that the server and model can support. Each benchmark run under the sweep will run for 30 seconds, as set by the `--max-seconds` argument. Finally, `--data` is set to a synthetic dataset with 256 prompt tokens and 128 output tokens per request. For more arguments, supported scenarios, and configurations, jump to the [Configurations Section](#configurations) or run `guidellm benchmark --help`.
 
 Now, to start benchmarking, run the following command:
 
 ```bash
-guidellm benchmark \
+guidellm benchmark run \
   --target "http://localhost:8000" \
   --rate-type sweep \
   --max-seconds 30 \
````
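The hunk cuts off before the end of the code block. Read together with the paragraph above and the matching examples in docs/outputs.md later in this diff, the full command after the change would presumably look like the sketch below; the final `--data` line is inferred from those sources rather than quoted from the file.

```bash
# Sweep benchmark against a local OpenAI-compatible server (e.g. vLLM on port 8000),
# using a synthetic dataset of 256 prompt tokens and 128 output tokens per request.
guidellm benchmark run \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128"
```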
````diff
@@ -110,11 +110,11 @@ For further details on determining the optimal request rate and SLOs, refer to t
 
 ### Configurations
 
-GuideLLM offers a range of configurations through both the benchmark CLI command and environment variables, which provide default values and more granular controls. The most common configurations are listed below. A complete list is easily accessible, though, by running `guidellm benchmark --help` or `guidellm config` respectively.
+GuideLLM offers a range of configurations through both the benchmark CLI command and environment variables, which provide default values and more granular controls. The most common configurations are listed below. A complete list is easily accessible, though, by running `guidellm benchmark run --help` or `guidellm config` respectively.
 
 #### Benchmark CLI
 
-The `guidellm benchmark` command is used to run benchmarks against a generative AI backend/server. The command accepts a variety of arguments to customize the benchmark run. The most common arguments include:
+The `guidellm benchmark run` command is used to run benchmarks against a generative AI backend/server. The command accepts a variety of arguments to customize the benchmark run. The most common arguments include:
 
 - `--target`: Specifies the target path for the backend to run benchmarks against. For example, `http://localhost:8000`. This is required to define the server endpoint.
````
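The paragraph in the first hunk notes that GuideLLM auto-detects the served model and that `--model` overrides that choice. A minimal sketch of the override; the model name below is a hypothetical placeholder, not something taken from this diff.

```bash
# Override the auto-detected model; "my-served-model" is a hypothetical
# placeholder for whatever identifier the server actually reports.
guidellm benchmark run \
  --target "http://localhost:8000" \
  --model "my-served-model" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128"
```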
docs/outputs.md (8 additions & 8 deletions)

````diff
@@ -5,7 +5,7 @@ GuideLLM provides flexible options for outputting benchmark results, catering to
 For all of the output formats, `--output-extras` can be used to include additional information. This could include tags, metadata, hardware details, and other relevant information that can be useful for analysis. This must be supplied as a JSON encoded string. For example:
 
 ```bash
-guidellm benchmark \
+guidellm benchmark run \
   --target "http://localhost:8000" \
   --rate-type sweep \
   --max-seconds 30 \
````
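This hunk also ends before the flag the paragraph introduces. As a rough illustration of `--output-extras` taking a JSON-encoded string, with an invented payload (the keys and values are placeholders, not content from the file):

```bash
# Attach extra metadata to the saved results as a JSON-encoded string;
# the tag/hardware values here are illustrative placeholders only.
guidellm benchmark run \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128" \
  --output-extras '{"tag": "baseline", "hardware": "example-gpu"}'
```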
````diff
@@ -26,21 +26,21 @@ By default, GuideLLM displays benchmark results and progress directly in the con
 
 ### Disabling Console Output
 
-To disable the progress outputs to the console, use the `disable-progress` flag when running the `guidellm benchmark` command. For example:
+To disable the progress outputs to the console, use the `disable-progress` flag when running the `guidellm benchmark run` command. For example:
 
 ```bash
-guidellm benchmark \
+guidellm benchmark run \
   --target "http://localhost:8000" \
   --rate-type sweep \
   --max-seconds 30 \
   --data "prompt_tokens=256,output_tokens=128" \
   --disable-progress
 ```
 
-To disable console output, use the `--disable-console-outputs` flag when running the `guidellm benchmark` command. For example:
+To disable console output, use the `--disable-console-outputs` flag when running the `guidellm benchmark run` command. For example:
 
 ```bash
-guidellm benchmark \
+guidellm benchmark run \
   --target "http://localhost:8000" \
   --rate-type sweep \
   --max-seconds 30 \
````
````diff
@@ -50,10 +50,10 @@ guidellm benchmark \
 
 ### Enabling Extra Information
 
-GuideLLM includes the option to display extra information during the benchmark runs to monitor the overheads and performance of the system. This can be enabled by using the `--display-scheduler-stats` flag when running the `guidellm benchmark` command. For example:
+GuideLLM includes the option to display extra information during the benchmark runs to monitor the overheads and performance of the system. This can be enabled by using the `--display-scheduler-stats` flag when running the `guidellm benchmark run` command. For example:
 
 ```bash
-guidellm benchmark \
+guidellm benchmark run \
   --target "http://localhost:8000" \
   --rate-type sweep \
   --max-seconds 30 \
````
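As with the earlier hunks, the example is truncated at the hunk boundary. Going by the parallel `--disable-progress` example above, the complete command presumably finishes with the data argument and the flag itself:

```bash
# Display scheduler overhead and performance statistics during the run.
guidellm benchmark run \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128" \
  --display-scheduler-stats
```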
````diff
@@ -81,7 +81,7 @@ GuideLLM supports saving benchmark results to files in various formats, includin
````