Commit 0f0fae0

updates
1 parent 88de5b0 commit 0f0fae0

1 file changed, +1 -1 lines changed

src/inference_endpoint/testing/variable_throughput_server.py

Lines changed: 1 addition & 1 deletion
@@ -523,7 +523,7 @@ class VariableResponseServer:
     output_len_mean: Mean output sequence length (chars).
     output_len_spread: Coefficient of variation for output length.
     output_len_min: Minimum output sequence length (chars).
-    output_len_max: Maximum output sequence length (chars). None = 8 * mean.
+    output_len_max: Maximum output sequence length (chars). None = 2 * mean.
     response_rate_mean: Per-request response rate mean (responses/sec). 0 = no rate mode.
     response_rate_spread: CoV for per-request response rate.
     inter_token_latency: Per-token delay (TPOT) mean in milliseconds. 0 = no ICL mode.
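
For readers unfamiliar with these parameters, the following is a minimal sketch of how the documented default could interact with the other length settings. It is not taken from `variable_throughput_server.py`: the helper names `resolve_output_len_max` and `sample_output_len` are hypothetical, and the normal distribution is an assumption, since the diff does not show which distribution the server samples from. Only the parameter names and the `None = 2 * mean` default come from the docstring.

```python
import random
from typing import Optional


def resolve_output_len_max(output_len_mean: float,
                           output_len_max: Optional[int] = None) -> int:
    """Hypothetical helper: apply the documented default (None = 2 * mean)."""
    if output_len_max is None:
        return int(2 * output_len_mean)
    return output_len_max


def sample_output_len(rng: random.Random,
                      output_len_mean: float,
                      output_len_spread: float,
                      output_len_min: int = 1,
                      output_len_max: Optional[int] = None) -> int:
    """Hypothetical helper: draw one output length (chars), clamped to [min, max]."""
    cap = resolve_output_len_max(output_len_mean, output_len_max)
    # Coefficient of variation = std / mean, so std = spread * mean.
    std = output_len_spread * output_len_mean
    # Assumption: a normal draw; the real server may use another distribution.
    raw = rng.gauss(output_len_mean, std)
    return max(output_len_min, min(cap, round(raw)))


if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_output_len(rng, 100, 0.5, output_len_min=10) for _ in range(5)])
```

Under these assumptions, a mean of 100 chars with no explicit maximum caps every sample at 200 chars, whereas the previous documented default of 8 * mean would have allowed up to 800.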
