Conversation

dagrayvid
Collaborator

This PR fixes two small bugs.

  • Double counting of prompt tokens in the calculation of total tokens_per_second.
  • The warmup percent was not applied when using max duration (it worked correctly for max requests). Traced this to a `-` that should have been a `+` in benchmark/aggregator.py.
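A minimal sketch of the two fixes described above. The function names and signatures here are illustrative only, not the actual guidellm code; they just show the shape of each bug and its correction.

```python
# Hypothetical sketch of the two bug fixes; names are illustrative,
# not the real functions in benchmark/aggregator.py.

def total_tokens_per_second(prompt_tokens: int, output_tokens: int,
                            duration_s: float) -> float:
    # Bug: prompt tokens were effectively counted twice, e.g.
    #   (prompt_tokens + prompt_tokens + output_tokens) / duration_s
    # Fix: count each token exactly once.
    return (prompt_tokens + output_tokens) / duration_s


def warmup_cutoff_time(start_time: float, max_duration_s: float,
                       warmup_percent: float) -> float:
    # Bug: the cutoff was computed with a `-` instead of a `+`,
    # placing the warmup window before the benchmark even started,
    # so no request was ever treated as warmup.
    # Fix: the cutoff lies warmup_percent of the way *into* the run.
    return start_time + warmup_percent * max_duration_s
```

For example, with a 60 s max duration and 10% warmup starting at t=1000, requests before t=1006 are discarded as warmup.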

@dagrayvid dagrayvid requested a review from markurtz May 27, 2025 15:19
@markurtz markurtz merged commit 66017f5 into vllm-project:main May 29, 2025
9 of 10 checks passed
