[GuideLLM Refactor] benchmark package updates and rewrites #356
Signed-off-by: Mark Kurtz <[email protected]>
Pull Request Overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
src/guidellm/benchmark/progress.py:1
- Return type annotation includes `SynchronousProfile` and `ThroughputProfile`, but the method returns strategy instances, not profile instances. These should be `SynchronousStrategy` and `ThroughputStrategy`.
Summary
Introduces a comprehensive refactor of the benchmarking system, replacing the previous architecture with a more flexible and extensible design. The changes include new aggregation protocols, enhanced benchmark objects with comprehensive metrics, and improved progress tracking capabilities. This refactor enables better separation of concerns, more granular metric collection, and improved real-time monitoring of benchmark execution.
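As a rough illustration of the protocol-based aggregation this summary refers to, the sketch below shows how structural interfaces can make metric collectors composable. The names `Aggregator` and `CompilableAggregator` come from this PR's description, but the method signatures, the `RequestCounter` and `TokenThroughput` classes, and the `run_aggregators` helper are assumptions made up for the example, not the actual GuideLLM API.

```python
from typing import Any, Protocol, runtime_checkable

State = dict[str, Any]


@runtime_checkable
class Aggregator(Protocol):
    """Hypothetical interface: fold one completed request into shared state."""

    def update(self, state: State, latency_s: float, output_tokens: int) -> None: ...


@runtime_checkable
class CompilableAggregator(Aggregator, Protocol):
    """Hypothetical interface: an aggregator that can also emit final results."""

    def compile(self, state: State) -> dict[str, Any]: ...


class RequestCounter:
    """Satisfies Aggregator only: it updates state but compiles nothing."""

    def update(self, state: State, latency_s: float, output_tokens: int) -> None:
        state["requests"] = state.get("requests", 0) + 1


class TokenThroughput:
    """Satisfies CompilableAggregator: updates state and compiles a metric."""

    def update(self, state: State, latency_s: float, output_tokens: int) -> None:
        state["tokens"] = state.get("tokens", 0) + output_tokens
        state["busy_s"] = state.get("busy_s", 0.0) + latency_s

    def compile(self, state: State) -> dict[str, Any]:
        busy = state.get("busy_s", 0.0)
        rate = state.get("tokens", 0) / busy if busy else 0.0
        return {"tokens_per_second": rate}


def run_aggregators(aggregators, samples):
    """Feed (latency_s, output_tokens) samples to every aggregator, then compile."""
    state: State = {}
    for latency_s, output_tokens in samples:
        for agg in aggregators:
            agg.update(state, latency_s, output_tokens)
    compiled: dict[str, Any] = {}
    for agg in aggregators:
        # Only aggregators that also implement compile() contribute results.
        if isinstance(agg, CompilableAggregator):
            compiled.update(agg.compile(state))
    return state, compiled


state, compiled = run_aggregators(
    [RequestCounter(), TokenThroughput()],
    [(0.5, 10), (1.5, 30)],
)
```

Because the interfaces are structural, neither example class inherits from a base class. Any object with matching methods can be added to the list, which is one way to get the composability this kind of design aims for.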
Details
- `BenchmarkAggregator` with protocol-based `Aggregator` and `CompilableAggregator` interfaces, enabling composable metric collection and compilation
- `objects.py` with comprehensive metrics including timing distributions, token statistics, and performance measurements
- `Benchmarker` class to coordinate request scheduling, data aggregation, and result compilation with thread-safe singleton pattern
- `benchmark_generative_text` function with improved configuration options, validation, and error handling

Key Components Added:
- `SchedulerStatsAggregator`: Collects scheduler timing and performance metrics
- `GenerativeRequestsAggregator`: Compiles complete generative benchmark results with warmup/cooldown filtering
- `GenerativeStatsProgressAggregator`: Tracks real-time generation metrics during execution
- `BenchmarkerProgressGroup`: Composite progress handler for multiple tracking instances
- `GenerativeBenchmarkerOutput`: Pluggable output system with multiple format support

Breaking Changes:
- `BenchmarkAggregator` and `GenerativeBenchmarkAggregator` classes
- `Benchmarker.run()` method signature and return type

Test Plan
Related Issues
Use of AI
## WRITTEN BY AI ##