Commit 18e95be
[GuideLLM Refactor] benchmark package updates and rewrites (#356)
## **Summary**
Introduces a comprehensive refactor of the benchmarking system,
replacing the previous architecture with a more flexible and extensible
design. The changes include new aggregation protocols, enhanced
benchmark objects with comprehensive metrics, and improved progress
tracking capabilities. This refactor enables better separation of
concerns, more granular metric collection, and improved real-time
monitoring of benchmark execution.
## **Details**
- **New Aggregation System**: Replaced `BenchmarkAggregator` with
protocol-based `Aggregator` and `CompilableAggregator` interfaces,
enabling composable metric collection and compilation
- **Enhanced Benchmark Objects**: Refactored benchmark data models
in `objects.py` with comprehensive metrics including timing
distributions, token statistics, and performance measurements
- **Improved Benchmarker**: Redesigned `Benchmarker` class to coordinate
request scheduling, data aggregation, and result compilation using a
thread-safe singleton pattern
- **Flexible Output System**: Added pluggable output formatters
supporting console, CSV, HTML, and JSON formats with configurable file
paths
- **Advanced Progress Tracking**: Implemented composite progress
handlers with real-time console display showing detailed metrics, timing
information, and progress bars
- **Profile System Enhancements**: Enhanced profile configurations with
better strategy generation, constraint management, and completion
tracking
- **Comprehensive Entrypoints**: Redesigned `benchmark_generative_text`
function with improved configuration options, validation, and error
handling
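As a rough illustration of the protocol-based aggregation idea, the sketch below shows how `typing.Protocol` can separate "collect" from "collect and compile". Only the names `Aggregator` and `CompilableAggregator` come from this PR; the method names (`update`, `compile`), their signatures, the example aggregators, and `run_aggregators` are assumptions made for illustration and are not the actual GuideLLM API.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Aggregator(Protocol):
    """Hypothetical shape: fold one sample into a mutable state dict."""

    def update(self, state: dict, sample: dict) -> None: ...


@runtime_checkable
class CompilableAggregator(Aggregator, Protocol):
    """Hypothetical shape: an aggregator that can also emit final results."""

    def compile(self, state: dict) -> dict: ...


class RequestCounter:
    """Illustrative aggregator that only collects (no compile step)."""

    def update(self, state: dict, sample: dict) -> None:
        state["requests"] = state.get("requests", 0) + 1


class LatencyStats:
    """Illustrative aggregator that collects and compiles a summary."""

    def update(self, state: dict, sample: dict) -> None:
        state.setdefault("latencies", []).append(sample["latency"])

    def compile(self, state: dict) -> dict:
        latencies = state.get("latencies", [])
        mean = sum(latencies) / len(latencies) if latencies else 0.0
        return {"mean_latency": mean}


def run_aggregators(
    aggregators: dict, samples: list
) -> tuple:
    """Feed every sample to every aggregator, then compile where supported."""
    states: dict = {name: {} for name in aggregators}
    for sample in samples:
        for name, agg in aggregators.items():
            agg.update(states[name], sample)

    results: dict = {}
    for name, agg in aggregators.items():
        # Structural check: only aggregators that define compile() qualify.
        if isinstance(agg, CompilableAggregator):
            results.update(agg.compile(states[name]))
    return states, results
```

Because the protocols are structural, the example classes do not inherit from anything; any object with matching methods can be composed in, which is one plausible reading of "composable metric collection and compilation".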
### Key Components Added:
- `SchedulerStatsAggregator`: Collects scheduler timing and performance
metrics
- `GenerativeRequestsAggregator`: Compiles complete generative benchmark
results with warmup/cooldown filtering
- `GenerativeStatsProgressAggregator`: Tracks real-time generation
metrics during execution
- `BenchmarkerProgressGroup`: Composite progress handler for multiple
tracking instances
- `GenerativeBenchmarkerOutput`: Pluggable output system with multiple
format support
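To make "warmup/cooldown filtering" concrete, here is a minimal sketch of one way such a filter could work: requests that start inside a warmup window or finish inside a cooldown window are excluded from the compiled results. The `RequestTiming` type, the `filter_warmup_cooldown` function, and the choice of time-based windows measured in seconds are all assumptions for illustration; the PR does not specify how `GenerativeRequestsAggregator` implements this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTiming:
    """Hypothetical per-request record with start/end timestamps in seconds."""

    request_id: str
    start: float
    end: float


def filter_warmup_cooldown(
    requests: list,
    warmup: float = 0.0,
    cooldown: float = 0.0,
) -> list:
    """Keep requests that start after the warmup window and end before the
    cooldown window. Windows are measured from the earliest start and the
    latest end observed in the run (an assumption of this sketch)."""
    if not requests:
        return []

    run_start = min(r.start for r in requests)
    run_end = max(r.end for r in requests)
    earliest_allowed_start = run_start + warmup
    latest_allowed_end = run_end - cooldown

    return [
        r
        for r in requests
        if r.start >= earliest_allowed_start and r.end <= latest_allowed_end
    ]
```

With both windows left at `0.0`, every request is kept, so the filter is opt-in in this sketch.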
### Breaking Changes:
- Removed `BenchmarkAggregator` and `GenerativeBenchmarkAggregator` classes
- Restructured benchmark object hierarchy and field names
- Modified `Benchmarker.run()` method signature and return type
- Updated progress tracking interfaces and event handling
## **Test Plan**
- Tests to be added in a subsequent PR
## **Related Issues**
- Part of the larger scheduler refactor initiative
---
- [x] "I certify that all code in this PR is my own, except as noted
below."
## **Use of AI**
- [x] Includes AI-assisted code completion
- [x] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a
docstring that includes `## WRITTEN BY AI ##`)

File tree: `src/guidellm/benchmark` (10 files changed, +4077 -3548 lines changed)