Commit ad14178

[GuideLLM Refactor] scheduler package updates, rewrites, and tests expansion (#354)
## **Summary**

Introduces a comprehensive constraints system and enhanced timing control for the scheduler refactor. The implementation moves from hardcoded execution limits to a flexible, composable constraint system that enables sophisticated benchmark stopping criteria. Additionally, request timing calculations are moved from a precalculated to a per-request basis, enabling dynamic rate adjustments and better distributed coordination.

## **Details**

- **Added constraints system** (`constraints.py`): Implements Protocol-based constraint architecture with support for request limits, duration limits, error thresholds, and sliding window error rates
  - `MaxNumberConstraint`: Limits execution based on request count
  - `MaxDurationConstraint`: Limits execution based on time duration
  - `MaxErrorsConstraint`: Limits execution based on absolute error count
  - `MaxErrorRateConstraint`: Limits execution based on sliding window error rate
  - `MaxGlobalErrorRateConstraint`: Limits execution based on global error rate
  - `ConstraintsInitializerFactory`: Registry system for constraint creation and serialization
- **Refactored core objects** (`objects.py`): Replaced `result.py` and expanded capabilities
  - Made scheduler package fully generic, decoupling from backend-specific types
  - Added `BackendInterface` protocol for type-safe backend integration
  - Enhanced `ScheduledRequestInfo` with comprehensive timing and status tracking
  - Added `SchedulerState` for distributed state coordination
  - Introduced `SchedulerUpdateAction` for constraint-based control signals
- **Enhanced scheduling strategies** (`strategy.py`): Introduced request timing abstractions
  - Added `ScheduledRequestTimings` base class for timing implementations
  - `LastCompletionRequestTimings`: For synchronous and concurrent strategies
  - `NoDelayRequestTimings`: For maximum throughput strategies
  - `ConstantRateRequestTimings`: For fixed-rate scheduling
  - `PoissonRateRequestTimings`: For stochastic request patterns
  - Strategies now create per-worker timing instances instead of precalculated schedules
- **Added environment abstractions** (`environment.py`): Coordination layer for distributed execution
  - `Environment` protocol for distributed synchronization
  - `NonDistributedEnvironment` implementation for single-node execution
- **Worker process management** (`worker.py`, `worker_group.py`): Distributed request processing infrastructure
  - Individual worker process management with lifecycle coordination
  - Multi-process orchestration with state synchronization
  - Constraint evaluation and graceful shutdown coordination

## **Test Plan**

- Full unit tests and some integration tests added and passing

## **Related Issues**

- Part of scheduler refactor initiative to support distributed benchmarking

---

- [x] "I certify that all code in this PR is my own, except as noted below."

## **Use of AI**

- [x] Includes AI-assisted code completion
- [x] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)
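To make the idea of a composable, Protocol-based constraint system concrete, here is a minimal standalone sketch. It does not use or reproduce the classes in `constraints.py`: `RunState`, `StopCheck`, and the factory functions below are hypothetical names chosen for illustration, and the real constraints likely return richer control signals than a boolean.

```python
from dataclasses import dataclass, field
from typing import List, Protocol, Sequence


@dataclass
class RunState:
    """Hypothetical run state for this sketch (not the real SchedulerState)."""

    processed: int = 0
    errors: int = 0
    elapsed: float = 0.0
    outcomes: List[bool] = field(default_factory=list)

    def record(self, ok: bool, elapsed: float) -> None:
        """Record one finished request and the elapsed run time in seconds."""
        self.processed += 1
        if not ok:
            self.errors += 1
        self.elapsed = elapsed
        self.outcomes.append(ok)


class StopCheck(Protocol):
    """Any callable that inspects the state and returns True to stop."""

    def __call__(self, state: RunState) -> bool: ...


def max_number(limit: int) -> StopCheck:
    # Stop once `limit` requests have been processed.
    return lambda state: state.processed >= limit


def max_duration(seconds: float) -> StopCheck:
    # Stop once the run has lasted at least `seconds`.
    return lambda state: state.elapsed >= seconds


def max_errors(limit: int) -> StopCheck:
    # Stop once the absolute error count reaches `limit`.
    return lambda state: state.errors >= limit


def window_error_rate(rate: float, window: int) -> StopCheck:
    # Stop when the error rate over the last `window` requests exceeds `rate`.
    def check(state: RunState) -> bool:
        recent = state.outcomes[-window:]
        if len(recent) < window:
            return False  # not enough samples to fill the window yet
        return recent.count(False) / window > rate

    return check


def should_stop(state: RunState, checks: Sequence[StopCheck]) -> bool:
    """Constraints compose by OR: the first one that triggers ends the run."""
    return any(check(state) for check in checks)
```

Because each check is just a callable over shared state, stopping criteria can be mixed freely, for example `[max_number(1000), max_duration(60.0), window_error_rate(0.2, 50)]`, without the scheduler loop knowing which limits are active.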
2 parents 4c70b5b + a7ae737 commit ad14178

20 files changed: +9298 -1496 lines changed

src/guidellm/benchmark/scenario.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 from guidellm.backend.backend import BackendType
 from guidellm.benchmark.profile import ProfileType
 from guidellm.objects.pydantic import StandardBaseModel
-from guidellm.scheduler.strategy import StrategyType
+from guidellm.scheduler.strategies import StrategyType
 
 
 __ALL__ = ["Scenario", "GenerativeTextScenario", "get_builtin_scenarios"]
 
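The hunk above changes the import path from `guidellm.scheduler.strategy` to `guidellm.scheduler.strategies`. Downstream code that has to run against both layouts during a migration could resolve the name with a fallback. The helper below is a generic sketch, not part of this commit; `import_first` is a hypothetical name.

```python
import importlib
from typing import Any, Sequence


def import_first(candidates: Sequence[str], attr: str) -> Any:
    """Return `attr` from the first module in `candidates` that provides it."""
    last_error = None
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            last_error = exc  # remember the failure and try the next candidate
    raise ImportError(
        f"could not import {attr!r} from any of {list(candidates)}"
    ) from last_error


# Possible usage during a migration (requires guidellm to be installed):
# StrategyType = import_first(
#     ["guidellm.scheduler.strategies", "guidellm.scheduler.strategy"],
#     "StrategyType",
# )
```

Listing the new module path first means the fallback is only exercised on older installs.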

src/guidellm/scheduler/__init__.py

Lines changed: 69 additions & 26 deletions
@@ -1,47 +1,90 @@
-from .result import (
-    SchedulerRequestInfo,
-    SchedulerRequestResult,
-    SchedulerResult,
-    SchedulerRunInfo,
+from .constraints import (
+    Constraint,
+    ConstraintInitializer,
+    ConstraintsInitializerFactory,
+    MaxDurationConstraint,
+    MaxErrorRateConstraint,
+    MaxErrorsConstraint,
+    MaxGlobalErrorRateConstraint,
+    MaxNumberConstraint,
+    PydanticConstraintInitializer,
+    SerializableConstraintInitializer,
+    UnserializableConstraintInitializer,
+)
+from .environments import Environment, NonDistributedEnvironment
+from .objects import (
+    BackendInterface,
+    BackendT,
+    MeasuredRequestTimings,
+    MultiTurnRequestT,
+    RequestSchedulerTimings,
+    RequestT,
+    ResponseT,
+    ScheduledRequestInfo,
+    SchedulerMessagingPydanticRegistry,
+    SchedulerState,
+    SchedulerUpdateAction,
+    SchedulerUpdateActionProgress,
 )
 from .scheduler import Scheduler
-from .strategy import (
+from .strategies import (
     AsyncConstantStrategy,
     AsyncPoissonStrategy,
     ConcurrentStrategy,
+    ConstantRateRequestTimings,
+    LastCompletionRequestTimings,
+    NoDelayRequestTimings,
+    PoissonRateRequestTimings,
+    ScheduledRequestTimings,
     SchedulingStrategy,
+    StrategyT,
     StrategyType,
     SynchronousStrategy,
     ThroughputStrategy,
-    strategy_display_str,
-)
-from .worker import (
-    GenerativeRequestsWorker,
-    GenerativeRequestsWorkerDescription,
-    RequestsWorker,
-    ResolveStatus,
-    WorkerDescription,
-    WorkerProcessResult,
 )
+from .worker import WorkerProcess
+from .worker_group import WorkerProcessGroup
 
 __all__ = [
     "AsyncConstantStrategy",
     "AsyncPoissonStrategy",
+    "BackendInterface",
+    "BackendT",
     "ConcurrentStrategy",
-    "GenerativeRequestsWorker",
-    "GenerativeRequestsWorkerDescription",
-    "RequestsWorker",
-    "ResolveStatus",
+    "ConstantRateRequestTimings",
+    "Constraint",
+    "ConstraintInitializer",
+    "ConstraintsInitializerFactory",
+    "Environment",
+    "LastCompletionRequestTimings",
+    "MaxDurationConstraint",
+    "MaxErrorRateConstraint",
+    "MaxErrorsConstraint",
+    "MaxGlobalErrorRateConstraint",
+    "MaxNumberConstraint",
+    "MeasuredRequestTimings",
+    "MultiTurnRequestT",
+    "NoDelayRequestTimings",
+    "NonDistributedEnvironment",
+    "PoissonRateRequestTimings",
+    "PydanticConstraintInitializer",
+    "RequestSchedulerTimings",
+    "RequestT",
+    "ResponseT",
+    "ScheduledRequestInfo",
+    "ScheduledRequestTimings",
     "Scheduler",
-    "SchedulerRequestInfo",
-    "SchedulerRequestResult",
-    "SchedulerResult",
-    "SchedulerRunInfo",
+    "SchedulerMessagingPydanticRegistry",
+    "SchedulerState",
+    "SchedulerUpdateAction",
+    "SchedulerUpdateActionProgress",
     "SchedulingStrategy",
+    "SerializableConstraintInitializer",
+    "StrategyT",
     "StrategyType",
     "SynchronousStrategy",
     "ThroughputStrategy",
-    "WorkerDescription",
-    "WorkerProcessResult",
-    "strategy_display_str",
+    "UnserializableConstraintInitializer",
+    "WorkerProcess",
+    "WorkerProcessGroup",
 ]
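The package now exports `ConstantRateRequestTimings` and `PoissonRateRequestTimings`, which the commit message describes as fixed-rate and stochastic timing computed per request rather than from a precalculated schedule. The sketch below illustrates that general idea only. The class names, the `next_offset` method, and the `seed` parameter are assumptions made for this example and are not taken from the guidellm source.

```python
import random


class ConstantRateOffsets:
    """Hypothetical fixed-rate timing: request i starts at i / rate seconds."""

    def __init__(self, rate: float) -> None:
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.rate = rate
        self._count = 0

    def next_offset(self) -> float:
        # Computed on demand, so there is no schedule to build up front.
        offset = self._count / self.rate
        self._count += 1
        return offset


class PoissonRateOffsets:
    """Hypothetical Poisson timing: gaps are exponential with mean 1 / rate."""

    def __init__(self, rate: float, seed: int = 42) -> None:
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.rate = rate
        self._rng = random.Random(seed)  # private RNG keeps runs reproducible
        self._offset = 0.0

    def next_offset(self) -> float:
        # Accumulate exponentially distributed inter-arrival gaps.
        self._offset += self._rng.expovariate(self.rate)
        return self._offset
```

Each instance keeps only a counter or a running offset, which is one way per-worker timing objects could avoid sharing a precomputed schedule; giving each worker its own seed would be one option for decorrelating their arrival patterns.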
