Open
Labels
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
Description
What would you like to be added:
Enhance the prefill/decode (P/D) decision logic to consider system state (queue depths, worker load, predicted latencies, etc.), not just non-cached token counts. The scheduler should be able to incorporate dynamic metrics when deciding whether to use disaggregated prefill/decode for a request.
Why is this needed:
Current token-threshold-based decisions can't adapt to provisioning or workload changes. Example: if workload shifts from 10k→20k prompt tokens and prefill workers become overloaded, the scheduler should automatically reduce P/D usage to avoid bottlenecks. System-aware decisions would enable dynamic adaptation without manual threshold tuning.
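To make the request concrete, here is a rough sketch of what a system-aware decision could look like. It is illustrative only and not the scheduler's actual code or API: the `SystemState` fields, the function names, and the latency model (the prefill pool treated as a single pooled queue, plus a fixed KV-transfer cost) are all assumptions made up for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SystemState:
    """Hypothetical snapshot of metrics the scheduler might observe."""
    prefill_workers: int                  # provisioned prefill workers
    prefill_tokens_per_sec: float         # assumed per-worker prefill throughput
    queued_prefill_tokens: int            # tokens already waiting on prefill workers
    kv_transfer_sec: float                # assumed fixed cost of moving KV cache to decode
    local_prefill_tokens_per_sec: float   # assumed throughput if the decode worker prefills


def predicted_pd_latency(non_cached_tokens: int, state: SystemState) -> float:
    """Crude estimate: drain the pooled prefill backlog, then transfer the KV cache."""
    backlog = state.queued_prefill_tokens + non_cached_tokens
    capacity = state.prefill_workers * state.prefill_tokens_per_sec
    return backlog / capacity + state.kv_transfer_sec


def predicted_local_latency(non_cached_tokens: int, state: SystemState) -> float:
    """Crude estimate: the decode worker runs prefill itself, with no transfer."""
    return non_cached_tokens / state.local_prefill_tokens_per_sec


def use_disaggregated_prefill(non_cached_tokens: int, state: SystemState) -> bool:
    """Pick P/D only when it is predicted to be faster than local prefill."""
    if state.prefill_workers <= 0 or state.prefill_tokens_per_sec <= 0:
        return False  # no prefill capacity at all, so P/D cannot help
    return (predicted_pd_latency(non_cached_tokens, state)
            < predicted_local_latency(non_cached_tokens, state))


# Made-up numbers for illustration, not measurements.
idle = SystemState(
    prefill_workers=4,
    prefill_tokens_per_sec=10_000.0,
    queued_prefill_tokens=0,
    kv_transfer_sec=0.05,
    local_prefill_tokens_per_sec=8_000.0,
)
overloaded = replace(idle, queued_prefill_tokens=400_000)

print(use_disaggregated_prefill(10_000, idle))        # long prompt, idle prefill pool
print(use_disaggregated_prefill(20_000, overloaded))  # longer prompt, backed-up prefill pool
print(use_disaggregated_prefill(100, idle))           # short prompt, transfer cost dominates
```

With these assumed numbers, the long prompt goes to P/D while the prefill pool is idle, but the same kind of request falls back to local prefill once the prefill queue backs up. The short prompt stays local because the transfer cost outweighs the gain, so a token threshold emerges from the model instead of being tuned by hand.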
Projects
Status
Backlog