Add DisaggDeployment CRD for coordinated prefill/decode workload management #766

@hasB4K

What would you like to be added:

A new DisaggDeployment CRD that manages two LeaderWorkerSets (prefill + decode) as a single logical unit, with coordinated rolling updates and
service orchestration.
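
To make the request concrete, here is a minimal sketch of what the spec types could look like. It is not a proposed API: the field names (`Replicas`, `Image`) and the `Validate` helper are hypothetical, and a real CRD would presumably embed or reference a LeaderWorkerSet spec on each side. Plain structs are used so the example needs only the Go standard library.

```go
package main

import "fmt"

// SideSpec is a hypothetical per-side configuration. A real CRD would
// presumably carry a LeaderWorkerSet spec here; these fields are assumptions.
type SideSpec struct {
	Replicas int    // number of groups on this side (assumed field)
	Image    string // container image for this side (assumed field)
}

// DisaggDeploymentSpec groups the prefill and decode sides into one unit.
type DisaggDeploymentSpec struct {
	Prefill SideSpec
	Decode  SideSpec
}

// Validate checks that both sides are configured, prefill first.
func (s DisaggDeploymentSpec) Validate() error {
	sides := []struct {
		name string
		spec SideSpec
	}{
		{"prefill", s.Prefill},
		{"decode", s.Decode},
	}
	for _, side := range sides {
		if side.spec.Replicas < 1 {
			return fmt.Errorf("%s: replicas must be at least 1", side.name)
		}
		if side.spec.Image == "" {
			return fmt.Errorf("%s: image must be set", side.name)
		}
	}
	return nil
}

func main() {
	spec := DisaggDeploymentSpec{
		Prefill: SideSpec{Replicas: 2, Image: "example.com/llm-server:v1"},
		Decode:  SideSpec{Replicas: 4, Image: "example.com/llm-server:v1"},
	}
	fmt.Println("valid:", spec.Validate() == nil)
}
```

Keeping both sides in one spec is what would let a controller reject a half-configured deployment up front, instead of discovering the mismatch after one LeaderWorkerSet is already running.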

Why is this needed:

Disaggregated LLM inference (prefill/decode separation) requires coordinating two LeaderWorkerSets. Currently, users must manage both
resources by hand and coordinate rolling updates across them, which is complex and error-prone.

Completion requirements:

  • DisaggDeployment CRD with prefill and decode side configurations
  • Two-dimensional coordinated rolling updates across both sides
  • Automatic Service creation when both sides are ready
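
The last two requirements can be sketched as follows. This is one possible reading, not the design: it assumes "coordinated" means both sides advance through the rollout in proportional lockstep, and that "ready" means every replica on a side reports ready. The names `PlanLockstep`, `Step`, `SideStatus`, and `ShouldCreateService` are hypothetical.

```go
package main

import "fmt"

// Step records how many replicas on each side run the new revision
// once the step has completed.
type Step struct {
	PrefillUpdated int
	DecodeUpdated  int
}

// ceilDiv returns a/b rounded up, for non-negative a and positive b.
func ceilDiv(a, b int) int { return (a + b - 1) / b }

// PlanLockstep splits a rollout into n steps. After step i (1-based), each
// side has ceil(replicas*i/n) updated replicas, so neither side runs far
// ahead of the other. Values of n below 1 are treated as 1.
func PlanLockstep(prefill, decode, n int) []Step {
	if n < 1 {
		n = 1
	}
	steps := make([]Step, 0, n)
	for i := 1; i <= n; i++ {
		steps = append(steps, Step{
			PrefillUpdated: ceilDiv(prefill*i, n),
			DecodeUpdated:  ceilDiv(decode*i, n),
		})
	}
	return steps
}

// SideStatus is a hypothetical summary of one side's observed state.
type SideStatus struct {
	Replicas      int
	ReadyReplicas int
}

// sideReady reports whether every replica on a non-empty side is ready.
func sideReady(s SideStatus) bool {
	return s.Replicas > 0 && s.ReadyReplicas >= s.Replicas
}

// ShouldCreateService gates Service creation on both sides being ready.
func ShouldCreateService(prefill, decode SideStatus) bool {
	return sideReady(prefill) && sideReady(decode)
}

func main() {
	for i, s := range PlanLockstep(2, 4, 2) {
		fmt.Printf("step %d: prefill=%d decode=%d\n", i+1, s.PrefillUpdated, s.DecodeUpdated)
	}
	ready := ShouldCreateService(
		SideStatus{Replicas: 2, ReadyReplicas: 2},
		SideStatus{Replicas: 4, ReadyReplicas: 3},
	)
	fmt.Println("create service:", ready)
}
```

A real controller would also need to handle surge and unavailability budgets and partial failures on one side. Those are left out here because the issue does not specify them.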

This enhancement requires the following artifacts:

  • KEP
  • Implementation
  • Docs update

Labels: kind/feature (categorizes issue or PR as related to a new feature)
