feat: add pdrole package for P/D role discovery and detection#735

Open
ev-shindin wants to merge 1 commit into llm-d:main from ev-shindin:pd-role-discovery

Conversation

@ev-shindin
Collaborator

Summary

  • Add internal/utils/pdrole package that discovers P/D disaggregation configuration from the EPP's EndpointPickerConfig and detects each deployment's P/D role from pod template labels
  • Discovery is per-pool (accepts *pool.EndpointPool), enabling correct behavior when multiple EPPs exist with different P/D label settings
  • Returns PDDiscoveryResult with Disaggregated flag so callers can distinguish "no P/D plugins" (all pods serve both roles) from "P/D plugins found, use label config"

Design decisions

Label-only detection (no deployment name fallback).
The EPP's filter plugins (prefill-filter, decode-filter, by-label from llm-d-inference-scheduler) route traffic based solely on pod labels. A deployment named llama-prefill without an llm-d.ai/role label would still receive decode traffic from the EPP, because decode-filter has allowsNoLabel=true. Name-based guessing would misclassify it as a prefill deployment.

BothValues from intersection, not hardcoded.
For by-label custom plugins, values appearing in both prefill and decode profiles' validValues are classified as BothValues. This ensures GetDeploymentPDRole returns RoleBoth (not RolePrefill or RoleDecode) for pods that pass both filters.
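The intersection rule can be illustrated with a minimal Go sketch. This is not the package's code: the helper name `intersect` and the sample value "hybrid" are hypothetical, chosen only to show how a value listed by both profiles ends up in BothValues.

```go
package main

import (
	"fmt"
	"sort"
)

// intersect returns the values present in both slices, sorted and
// de-duplicated. Hypothetical helper, not taken from the pdrole package.
func intersect(a, b []string) []string {
	seen := make(map[string]bool, len(a))
	for _, v := range a {
		seen[v] = true
	}
	var out []string
	for _, v := range b {
		if seen[v] {
			out = append(out, v)
			seen[v] = false // avoid emitting the same value twice
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Assumed validValues for the by-label plugins of the prefill and
	// decode profiles; "hybrid" is an invented custom label value.
	prefillValues := []string{"prefill", "hybrid"}
	decodeValues := []string{"decode", "hybrid"}

	fmt.Println(intersect(prefillValues, decodeValues)) // [hybrid]
}
```

Deriving the shared values this way means a custom by-label setup with its own label vocabulary does not need a hardcoded "both" string.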

Aligned with EPP's actual filter semantics:

| EPP filter | Label | Valid values | allowsNoLabel |
|---|---|---|---|
| prefill-filter | llm-d.ai/role | "prefill" | false |
| decode-filter | llm-d.ai/role | "decode", "both" | true |
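The filter semantics in the table above can be sketched in Go as follows. The `labelFilter` type and its `accepts` method are hypothetical stand-ins written for this illustration, not the EPP's or the pdrole package's actual types. The sketch also assumes that a label which is present always has to match one of the valid values, regardless of allowsNoLabel.

```go
package main

import "fmt"

// labelFilter models the three columns of the table above.
// Hypothetical type for illustration only.
type labelFilter struct {
	label         string
	validValues   []string
	allowsNoLabel bool
}

// accepts reports whether a pod with the given labels passes the filter.
// A pod without the label passes only if allowsNoLabel is set.
func (f labelFilter) accepts(podLabels map[string]string) bool {
	v, ok := podLabels[f.label]
	if !ok {
		return f.allowsNoLabel
	}
	for _, valid := range f.validValues {
		if v == valid {
			return true
		}
	}
	return false
}

func main() {
	prefill := labelFilter{"llm-d.ai/role", []string{"prefill"}, false}
	decode := labelFilter{"llm-d.ai/role", []string{"decode", "both"}, true}

	// A pod from a deployment named llama-prefill that carries no role label.
	unlabeled := map[string]string{"app": "llama-prefill"}

	fmt.Println(prefill.accepts(unlabeled), decode.accepts(unlabeled)) // false true
}
```

The unlabeled pod is rejected by the prefill filter and accepted by the decode filter, which is the case that name-based guessing would get wrong.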

Package structure

| File | Purpose |
|---|---|
| types.go | PDRole, PDRoleLabelConfig, PDDiscoveryResult, constants |
| detect.go | GetDeploymentPDRole: label-based role detection |
| discover.go | DiscoverPDRoleLabelConfig: EPP config discovery chain |
| *_test.go | 43 Ginkgo specs |

Discovery chain

  1. Pool -> EPP service name/namespace
  2. Service selector -> EPP deployment
  3. Deployment volumes -> mounted ConfigMap
  4. ConfigMap data -> parse EndpointPickerConfig (YAML/JSON)
  5. Plugins -> detect prefill-filter/decode-filter (well-known label) or by-label (custom label from parameters)
  6. Any step fails -> Disaggregated=false, default config

Intended caller pattern (not wired yet)

pool, _ := datastore.PoolGetFromLabels(deploy.Spec.Template.Labels)
result := pdrole.DiscoverPDRoleLabelConfig(ctx, k8sClient, pool)
if !result.Disaggregated {
    // no P/D plugins — all deployments serve both roles
    role = pdrole.RoleBoth
} else {
    role = pdrole.GetDeploymentPDRole(deploy, result.LabelConfig)
}

Test plan

  • go build ./internal/utils/pdrole/... — clean
  • go vet ./internal/utils/pdrole/... — clean
  • go test ./internal/utils/pdrole/... -v -count=1 — 43/43 specs pass
  • No callers yet (infrastructure-only PR), wiring comes in follow-up

Add internal/utils/pdrole package that discovers Prefill/Decode role
configuration from EPP's EndpointPickerConfig and detects each
deployment's P/D role from pod template labels.

Aligned with llm-d-inference-scheduler's actual behavior:
- Label-only detection (no deployment name fallback). EPP's filter
  plugins (prefill-filter, decode-filter, by-label) route traffic
  based solely on pod labels, never deployment names.
- prefill-filter: accepts only "prefill" labeled pods (allowsNoLabel=false)
- decode-filter: accepts "decode"/"both" labeled + unlabeled pods
  (allowsNoLabel=true)
- by-label: values appearing in both prefill and decode profiles'
  validValues are classified as BothValues (intersection logic)

Discovery accepts a per-pool EndpointPool (not a global EPP name),
enabling correct behavior when multiple EPPs have different P/D
label settings. Returns PDDiscoveryResult with Disaggregated flag
so callers can distinguish "no P/D plugins" (treat all as RoleBoth)
from "P/D plugins found, use label config for detection".

ev-shindin self-assigned this on Feb 15, 2026
ev-shindin linked an issue on Feb 15, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Support P/D (Prefill/Decode) Disaggregation