pallavijaini0525

In prefill/decode (PD) disaggregation deployments, the prefill-header-handler plugin used the same targetPort for both the routing sidecar and the prefill nodes. When prefill runs on a different port, this produced incorrect x-prefiller-host-port headers, preventing proper communication between decode and prefill workers.

Added an optional prefillTargetPort parameter to the prefill-header-handler plugin configuration. When specified, this parameter overrides the generic targetPort when constructing the x-prefiller-host-port header.
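
To illustrate the override described above, here is a minimal sketch of the port selection, not the plugin's actual code. The `headerPorts` type and `prefillHostPort` function are hypothetical names, and the `host:port` header value format is an assumption inferred from the header name.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// headerPorts is a hypothetical stand-in for the plugin parameters
// discussed in this PR; the real configuration struct may differ.
type headerPorts struct {
	TargetPort        int // generic port, also used for the routing sidecar
	PrefillTargetPort int // optional override; 0 means "not specified"
}

// prefillHostPort builds an assumed host:port value for the
// x-prefiller-host-port header. The prefill-specific port wins when set;
// otherwise the generic target port is used, as before the change.
func prefillHostPort(podIP string, p headerPorts) string {
	port := p.TargetPort
	if p.PrefillTargetPort > 0 {
		port = p.PrefillTargetPort
	}
	return net.JoinHostPort(podIP, strconv.Itoa(port))
}

func main() {
	// No override: falls back to targetPort.
	fmt.Println(prefillHostPort("10.0.0.12", headerPorts{TargetPort: 8000}))
	// Override set: prefill workers listen on a different port.
	fmt.Println(prefillHostPort("10.0.0.12", headerPorts{TargetPort: 8000, PrefillTargetPort: 8200}))
}
```

Treating an unset value as zero keeps existing configurations working unchanged, which matches the parameter being optional.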

@shmuelk
Collaborator

shmuelk commented Sep 25, 2025

/rebase

@elevran elevran moved this to In progress in llm-d-inference-scheduler Oct 19, 2025
@elevran elevran moved this from In progress to In review in llm-d-inference-scheduler Oct 20, 2025
@elevran
Collaborator

elevran commented Oct 20, 2025

@pallavijaini0525 the PR is failing CI (linting) - kindly rebase and fix before we can move it forward.

@elevran elevran moved this from In review to In progress in llm-d-inference-scheduler Oct 20, 2025
Signed-off-by: Pallavi Jaini <[email protected]>
@pallavijaini0525
Author

@elevran -> Lint issues are fixed. Thanks

Projects

Status: In progress

3 participants