@vMaroon vMaroon commented Jan 21, 2026

Summary

active-active multi-replica scheduler support llm-d/llm-d-kv-cache#212

Implemented a pod reconciler controller that manages per-pod ZMQ subscribers for KVEvents processing, along with the supporting logic. Also moved kvevents to the same level as the kvcache library, where it should have been from the start.

Components:

Added integration and unit tests for all new functionality, and updated the documentation and examples.

The current integration discovers pods by inspecting the list of available pods on every Score call. A pod that does not appear in any request for 10 minutes is assumed dead.

A proper integration through the data-layer will be implemented once ready in IGW. Tracker: kubernetes-sigs/gateway-api-inference-extension#2017

@elevran elevran self-requested a review January 21, 2026 15:24
vMaroon and others added 3 commits January 21, 2026 22:43
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Co-authored-by: Etai Lev Ran <elevran@gmail.com>
Signed-off-by: Maroon Ayoub <Maroonay@gmail.com>
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
elevran commented Jan 22, 2026

/lgtm
/approve

@github-actions github-actions bot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 22, 2026
@github-actions github-actions bot merged commit c58cf2d into llm-d:main Jan 22, 2026
8 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in llm-d-inference-scheduler Jan 22, 2026