Description
What would you like to be added:
The "EPP As a Standalone Request Scheduler" proposal mentions using endpoint discovery with a pod selector. This issue tracks the investigation of endpoint discovery with a pod selector.
Why is this needed:
The endpoint picker (EPP), an intelligent scheduler for LLM inference requests, is currently deployed as an ext-proc service tightly coupled to the Kubernetes Gateway API. That setup adds unnecessary operational overhead for non-serving workloads such as Reinforcement Learning (RL) post-training. The "Standalone EPP" proof of concept aims to solve this with a decoupled deployment mode: the EPP and an Envoy proxy are packaged as a single sidecar-based unit that bypasses the Gateway API and instead uses a simple command-line pod selector to discover and manage model server endpoints. This lowers the barrier to adopting the EPP's advanced scheduling logic for critical RL and batch inference jobs.
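To make the pod-selector idea concrete, here is a minimal sketch of how a selector string passed on the command line could be matched against pod labels to produce a list of model server endpoints. It is illustrative only and is not the EPP's implementation: the `Pod` record, the `parse_selector` and `discover_endpoints` helpers, and the port value are all hypothetical, and only the simple `key=value` selector form is handled. A real implementation would list or watch pods through the Kubernetes API rather than take an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical, simplified stand-in for a Kubernetes pod record."""
    name: str
    ip: str
    labels: dict = field(default_factory=dict)
    ready: bool = True


def parse_selector(selector: str) -> dict:
    """Parse a comma-separated 'key=value' selector into a dict of required labels."""
    requirements = {}
    for term in selector.split(","):
        term = term.strip()
        if not term:
            continue  # tolerate empty terms such as a trailing comma
        key, sep, value = term.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"invalid selector term: {term!r}")
        requirements[key.strip()] = value.strip()
    return requirements


def discover_endpoints(pods, selector: str, port: int) -> list:
    """Return 'ip:port' for every ready pod whose labels satisfy the selector."""
    required = parse_selector(selector)
    return [
        f"{pod.ip}:{port}"
        for pod in pods
        if pod.ready and all(pod.labels.get(k) == v for k, v in required.items())
    ]


if __name__ == "__main__":
    # Assumed example data; names, IPs, labels, and the port are made up.
    example_pods = [
        Pod("vllm-0", "10.0.0.1", {"app": "vllm"}),
        Pod("vllm-1", "10.0.0.2", {"app": "vllm"}, ready=False),
        Pod("other-0", "10.0.0.3", {"app": "other"}),
    ]
    print(discover_endpoints(example_pods, "app=vllm", 8000))
```

In this sketch, pods that are not ready are skipped so the scheduler would only see endpoints able to serve requests; whether the standalone EPP should filter on readiness is one of the questions the investigation would need to answer.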