Skip to content

Conversation

LukeAVanDrie
Copy link
Contributor

@LukeAVanDrie LukeAVanDrie commented Oct 10, 2025

What type of PR is this?
/kind feature

What this PR does / why we need it:

This PR introduces an experimental Flow Control layer to enhance request admission control, providing more sophisticated queuing, fairness, and multi-tenancy capabilities beyond the current saturation-based load shedding.

The key changes include:

  1. Feature Flag: All new flow control functionality is gated behind the ENABLE_EXPERIMENTAL_FLOW_CONTROL_LAYER environment variable. When disabled, the existing admission behavior is preserved.

  2. Admission Control Refactoring:

    • An AdmissionController interface has been introduced in pkg/epp/requestcontrol to abstract admission control logic.
    • LegacyAdmissionController: Implements the existing saturation-based shedding.
    • FlowControlAdmissionController: Integrates with the new Flow Control system. This controller includes logic to adapt requests to the Flow Control interface and translate outcomes.
    • The Director now uses the AdmissionController interface, making the admission strategy pluggable.
  3. Component Wiring:

    • cmd/epp/runner/runner.go initializes the FlowController and FlowRegistry when the feature flag is active.
    • The runner instantiates and injects the appropriate AdmissionController (either Legacy or FlowControl) into the Director.
  4. Configuration:

    • The Flow Control system is currently initialized with a hardcoded default configuration in runner.go. Follow-up PRs will address dynamic configuration.
  5. Testing:

    • Comprehensive table-driven unit tests have been added in pkg/epp/requestcontrol/admission_test.go to cover both LegacyAdmissionController and FlowControlAdmissionController behaviors, including edge cases and outcome translations.
    • Tests for the flowControlRequest adapter have been included.
    • pkg/epp/requestcontrol/director_test.go has been refactored to mock the AdmissionController interface, focusing tests on the Director's orchestration logic.
    • Integration tests in test/integration/epp/hermetic_test.go are updated to use the LegacyAdmissionController to ensure no change in default behavior.
    • All test files have been polished for clarity and adherence to project standards.

This refactoring improves the design's modularity, testability, and extensibility, allowing for easier development and maintenance of admission control strategies.

Which issue(s) this PR fixes:
Fixes #674

Does this PR introduce a user-facing change?:

feat: Introduce experimental flow control layer gated by the `ENABLE_EXPERIMENTAL_FLOW_CONTROL_LAYER` feature flag.

This commit introduces the initial integration of the new Flow Control
layer into the Endpoint Picker (EPP). The primary goal is to provide a
mechanism for more sophisticated request admission control, queuing,
and multitenancy management.

This new layer is gated by the ENABLE_EXPERIMENTAL_FLOW_CONTROL_LAYER
feature flag.

When enabled, the Director delegates admission control decisions to the
new flowController component. The legacy saturation-based shedding
logic is preserved for when the feature is disabled.

Comprehensive unit tests for the admitRequest function have been added
to validate the behavior of both the legacy and new flow control paths.
@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 10, 2025
Copy link

netlify bot commented Oct 10, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 0919d3f
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68ed890eedf97b00089ab23b
😎 Deploy Preview https://deploy-preview-1701--gateway-api-inference-extension.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 10, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 10, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @LukeAVanDrie. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 10, 2025
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting on my test isolation changes in #1627 to land before I add the new FC cases. Will send these out in a separate PR. Right now, I want to unblock others from experimenting with the component. I have of course deployed and benchmarked this to verify the integration is working as expected.

)

// TODO: this is hardcoded for POC only. This needs to be hooked up to our text-based config story.
var flowControlConfig = flowcontrol.Config{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is admittedly sloppy. I will send out follow PRs to align this with how the rest of our pluggable layers are configured (e.g., the text-based config). For now, I am prioritizing getting the POC checked in with a functional e2e story.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls create issues to track the followup items

@LukeAVanDrie
Copy link
Contributor Author

/assign @kfswain
/assign @ahg-g

@ahg-g
Copy link
Contributor

ahg-g commented Oct 12, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 12, 2025
@kfswain
Copy link
Collaborator

kfswain commented Oct 12, 2025

Minor stylistic comments, overall looks good! Pumped to see this integrated!

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, LukeAVanDrie

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 12, 2025
This commit refactors the admission control logic introduced in the
previous commit.

- Introduces an AdmissionController interface.
- Implements LegacyAdmissionController for existing behavior.
- Implements FlowControlAdmissionController for the new flow control
	path.
- Moves flow control request adaptation into
	FlowControlAdmissionController.
- Updates Director to use the AdmissionController interface.
- Runner now injects the appropriate controller based on the feature
	flag.
- Splits and refactors tests for better focus and isolation.

This improves modularity, testability, and separation of concerns.
@LukeAVanDrie LukeAVanDrie force-pushed the feat/flow-control-integration branch from 82605c7 to 6114ca5 Compare October 13, 2025 17:19
Copy link
Contributor

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, a couple of nits

- Introduce requtil.IsSheddable() for clarity
- Move FairnessID defaulting to handler package
- Add test for default FairnessID
@ahg-g
Copy link
Contributor

ahg-g commented Oct 13, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2025
@ahg-g
Copy link
Contributor

ahg-g commented Oct 13, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2025
@k8s-ci-robot k8s-ci-robot merged commit ef666c1 into kubernetes-sigs:main Oct 13, 2025
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce EPP-Level Queuing/Fairness for InferenceModels

4 participants