adding fairness-id header to be used in flow control #1282

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main on Aug 3, 2025

Conversation

kfswain
Collaborator

@kfswain kfswain commented Aug 1, 2025

Fixes: #1245

Some other small fixes are also included (mostly config updates).

The flag is not used anywhere yet; it will be integrated with @LukeAVanDrie's flow control. For now it is just stubbed via a log line in admissionControl.
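As a rough illustration of what the description outlines, the sketch below defines a header-key flag, reads the fairness ID from request headers, and passes it to an admission stub that only formats a log message. The default value `x-example-fairness-id`, the helper `fairnessIDFromHeaders`, and `admitRequestStub` are hypothetical names chosen for this example; they are not taken from the PR.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// exampleFairnessIDHeaderKey is a placeholder default for illustration only.
// The project's real default (runserver.DefaultFairnessIDHeaderKey) is not
// shown in this conversation.
const exampleFairnessIDHeaderKey = "x-example-fairness-id"

// fairnessIDHeaderKey mirrors the shape of the flag added in the PR.
var fairnessIDHeaderKey = flag.String(
	"fairness-id-header-key",
	exampleFairnessIDHeaderKey,
	"The header key used to pass the fairness ID to be used in Flow Control.")

// fairnessIDFromHeaders (hypothetical helper) returns the value stored under
// key, matching header names case-insensitively. It returns "" when the
// header is absent.
func fairnessIDFromHeaders(headers map[string]string, key string) string {
	for name, value := range headers {
		if strings.EqualFold(name, key) {
			return value
		}
	}
	return ""
}

// admitRequestStub (hypothetical) stands in for admission control: until flow
// control consumes the fairness ID, it only produces a log message.
func admitRequestStub(fairnessID string) string {
	return fmt.Sprintf("admission control stub: fairnessID=%q", fairnessID)
}

func main() {
	flag.Parse()
	headers := map[string]string{"X-Example-Fairness-Id": "tenant-a"}
	id := fairnessIDFromHeaders(headers, *fairnessIDHeaderKey)
	fmt.Println(admitRequestStub(id))
}
```

The stub keeps the plumbing testable while the consumer is still being written, which is the trade-off debated later in this conversation.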

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2025
@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

netlify bot commented Aug 1, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 8e64a83
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/688f69d49b1dc400083b671f
😎 Deploy Preview https://deploy-preview-1282--gateway-api-inference-extension.netlify.app

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 1, 2025
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 1, 2025
@kfswain kfswain marked this pull request as ready for review August 1, 2025 19:30
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2025
@k8s-ci-robot k8s-ci-robot requested a review from ahg-g August 1, 2025 19:30
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 2, 2025
Comment on lines +108 to +120
	destinationEndpointHintKey = flag.String(
		"destination-endpoint-hint-key",
		runserver.DefaultDestinationEndpointHintKey,
		"Header and response metadata key used by Envoy to route to the appropriate pod. This must match Envoy configuration.")
	destinationEndpointHintMetadataNamespace = flag.String(
		"destination-endpoint-hint-metadata-namespace",
		runserver.DefaultDestinationEndpointHintMetadataNamespace,
		"The key for the outer namespace struct in the metadata field of the extproc response that is used to wrap the "+
			"target endpoint. If not set, then an outer namespace struct should not be created.")
	fairnessIDHeaderKey = flag.String(
		"fairness-id-header-key",
		runserver.DefaultFairnessIDHeaderKey,
		"The header key used to pass the fairness ID to be used in Flow Control.")
Contributor

we have an open issue #1267 which is about removing the destination headers from the command-line args.
this looks like more of the same.

the main point is that protocol specifics shouldn't be configurable. this is the protocol, and whoever wants to use GIE should use the headers according to the protocol. so IMO the fairness ID header shouldn't be configurable, same as the other two.

Collaborator Author

For sure. How did you want to handle the transition? My thinking was that until we do that transition, we want to keep the functionality similar.

I can remove this flag in this PR, but then it's special-cased until we remove the others. LMKWYT

Contributor

I was thinking about the opposite direction: instead of adding a third flag that should be removed, we could remove the two other flags in this PR and mark #1267 as fixed.

but that's ok, we can continue with the current version and handle that issue separately.

Contributor

@nirrozenbaum nirrozenbaum left a comment

went over the PR.
IMO the fairness ID header shouldn't be configurable, it's part of the protocol.
generally speaking, I think protocol specifics shouldn't be configurable.
if we revert this change, this PR becomes ~50 lines of code.

with this PR size, I'm not sure it's really necessary to split it into an "adding the header" PR and a "using the header" PR.
I'd prefer to see them together to avoid places where we pass arguments that are not really used and would later need to be cleaned up (e.g., admitRequest now gets fairnessID; will it be used there? maybe yes, maybe not, depending on how the fairnessID handling will be done).

@kfswain
Collaborator Author

kfswain commented Aug 3, 2025

> I'd prefer to see them together to avoid places where we pass arguments that are not really used and would later need to be cleaned up (e.g., admitRequest now gets fairnessID; will it be used there? maybe yes, maybe not, depending on how the fairnessID handling will be done).

I don't think there's anything wrong with stubbing out a flag we intend to use. This was suggested as a change to the API; the only reason it's not consumed yet is that another engineer is implementing flow control. Splitting up the work allows us all to go faster?

@nirrozenbaum
Contributor

> > I'd prefer to see them together to avoid places where we pass arguments that are not really used and would later need to be cleaned up (e.g., admitRequest now gets fairnessID; will it be used there? maybe yes, maybe not, depending on how the fairnessID handling will be done).
>
> I don't think there's anything wrong with stubbing out a flag we intend to use. This was suggested as a change to the API; the only reason it's not consumed yet is that another engineer is implementing flow control. Splitting up the work allows us all to go faster?

Sure. let's move forward with this PR and we can address the other parts in separate PRs.
the PR LGTM.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 3, 2025
@k8s-ci-robot k8s-ci-robot merged commit f483650 into kubernetes-sigs:main Aug 3, 2025
9 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

[InferenceModel update] Created fairness/identity header flag
3 participants