@shaneutt
Member

shaneutt commented Jul 18, 2025

@k8s-ci-robot added the "cncf-cla: yes" label (Indicates the PR's author has signed the CNCF CLA.) Jul 18, 2025
@k8s-ci-robot requested a review from Xunzhuo July 18, 2025 17:49
@k8s-ci-robot
Contributor

@shaneutt: GitHub didn't allow me to request PR reviews from the following users: david-martin, kflynn, rootfs, yuzisun.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

This PR requests the creation of the "AI Gateway Working Group" as discussed and defined throughout:

/cc @david-martin @keithmattix @kflynn @kfswain @nirrozenbaum @rootfs @Xunzhuo @yuzisun
Thank you all for volunteering to help lead this group!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shaneutt
Once this PR has been reviewed and has the lgtm label, please assign kaslin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the following labels Jul 18, 2025: "committee/steering" (Denotes an issue or PR intended to be handled by the steering committee.); "size/L" (Denotes a PR that changes 100-499 lines, ignoring generated files.); "sig/network" (Categorizes an issue or PR as relevant to SIG Network.); "do-not-merge/invalid-owners-file" (Indicates that a PR should not merge because it has an invalid OWNERS file in it.)
@shaneutt force-pushed the ai-gw-wg branch 2 times, most recently from 37481e4 to 84d372b, July 18, 2025 18:00
@k8s-ci-robot removed the "do-not-merge/invalid-owners-file" label (Indicates that a PR should not merge because it has an invalid OWNERS file in it.) Jul 18, 2025
@cblecker
Member

/hold
for review

@k8s-ci-robot added the "do-not-merge/hold" label (Indicates that a PR should not merge because someone has issued a /hold command.) Jul 18, 2025
@shaneutt requested a review from aojea July 25, 2025 12:20
@shaneutt force-pushed the ai-gw-wg branch 3 times, most recently from 9c05bc4 to ea77fe1, July 29, 2025 19:56
@nirrozenbaum

@aojea a kind reminder...
are there any additional changes needed?

@k8s-ci-robot added the "needs-rebase" label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) Aug 5, 2025
@k8s-ci-robot removed the "needs-rebase" label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) Aug 5, 2025
--->
# AI Gateway Working Group

The AI Gateway Working Group focuses on the intersection of AI and networking, particularly in the context of extending load-balancer, gateway and proxy technologies to manage and route traffic for AI Inference.
Contributor

It's not "particularly" in the gateway context, it's specifically/exclusively in that context, isn't it?

Saying the WG "focuses on the intersection of AI and networking" makes it sound like DRA and maybe MCP would be in-scope too.

EDIT: OMG, I just realized that even Gateway API Inference Extension is defined as being out of scope for this WG. I was assuming that was the main focus of the WG.

You need to be about 10,000% more explicit here (and in the charter) about what the WG is doing. I had to read the google doc to figure it out.

Member Author

Appreciate the feedback, Dan. I took a stab at changing some of the language in a way I hoped would align better with the original doc and help clarify things. I specifically removed the text you quoted. Let me know what you think. If you still think it needs more work, it would be greatly appreciated if you could point out the exact text you find confusing, or suggest explicit things that you think need to be added, as I did struggle a bit with where I should be focusing.

Contributor

@danwinship Aug 21, 2025

Given that scope, I would still assume that inference extension was in-scope for this WG, given that it is very much part of "load-balancing, routing and related features that support networking for AI use cases" and "policies, filters, and extensions that support AI traffic management".

Member Author

I think at the end of the day it's not that the GIE is completely and totally out of scope; in fact, this WG might in theory make proposals to it. It's more that as soon as we touch anything that's model serving to support networking functionality, we have to work with WG Serving. I don't know if there's a fantastic way to describe this nuance in a few words, which is why we have the section below. Can we move forward with that section trying to explain the nuance? Or do you think we still need changes to the high-level description? Do you have any suggestions on what would read better to you?

Contributor

@danwinship Aug 22, 2025

"what would read better" is the text in the google doc.

I don't have any suggestions on how to make that multi-paragraph explanation shorter, but "it's hard to explain well what our WG is for" is not a good reason for explaining it badly.

Even just saying "the scope is everything involving Gateway API and AI, except for Gateway API Inference Extension, which is out of scope" would be an improvement. Because otherwise, basically everyone is going to assume that the one existing AI-related k8s feature with "Gateway" in its name is supposed to be part of the "AI Gateway" WG... (Right?)

Member Author

Alright, thank you for the feedback. It's slightly modified, but I worked in the bulk of the language from the document here in hopes that it provides more clarity. LMKWYT! 🖖

Contributor

yes, that's great and makes it much clearer
+1 on chartering and SIG Network sponsorship

@skitt
Member

skitt commented Aug 25, 2025

  1. Figure out if SIG-MC is a sponsor (#8521 (comment))

@lauralorenz has SIG MC come to a decision in this regard?

I’ve put this on the agenda for our SIG call tomorrow and I’m aiming for a decision by Wednesday.

@shaneutt
Member Author

As I'm looking at this proposal the following bits are remaining:

1. Figure out if SIG-MC is a sponsor ([#8521 (comment)](https://github.com/kubernetes/community/pull/8521#issuecomment-3180239834))

2. All sponsoring SIGs approve the WG creation.

3. Address the comment about LB overlap with wg-serving ([#8521 (comment)](https://github.com/kubernetes/community/pull/8521#discussion_r2286690800))

4. Fill in the liaison.

@soltysh we now have another SIG Network +1.

All that remains is SIG MC's decision. If they decide to join we'll incorporate scope updates to reflect that (the language should remain familiar as it's still strongly networking-focused and the "egress use case" is already relevant in a multi-cluster scenario).

Let us know if there's anything else you'd like from us, otherwise we'll just plan on checking back in Wednesday. 🖖

@MikeZappa87
Contributor

Thanks @shaneutt for clarifying the scope. +1 looks good and looking forward to this one.

@soltysh
Contributor

soltysh commented Aug 27, 2025

> Let us know if there's anything else you'd like from us, otherwise we'll just plan on checking back in Wednesday. 🖖

There are some more questions I left just now, but correct, the major blocker is the SIG Multicluster decision. After that I'll poke Steering for the final decision.

@skitt
Member

skitt commented Aug 27, 2025

SIG-Multicluster is willing to sponsor the working group. We’re particularly keen to avoid duplicating effort and to learn from past endeavours, for example around ClusterProfile (for cluster identification and exposing cluster properties), or exposing services (including through Gateway API). There are in-progress efforts in the SIG which could be relevant to the WG, for example the placement decision KEP, KEP-5313.

@shaneutt
Member Author

> SIG-Multicluster is willing to sponsor the working group. We’re particularly keen to avoid duplicating effort and learning from past endeavours, for example around ClusterProfile (for cluster identification and exposing cluster properties), or exposing services (including through Gateway API). There are in-progress efforts in the SIG which could be relevant to the WG, for example the placement decision KEP, KEP-5313.

Great! We've added multi-cluster as in-scope in the charter and marked you as a stakeholder, thanks for the update! 🎉

@shaneutt
Member Author

@soltysh thanks for the continued review and feedback. At this point we think we have everything covered and are ready to go, please let us know if there's anything else we need to get done first! 🖖

Signed-off-by: Shane Utt <[email protected]>
@k8s-ci-robot added the "sig/multicluster" label (Categorizes an issue or PR as relevant to SIG Multicluster.) Aug 27, 2025
Labels
"cncf-cla: yes" (Indicates the PR's author has signed the CNCF CLA.)
"committee/steering" (Denotes an issue or PR intended to be handled by the steering committee.)
"do-not-merge/hold" (Indicates that a PR should not merge because someone has issued a /hold command.)
"sig/multicluster" (Categorizes an issue or PR as relevant to SIG Multicluster.)
"sig/network" (Categorizes an issue or PR as relevant to SIG Network.)
"size/L" (Denotes a PR that changes 100-499 lines, ignoring generated files.)