add WG AI Gateway #8521

shaneutt · 2025-07-18T17:49:08Z

This PR requests the creation of the "AI Gateway Working Group" as discussed and defined throughout:

k8s-ci-robot · 2025-07-18T17:49:15Z

@shaneutt: GitHub didn't allow me to request PR reviews from the following users: david-martin, kflynn, rootfs, yuzisun.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

This PR requests the creation of the "AI Gateway Working Group" as discussed and defined throughout:

https://groups.google.com/g/kubernetes-sig-network/c/j50pypPlLSk

https://groups.google.com/a/kubernetes.io/g/dev/c/XC_8qAyk8W0

https://docs.google.com/document/d/10WTdHYW5x2rw6BTgDzW7X-5QNesAh205MuoaUe5-IQg

/cc @david-martin @keithmattix @kflynn @kfswain @nirrozenbaum @rootfs @Xunzhuo @yuzisun
Thank you all for volunteering to help lead this group!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · 2025-07-18T17:49:16Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shaneutt
Once this PR has been reviewed and has the lgtm label, please assign kaslin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

cblecker · 2025-07-18T18:19:54Z

/hold
for review


aojea · 2025-07-25T08:20:53Z

wg-ai-gateway/charter.md

+
+## Deliverables
+
+* A compendium of AI related networking definitions (e.g. "AI Gateway") and a


Is this some sort of stored artifact?

Yes. It will be documentation containing the definitions; where exactly it will live is TBD.


nirrozenbaum · 2025-08-03T08:06:22Z

@aojea a kind reminder...
are there any additional changes needed?


pacoxu · 2025-08-08T06:01:33Z

wg-ai-gateway/charter.md

+
+The AI Gateway Working Group focuses on the intersection of AI and
+networking, particularly in the context of extending load-balancer, gateway
+and proxy technologies to manage and route traffic for AI Inference.


I’m wondering if this WG has some overlap with WG Serving.

community/wg-serving/charter.md

Lines 37 to 38 in bfee8f7

- Explore new projects that improve orchestration, scaling, and load balancing
  of inference workloads and compose well with other workloads on Kubernetes

Could you help clarify the scope differences?

This WG is squarely focused on the networking aspects of inference, not the compute or lifecycle management aspects that wg-serving covers. The key point is this one:

> manage and route traffic for AI Inference.

Right, but why the existing WG Serving isn't the right place to cover those topics? The biggest issue as I'm seeing is that you will affect WG Serving work, only if focusing on the networking side of it. So either explicitly listing out the reasons for that extraction or how are you going to collaborate with WG Serving in that scope will be necessary to ensure the work doesn't diverge.

> why the existing WG Serving isn't the right place to cover those topics?

WG Serving's charter makes it clear that it is focused broadly on serving workloads as the primary objective, and its goals speak directly to that. Notably, the stated goals do not include any networking specific deliverables.

This WG is focused very tightly on traffic and API management, going as far as focusing on very specific individual features in that domain that we want to explore (see the document from the description).

While it may seem plausible for any working group to claim that it is the suitable forum for networking-related discussions, this perspective does not hold when we move beyond standard networking to address protocol or domain-specific networking, such as in use cases like this. In these instances, it is essential to engage specialists from the community and provide room for dedicated focus. It is because of this technical specificity, the need to engage more people and grow our community and the need for autonomy and focus that WG Serving is not the right place to cover these topics.

Additionally, one of our primary use cases (see the "Why?" section of our originating document) covers the situation where users want to perform inference from their Kubernetes applications, but they will be reaching outside the cluster to do it. This use case is effectively an egress use case, and is definitively networking, and is entirely out of scope for WG Serving.

> So either explicitly listing out the reasons for that extraction or how are you going to collaborate with WG Serving in that scope will be necessary to ensure the work doesn't diverge.

I've added WG Serving as an explicit collaborator which we will need to keep looped in for review on any of our proposals that deal explicitly with model serving backends.

/cc @ArangoGutierrez @SergeyKanzhelev @terrytangyuan

> This use case is effectively an egress networking use case and is entirely out of scope for WG Serving.

+1 to that. I think WG Serving will remain a collaborator for any work that touches model-serving backends, but the proposed effort focuses on egress networking for inference from Kubernetes apps, which is out of its scope. The proposal also looks promising for collaboration with SIG MultiCluster on cross-cluster traffic policy and failover, with SIG Apps on app-facing APIs and workload integration, and with SIG Auth on identity, authentication, and policy on egress.

+1 to the above: serving is just one piece of the AI networking story; those users have their own models and are doing local inference. IMO, far more users are going to want to consume remote models and apply policy on that consumption.

Completely agree with the above comments. There are various topics in the scope of network traffic and API management that don't fit naturally in WG Serving and require a dedicated WG focused on networking.
We do plan to collaborate closely with WG Serving to make sure the scopes of the two WGs are well defined and complementary.

soltysh · 2025-08-12T09:44:01Z

sigs.yaml

+    - github: xunzhuo
+      name: Xunzhuo
+      company: Tencent
+      email: [email protected]


I raised a similar concern when reviewing WG Node Lifecycle: are you sure you want to have that many leads? I know from personal experience that having too many sometimes makes things challenging. Definitely not a blocker for WG creation, more like a suggestion 😉

Indeed, our list is big, but that is reflective of a large wave of interest. Each person on this list is representative of some technical aspect of the subject matter which makes them specialists/experts for the group. Each of them has spoken with me personally and I'm confident in their commitment to dedicate substantial time to the project, including attending and leading meetings, as well as actively driving and contributing to proposals.

soltysh · 2025-08-12T09:47:38Z

wg-ai-gateway/charter.md

+
+The AI Gateway Working Group focuses on the intersection of AI and
+networking, particularly in the context of extending load-balancer, gateway
+and proxy technologies to manage and route traffic for AI Inference.


Right, but why the existing WG Serving isn't the right place to cover those topics? The biggest issue as I'm seeing is that you will affect WG Serving work, only if focusing on the networking side of it. So either explicitly listing out the reasons for that extraction or how are you going to collaborate with WG Serving in that scope will be necessary to ensure the work doesn't diverge.

soltysh · 2025-08-12T09:52:14Z

wg-ai-gateway/charter.md

+
+## Exit Criteria
+
+The WG is done when its deliverables are complete, according to the defined


Have you considered a SIG-Network sub-project?

Yes. In the originating document for this working group, we noted the potential for existing subprojects to house proposals generated by this group and suggested that we might even propose new subprojects. We feel there's discussion to be had and consensus to be built first.

Notably, on this point, we are trying to be very deliberate about an exit for this WG. We've seen long-running working groups and we don't want that for ourselves. We endeavor to deliver on our goals and disband within the next year. We think it's likely that the conclusion could be a new subproject, but it may instead be multiple proposals to existing projects across multiple SIGs, which is why we feel a working group is appropriate.

Signed-off-by: Shane Utt <[email protected]>

lauralorenz · 2025-08-12T17:05:04Z

Just wanted to note here for posterity that SIG-MC has been approached as a potential stakeholder SIG and is discussing what/who we can commit to and whether it makes sense within the scope of the current goals. In particular, we have been made aware of a potential use case for the WG related to egress to k8s or non-k8s inference endpoints. cc @skitt @JeremyOT

k8s-ci-robot requested review from keithmattix, kfswain and nirrozenbaum July 18, 2025 17:49

k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) Jul 18, 2025

k8s-ci-robot requested a review from Xunzhuo July 18, 2025 17:49

shaneutt force-pushed the ai-gw-wg branch 2 times, most recently from 37481e4 to 84d372b July 18, 2025 18:00

k8s-ci-robot removed the do-not-merge/invalid-owners-file label (indicates that a PR should not merge because it has an invalid OWNERS file in it) Jul 18, 2025

k8s-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Jul 18, 2025

shaneutt force-pushed the ai-gw-wg branch from 84d372b to 6c04ac3 July 18, 2025 18:20

pacoxu reviewed Jul 22, 2025

liaisons.md (outdated, resolved)

shaneutt force-pushed the ai-gw-wg branch from 6c04ac3 to 033979e July 22, 2025 12:42

shaneutt requested a review from pacoxu July 22, 2025 12:42

aojea reviewed Jul 25, 2025

wg-ai-gateway/charter.md (resolved)

shaneutt requested a review from aojea July 25, 2025 12:20

shaneutt force-pushed the ai-gw-wg branch 3 times, most recently from 9c05bc4 to ea77fe1 July 29, 2025 19:56

k8s-ci-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Aug 5, 2025

shaneutt force-pushed the ai-gw-wg branch from ea77fe1 to 31cf612 August 5, 2025 17:34

k8s-ci-robot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Aug 5, 2025

npolshakova reviewed Aug 5, 2025

wg-ai-gateway/charter.md (outdated, resolved)

shaneutt force-pushed the ai-gw-wg branch from 5d5b48b to 9e5df55 August 5, 2025 20:31

pacoxu reviewed Aug 8, 2025

soltysh reviewed Aug 12, 2025

shaneutt requested review from soltysh, pacoxu and npolshakova August 12, 2025 12:28

add WG AI Gateway

f751ccc

Signed-off-by: Shane Utt <[email protected]>

shaneutt force-pushed the ai-gw-wg branch from 9e5df55 to f751ccc August 12, 2025 12:34

k8s-ci-robot requested review from ArangoGutierrez, SergeyKanzhelev and terrytangyuan August 12, 2025 12:35


