add WG AI Gateway #8521
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3561,6 +3561,57 @@ workinggroups: | |
liaison: | ||
github: pohly | ||
name: Patrick Ohly | ||
- dir: wg-ai-gateway | ||
name: AI Gateway | ||
mission_statement: > | ||
The AI Gateway Working Group focuses on the intersection of AI and networking, | ||
particularly in the context of extending load-balancer, gateway and proxy technologies | ||
to manage and route traffic for AI Inference. | ||
|
||
charter_link: charter.md | ||
stakeholder_sigs: | ||
- Network | ||
label: ai-gateway | ||
leadership: | ||
chairs: | ||
- github: keithmattix | ||
name: Keith Mattix | ||
company: Microsoft | ||
email: [email protected] | ||
- github: kflynn | ||
name: Flynn | ||
company: Buoyant | ||
email: [email protected] | ||
- github: kfswain | ||
name: Kellen Swain | ||
company: Google | ||
email: [email protected] | ||
- github: nirrozenbaum | ||
name: Nir Rozenbaum | ||
company: IBM | ||
email: [email protected] | ||
- github: shaneutt | ||
name: Shane Utt | ||
company: Red Hat | ||
email: [email protected] | ||
- github: xunzhuo | ||
name: Xunzhuo | ||
company: Tencent | ||
email: [email protected] | ||
meetings: | ||
- description: WG AI Gateway Bi-Weekly Meeting (Earlier Option) | ||
day: Monday | ||
time: 12PM | ||
tz: UTC | ||
frequency: bi-weekly | ||
- description: WG AI Gateway Bi-Weekly Meeting (Later Option) | ||
day: Thursday | ||
time: 6PM | ||
tz: UTC | ||
frequency: bi-weekly | ||
contact: | ||
slack: wg-ai-gateway | ||
mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway | ||
- dir: wg-ai-integration | ||
name: AI Integration | ||
mission_statement: > | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# See the OWNERS docs at https://go.k8s.io/owners | ||
|
||
reviewers: | ||
- wg-ai-gateway-leads | ||
approvers: | ||
- wg-ai-gateway-leads | ||
labels: | ||
- wg/ai-gateway |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
<!--- | ||
This is an autogenerated file! | ||
Please do not edit this file directly, but instead make changes to the | ||
sigs.yaml file in the project root. | ||
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md | ||
---> | ||
# AI Gateway Working Group | ||
|
||
The AI Gateway Working Group focuses on the intersection of AI and networking, particularly in the context of extending load-balancer, gateway and proxy technologies to manage and route traffic for AI Inference. | ||
|
||
The [charter](charter.md) defines the scope and governance of the AI Gateway Working Group. | ||
|
||
## Stakeholder SIGs | ||
* [SIG Network](/sig-network) | ||
|
||
## Meetings | ||
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway) for the group will typically add invites for the following meetings to your calendar.* | ||
* WG AI Gateway Bi-Weekly Meeting (Earlier Option): [Mondays at 12PM UTC]() (bi-weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=12PM&tz=UTC). | ||
* WG AI Gateway Bi-Weekly Meeting (Later Option): [Thursdays at 6PM UTC]() (bi-weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=6PM&tz=UTC). | ||
|
||
## Organizers | ||
|
||
* Keith Mattix (**[@keithmattix](https://github.com/keithmattix)**), Microsoft | ||
* Flynn (**[@kflynn](https://github.com/kflynn)**), Buoyant | ||
* Kellen Swain (**[@kfswain](https://github.com/kfswain)**), Google | ||
* Nir Rozenbaum (**[@nirrozenbaum](https://github.com/nirrozenbaum)**), IBM | ||
* Shane Utt (**[@shaneutt](https://github.com/shaneutt)**), Red Hat | ||
* Xunzhuo (**[@xunzhuo](https://github.com/xunzhuo)**), Tencent | ||
|
||
## Contact | ||
- Slack: [#wg-ai-gateway](https://kubernetes.slack.com/messages/wg-ai-gateway) | ||
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway) | ||
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fai-gateway) | ||
<!-- BEGIN CUSTOM CONTENT --> | ||
|
||
<!-- END CUSTOM CONTENT --> |
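
The meeting lines in the generated README above appear to follow mechanically from the `meetings` entries in the sigs.yaml hunk. The sketch below illustrates that mapping. It is not the project's actual generator: the `meetings` list is a hand-copied, hypothetical in-memory version of the YAML, and `render_meeting` is an illustrative helper name, not part of any Kubernetes tooling.

```python
from urllib.parse import urlencode

# Hypothetical in-memory copy of the two meeting entries from the sigs.yaml hunk.
meetings = [
    {"description": "WG AI Gateway Bi-Weekly Meeting (Earlier Option)",
     "day": "Monday", "time": "12PM", "tz": "UTC", "frequency": "bi-weekly"},
    {"description": "WG AI Gateway Bi-Weekly Meeting (Later Option)",
     "day": "Thursday", "time": "6PM", "tz": "UTC", "frequency": "bi-weekly"},
]


def render_meeting(m):
    """Render one meeting entry as a README bullet (illustrative helper only)."""
    # Build the "t=...&tz=..." query string used by the timezone converter link.
    query = urlencode({"t": m["time"], "tz": m["tz"]})
    return (
        f"* {m['description']}: [{m['day']}s at {m['time']} {m['tz']}]() "
        f"({m['frequency']}). [Convert to your timezone]"
        f"(http://www.thetimezoneconverter.com/?{query})."
    )


if __name__ == "__main__":
    for meeting in meetings:
        print(render_meeting(meeting))
```

Comparing the printed bullets with the README hunk is a quick way to check that a sigs.yaml edit and the regenerated README agree.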
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,101 @@ | ||||||
# WG AI Gateway Charter | ||||||
|
||||||
This charter adheres to the conventions described in the [Kubernetes Charter | ||||||
README] and uses the Roles and Organization Management outlined in | ||||||
[wg-governance]. | ||||||
|
||||||
[wg-governance]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md | ||||||
[Kubernetes Charter README]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md | ||||||
|
||||||
## Scope | ||||||
|
||||||
The AI Gateway Working Group focuses on the intersection of AI and | ||||||
networking, particularly in the context of extending load-balancer, gateway | ||||||
and proxy technologies to manage and route traffic for AI Inference. | ||||||
I’m wondering if this WG has some overlap with WG Serving (community/wg-serving/charter.md, lines 37 to 38 in bfee8f7). Could you help clarify the scope differences?

In this case it's squarely focused on the networking aspects of inference, not the compute or lifecycle management aspects as in the case of wg-serving. The key point is this one:

Right, but why isn't the existing WG Serving the right place to cover those topics? The biggest issue I see is that you will affect WG Serving work, even if you focus only on the networking side of it. So either explicitly listing the reasons for that extraction, or explaining how you are going to collaborate with WG Serving in that scope, will be necessary to ensure the work doesn't diverge.

WG Serving's charter makes it clear that it is focused broadly on serving workloads as the primary objective, and its goals speak directly to that. Notably, the stated goals do not include any networking-specific deliverables. This WG is focused very tightly on traffic and API management, going as far as focusing on very specific individual features in that domain that we want to explore (see the document from the description).

While it may seem plausible for any working group to claim that it is the suitable forum for networking-related discussions, this perspective does not hold when we move beyond standard networking to address protocol- or domain-specific networking, as in use cases like this. In these instances, it is essential to engage specialists from the community and provide room for dedicated focus. It is because of this technical specificity, the need to engage more people and grow our community, and the need for autonomy and focus that WG Serving is not the right place to cover these topics.

Additionally, one of our primary use cases (see the "Why?" section of our originating document) covers the situation where users want to perform inference from their Kubernetes applications, but they will be reaching outside the cluster to do it. This use case is effectively an egress use case, and is definitively networking, and is entirely out of scope for WG Serving.

I've added WG Serving as an explicit collaborator which we will need to keep looped in for review on any of our proposals that deal explicitly with model serving backends.

> This use case is effectively an egress networking use case and is entirely out of scope for WG Serving.

+1 to the above: serving is just one piece of the AI networking story; those users have their own models and are doing local inference. IMO, far more users are going to want to consume remote models and apply policy on that consumption.

Completely agree with the above comments; there are various topics in the scope of network traffic and API management that don't feel natural to WG Serving and require a dedicated WG with a focus on networking.
||||||
|
||||||
This working group will define terms like "AI Gateway" within the context of | ||||||
Kubernetes and key use cases for users and implementations. It will propose | ||||||
deliverables that need to be adopted in order to serve AI Inference on | ||||||
Kubernetes. | ||||||
|
||||||
This comes at a time where there is a proliferation of "AI Gateways" being used | ||||||
for AI Inference, and a strong need for focus and collaboration to ensure | ||||||
standards around this space so that Kubernetes users get the features they need | ||||||
in a consistent way on the platform. | ||||||
|
||||||
### In Scope | ||||||
|
||||||
Overall guidance for the WG is to control scope as much as is feasible. The WG | ||||||
should avoid AI-specific functionality where it can: instead favoring the | ||||||
addition of provisions that help with AI use-cases, but are otherwise normal | ||||||
networking facilities. Under that guidance, the following is in-scope: | ||||||
|
||||||
* Providing definitions for networking related AI terms in a Kubernetes | ||||||
context. | ||||||
|
||||||
* Defining important AI networking use-cases for Kubernetes users. | ||||||
|
||||||
* Determining which common features and capabilities in the "AI Gateway" space | ||||||
need to be covered by Kubernetes standards and APIs according to user and | ||||||
implementation needs. | ||||||
|
||||||
* Creating proposals for "AI Gateway" features and capabilities to the | ||||||
appropriate sub-projects. | ||||||
|
||||||
* Propose new sub-projects if existing sub-projects are not sufficient. | ||||||
|
||||||
### Out of Scope | ||||||
|
||||||
* Developing whole "AI Gateway" solutions. This group will focus on | ||||||
enabling existing and new solutions to be more easily deployed and managed on | ||||||
Kubernetes, not adding any new production solutions maintained thereafter by | ||||||
upstream Kubernetes. | ||||||
|
||||||
* Any specific kind of hardware support is generally out of scope. | ||||||
|
||||||
* This group will not cover the entire spectrum of networking for AI. For | ||||||
instance: RDMA networks are generally out of scope. | ||||||
|
||||||
## Deliverables | ||||||
|
||||||
* A compendium of AI related networking definitions (e.g. "AI Gateway") and a | ||||||
Is this some sort of stored artifact?

Yes. Documentation somewhere with some definitions, where exactly TBD.
||||||
key use-cases for Kubernetes users. | ||||||
|
||||||
* Provide a space for collaboration and experimentation to determine the most | ||||||
viable features and capabilities that Kubernetes should support. If there is | ||||||
strong consensus on any particular ideas, the WG will facilitate and | ||||||
coordinate the delivery of proposals in the appropriate areas. | ||||||
|
||||||
## Stakeholders | ||||||
|
||||||
* SIG Network | ||||||
|
||||||
### Related WGs | ||||||
|
||||||
* WG Serving - The domain of WG Serving is AI Workloads, which can be served by | ||||||
some of the networking support we want to add. When we have proposals that | ||||||
are strongly relevant to serving, we will loop them in so they can provide | ||||||
feedback. | ||||||
|
||||||
## Roles and Organization Management | ||||||
|
||||||
This working group adheres to the Roles and Organization Management outlined in | ||||||
[wg-governance] and opts-in to updates and modifications to [wg-governance]. | ||||||
|
||||||
[wg-governance]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md | ||||||
|
||||||
## Exit Criteria | ||||||
|
||||||
The WG is done when its deliverables are complete, according to the defined | ||||||
Have you considered a SIG-Network sub-project?

Yes. In the originating document for this working group, we noted the potential for existing subprojects to house proposals generated by this group and suggested that we might even propose new subprojects. We feel there's discussion to be had and consensus to be built first.

Notably, as it pertains to this, we are trying to be very deliberate about an exit for this WG. We've seen long-running working groups and we don't want that for ourselves. We endeavor to deliver on our goals and disband within the next year. We think it's likely that the conclusion could be a new subproject, but it may instead be multiple proposals to existing projects across multiple SIGs, which is why we feel a working group is appropriate.
||||||
scope and a list of key use cases and features agreed upon by the group. | ||||||
|
||||||
Ideally we want the lifecycle of the WG to go something like this: | ||||||
|
||||||
1. Determine definitions and key use cases for Kubernetes users and | ||||||
implementations, and document those. | ||||||
2. Determine a list of key features that Kubernetes needs to best support the | ||||||
defined use cases. | ||||||
3. For each feature in that list, make proposals which support them to the | ||||||
appropriate sub-projects OR propose new sub-projects if deemed necessary. | ||||||
shaneutt marked this conversation as resolved.
|
||||||
4. Once the feature list is complete, leave behind some guidance and best | ||||||
practices for future implementations and then exit. |
I raised a similar concern when reviewing WG Node Lifecycle: are you sure you want to have that many leads? I know from personal experience that having too many sometimes makes things challenging. Definitely not a blocker for WG creation, more of a suggestion 😉
Indeed, our list is big, but that is reflective of a large wave of interest. Each person on this list is representative of some technical aspect of the subject matter which makes them specialists/experts for the group. Each of them has spoken with me personally and I'm confident in their commitment to dedicate substantial time to the project, including attending and leading meetings, as well as actively driving and contributing to proposals.