Commit 9e610b1

add WG AI Gateway
Signed-off-by: Shane Utt <[email protected]>
1 parent ca7e916 commit 9e610b1

8 files changed

+237
-0
lines changed

OWNERS_ALIASES

Lines changed: 7 additions & 0 deletions
@@ -130,6 +130,13 @@ aliases:
130130
- mfahlandt
131131
- ritazh
132132
- terrytangyuan
133+
wg-ai-gateway-leads:
134+
- keithmattix
135+
- kflynn
136+
- kfswain
137+
- nirrozenbaum
138+
- shaneutt
139+
- xunzhuo
133140
wg-ai-integration-leads:
134141
- ardaguclu
135142
- rushmash91

liaisons.md

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ members will assume one of the departing members groups.
5555
| [SIG UI](sig-ui/README.md) | Maciej Szulik (**[@soltysh](https://github.com/soltysh)**) |
5656
| [SIG Windows](sig-windows/README.md) | Benjamin Elder (**[@BenTheElder](https://github.com/BenTheElder)**) |
5757
| [WG AI Conformance](wg-ai-conformance/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
58+
| [WG AI Gateway](wg-ai-gateway/README.md) | Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**) |
5859
| [WG AI Integration](wg-ai-integration/README.md) | Paco Xu 徐俊杰 (**[@pacoxu](https://github.com/pacoxu)**) |
5960
| [WG Batch](wg-batch/README.md) | Antonio Ojea (**[@aojea](https://github.com/aojea)**) |
6061
| [WG Data Protection](wg-data-protection/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |

sig-list.md

Lines changed: 1 addition & 0 deletions
@@ -62,6 +62,7 @@ When the need arises, a [new SIG can be created](sig-wg-lifecycle.md)
6262
| Name | Label | Stakeholder SIGs |Organizers | Contact | Meetings |
6363
|------|-------|------------------|-----------|---------|----------|
6464
|[AI Conformance](wg-ai-conformance/README.md)|[ai-conformance](https://github.com/kubernetes/kubernetes/labels/wg%2Fai-conformance)|* Architecture<br>* Testing<br>|* [Janet Kuo](https://github.com/janetkuo), Google<br>* [Mario Fahlandt](https://github.com/mfahlandt), Kubermatic GmbH<br>* [Rita Zhang](https://github.com/ritazh), Microsoft<br>* [Yuan Tang](https://github.com/terrytangyuan), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-ai-conformance)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-ai-conformance)|* Regular WG Meeting: [Thursdays at 10:00 PT (Pacific Time) (weekly)]()<br>
65+
|[AI Gateway](wg-ai-gateway/README.md)|[ai-gateway](https://github.com/kubernetes/kubernetes/labels/wg%2Fai-gateway)|* Network<br>|* [Keith Mattix](https://github.com/keithmattix), Microsoft<br>* [Flynn](https://github.com/kflynn), Buoyant<br>* [Kellen Swain](https://github.com/kfswain), Google<br>* [Nir Rozenbaum](https://github.com/nirrozenbaum), IBM<br>* [Shane Utt](https://github.com/shaneutt), Red Hat<br>* [Xunzhuo](https://github.com/xunzhuo), Tencent<br>|* [Slack](https://kubernetes.slack.com/messages/wg-ai-gateway)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway)|* WG AI Gateway Bi-Weekly Meeting (Earlier Option): [Mondays at 12PM UTC (bi-weekly)]()<br>* WG AI Gateway Bi-Weekly Meeting (Later Option): [Thursdays at 6PM UTC (bi-weekly)]()<br>
6566
|[AI Integration](wg-ai-integration/README.md)|[ai-integration](https://github.com/kubernetes/kubernetes/labels/wg%2Fai-integration)|* API Machinery<br>* Apps<br>* Architecture<br>* Auth<br>* CLI<br>|* [Arda Guclu](https://github.com/ardaguclu), Red Hat<br>* [Arush Sharma](https://github.com/rushmash91), Amazon<br>* [Zvonko Kaiser](https://github.com/zvonkok), NVIDIA<br>|* [Slack](https://kubernetes.slack.com/messages/wg-ai-integration)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-ai-integration)|* WG AI Integration Weekly Meeting: [Wednesdays at 10:00 PT (Pacific Time) (weekly)]()<br>
6667
|[Batch](wg-batch/README.md)|[batch](https://github.com/kubernetes/kubernetes/labels/wg%2Fbatch)|* Apps<br>* Autoscaling<br>* Node<br>* Scheduling<br>|* [Kevin Hannon](https://github.com/kannon92), Red Hat<br>* [Marcin Wielgus](https://github.com/mwielgus), Google<br>* [Maciej Szulik](https://github.com/soltysh), Defense Unicorns<br>* [Swati Sehgal](https://github.com/swatisehgal), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-batch)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-batch)|* Regular Meeting ([calendar](https://calendar.google.com/calendar/embed?src=8ulop9k0jfpuo0t7kp8d9ubtj4%40group.calendar.google.com)): [Thursdays (starting February 15th 2024)s at 3PM CET (Central European Time) (monthly)](https://zoom.us/j/98329676612?pwd=c0N2bVV1aTh2VzltckdXSitaZXBKQT09)<br>
6768
|[Data Protection](wg-data-protection/README.md)|[data-protection](https://github.com/kubernetes/kubernetes/labels/wg%2Fdata-protection)|* Apps<br>* Storage<br>|* [Xing Yang](https://github.com/xing-yang), VMware<br>* [Xiangqian Yu](https://github.com/yuxiangqian), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-data-protection)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-data-protection)|* Regular WG Meeting: [Wednesdays at 9:00 PT (Pacific Time) (bi-weekly)](https://zoom.us/j/6933410772)<br>

sig-network/README.md

Lines changed: 1 addition & 0 deletions
@@ -73,6 +73,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
7373
## Working Groups
7474

7575
The following [working groups][working-group-definition] are sponsored by sig-network:
76+
* [WG AI Gateway](/wg-ai-gateway)
7677
* [WG Device Management](/wg-device-management)
7778
* [WG Node Lifecycle](/wg-node-lifecycle)
7879
* [WG Serving](/wg-serving)

sigs.yaml

Lines changed: 54 additions & 0 deletions
@@ -3564,6 +3564,60 @@ workinggroups:
35643564
liaison:
35653565
github: pohly
35663566
name: Patrick Ohly
3567+
- dir: wg-ai-gateway
3568+
name: AI Gateway
3569+
mission_statement: >
3570+
The AI Gateway Working Group focuses on the intersection of AI and networking,
3571+
particularly in the context of extending load-balancer, gateway and proxy technologies
3572+
to manage and route traffic for AI Inference.
3573+
3574+
charter_link: charter.md
3575+
stakeholder_sigs:
3576+
- Network
3577+
label: ai-gateway
3578+
leadership:
3579+
chairs:
3580+
- github: keithmattix
3581+
name: Keith Mattix
3582+
company: Microsoft
3583+
3584+
- github: kflynn
3585+
name: Flynn
3586+
company: Buoyant
3587+
3588+
- github: kfswain
3589+
name: Kellen Swain
3590+
company: Google
3591+
3592+
- github: nirrozenbaum
3593+
name: Nir Rozenbaum
3594+
company: IBM
3595+
3596+
- github: shaneutt
3597+
name: Shane Utt
3598+
company: Red Hat
3599+
3600+
- github: xunzhuo
3601+
name: Xunzhuo
3602+
company: Tencent
3603+
3604+
meetings:
3605+
- description: WG AI Gateway Bi-Weekly Meeting (Earlier Option)
3606+
day: Monday
3607+
time: 12PM
3608+
tz: UTC
3609+
frequency: bi-weekly
3610+
- description: WG AI Gateway Bi-Weekly Meeting (Later Option)
3611+
day: Thursday
3612+
time: 6PM
3613+
tz: UTC
3614+
frequency: bi-weekly
3615+
contact:
3616+
slack: wg-ai-gateway
3617+
mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway
3618+
liaison:
3619+
github: justaugustus
3620+
name: Stephen Augustus
35673621
- dir: wg-ai-integration
35683622
name: AI Integration
35693623
mission_statement: >

wg-ai-gateway/OWNERS

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
# See the OWNERS docs at https://go.k8s.io/owners
2+
3+
reviewers:
4+
- wg-ai-gateway-leads
5+
approvers:
6+
- wg-ai-gateway-leads
7+
labels:
8+
- wg/ai-gateway
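
The `OWNERS` file above names the `wg-ai-gateway-leads` alias rather than individual users, and the alias itself is defined in the `OWNERS_ALIASES` hunk earlier in this commit. The following is a minimal Python sketch of how such an alias could be expanded into usernames. The data is copied from the two hunks; the `expand` helper is hypothetical and is not the actual Kubernetes tooling.

```python
# Illustrative only: alias members copied from the OWNERS_ALIASES hunk in this commit.
ALIASES = {
    "wg-ai-gateway-leads": [
        "keithmattix",
        "kflynn",
        "kfswain",
        "nirrozenbaum",
        "shaneutt",
        "xunzhuo",
    ],
}

# Roles copied from the wg-ai-gateway/OWNERS hunk.
OWNERS = {
    "reviewers": ["wg-ai-gateway-leads"],
    "approvers": ["wg-ai-gateway-leads"],
    "labels": ["wg/ai-gateway"],
}


def expand(entries, aliases):
    """Hypothetical helper: replace alias names with their members.

    Entries that are not aliases are treated as plain usernames and kept.
    Duplicates are dropped while preserving first-seen order.
    """
    resolved = []
    for entry in entries:
        for user in aliases.get(entry, [entry]):
            if user not in resolved:
                resolved.append(user)
    return resolved


reviewers = expand(OWNERS["reviewers"], ALIASES)
approvers = expand(OWNERS["approvers"], ALIASES)
print(reviewers)
```

Because both roles point at the same alias, adding or removing a lead in `OWNERS_ALIASES` changes reviewers and approvers together without touching the `OWNERS` file.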

wg-ai-gateway/README.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
<!---
2+
This is an autogenerated file!
3+
4+
Please do not edit this file directly, but instead make changes to the
5+
sigs.yaml file in the project root.
6+
7+
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
8+
--->
9+
# AI Gateway Working Group
10+
11+
The AI Gateway Working Group focuses on the intersection of AI and networking, particularly in the context of extending load-balancer, gateway and proxy technologies to manage and route traffic for AI Inference.
12+
13+
The [charter](charter.md) defines the scope and governance of the AI Gateway Working Group.
14+
15+
## Stakeholder SIGs
16+
* [SIG Network](/sig-network)
17+
18+
## Meetings
19+
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway) for the group will typically add invites for the following meetings to your calendar.*
20+
* WG AI Gateway Bi-Weekly Meeting (Earlier Option): [Mondays at 12PM UTC]() (bi-weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=12PM&tz=UTC).
21+
* WG AI Gateway Bi-Weekly Meeting (Later Option): [Thursdays at 6PM UTC]() (bi-weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=6PM&tz=UTC).
22+
23+
## Organizers
24+
25+
* Keith Mattix (**[@keithmattix](https://github.com/keithmattix)**), Microsoft
26+
* Flynn (**[@kflynn](https://github.com/kflynn)**), Buoyant
27+
* Kellen Swain (**[@kfswain](https://github.com/kfswain)**), Google
28+
* Nir Rozenbaum (**[@nirrozenbaum](https://github.com/nirrozenbaum)**), IBM
29+
* Shane Utt (**[@shaneutt](https://github.com/shaneutt)**), Red Hat
30+
* Xunzhuo (**[@xunzhuo](https://github.com/xunzhuo)**), Tencent
31+
32+
## Contact
33+
- Slack: [#wg-ai-gateway](https://kubernetes.slack.com/messages/wg-ai-gateway)
34+
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-ai-gateway)
35+
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fai-gateway)
36+
- Steering Committee Liaison: Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**)
37+
<!-- BEGIN CUSTOM CONTENT -->
38+
39+
<!-- END CUSTOM CONTENT -->
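
The README header states that the file is generated from `sigs.yaml`, and the meeting bullets in the README mirror the `meetings` entries added to `sigs.yaml` in this commit. As a rough illustration of that relationship, the Python sketch below renders the two bullets from the YAML fields. It is not the real generator (see the generator README linked in the file header); `render_meeting` is a hypothetical helper, and the output format is inferred from the README lines shown in this diff.

```python
# Illustrative only: values copied from the `meetings` block added to sigs.yaml.
MEETINGS = [
    {
        "description": "WG AI Gateway Bi-Weekly Meeting (Earlier Option)",
        "day": "Monday",
        "time": "12PM",
        "tz": "UTC",
        "frequency": "bi-weekly",
    },
    {
        "description": "WG AI Gateway Bi-Weekly Meeting (Later Option)",
        "day": "Thursday",
        "time": "6PM",
        "tz": "UTC",
        "frequency": "bi-weekly",
    },
]


def render_meeting(meeting, url=""):
    """Hypothetical helper: format one meeting as a README bullet.

    The `url` default is empty because the commit's sigs.yaml entries carry
    no meeting link, which is why the README shows `[...]()`.
    No URL encoding is applied; the values used here do not need it.
    """
    converter = (
        "http://www.thetimezoneconverter.com/"
        f"?t={meeting['time']}&tz={meeting['tz']}"
    )
    return (
        f"* {meeting['description']}: "
        f"[{meeting['day']}s at {meeting['time']} {meeting['tz']}]({url}) "
        f"({meeting['frequency']}). "
        f"[Convert to your timezone]({converter})."
    )


lines = [render_meeting(m) for m in MEETINGS]
print("\n".join(lines))
```

The empty `()` link targets in the rendered bullets follow from the missing meeting URL in the YAML, so supplying one in `sigs.yaml` and regenerating would be the way to fix them rather than editing the README by hand.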

wg-ai-gateway/charter.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
1+
# WG AI Gateway Charter
2+
3+
This charter adheres to the conventions described in the [Kubernetes Charter
4+
README] and uses the Roles and Organization Management outlined in
5+
[wg-governance].
6+
7+
[wg-governance]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md
8+
[Kubernetes Charter README]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md
9+
10+
## Scope
11+
12+
The AI Gateway Working Group focuses on load-balancing, routing and related
13+
features that support networking for AI use cases. It also focuses on policies,
14+
filters, and extensions that support AI traffic management.
15+
16+
This working group will define terms like "AI Gateway" within the context of
17+
Kubernetes and key use cases for users and implementations. It will propose
18+
deliverables that need to be adopted in order to manage traffic for AI Inference
19+
on Kubernetes.
20+
21+
This comes at a time when there is a proliferation of "AI Gateways" being used
22+
for AI Inference, and a strong need for focus and collaboration to ensure
23+
standards around this space so that Kubernetes users get the features they need
24+
in a consistent way on the platform.
25+
26+
### In Scope
27+
28+
Overall guidance for the WG is to control scope as much as is feasible. The WG
29+
should avoid AI-specific functionality where it can, instead favoring the
30+
addition of provisions that help with AI networking and traffic management. In
31+
particular, the following is in scope:
32+
33+
* Providing definitions for networking related AI terms in a Kubernetes
34+
context.
35+
36+
* Defining important AI networking use-cases for Kubernetes users.
37+
38+
* Determining which common features and capabilities in the "AI Gateway" space
39+
need to be covered by Kubernetes standards and APIs according to user and
40+
implementation needs.
41+
42+
* Creating proposals for "AI Gateway" features and capabilities to the
43+
appropriate sub-projects.
44+
45+
* Proposing new sub-projects if existing sub-projects are not sufficient.
46+
47+
### Out of Scope
48+
49+
* Developing whole "AI Gateway" solutions. This group will focus on
50+
enabling existing and new solutions to be more easily deployed and managed on
51+
Kubernetes, not adding any new production solutions maintained thereafter by
52+
upstream Kubernetes.
53+
54+
* Any specific kind of hardware support is generally out of scope.
55+
56+
* This group will not cover the entire spectrum of networking for AI. For
57+
instance: RDMA networks are generally out of scope.
58+
59+
* Model serving and AI workloads are out of scope (see below for a caveat about
60+
this).
61+
62+
### Additional Scope Distinctions
63+
64+
There is a subtle distinction to be made when it comes to the scope of this WG
65+
for load-balancing and routing inference, particularly when dealing with inference
66+
_workloads_: When the use case includes local model serving on the cluster, and
67+
routing and load-balancing features _rely on information from the inference
68+
workloads_, this kind of routing falls under the scope of WG Serving.
69+
70+
A good example of this is the [Gateway API Inference Extension (GIE)][gie].
71+
This project came from WG Serving and specifically handles advanced routing and
72+
load-balancing for inference which is informed by metrics and capabilities being
73+
advertised by the model serving platform (e.g. vLLM). In this vein, the GIE is
74+
effectively an alternative to the Kubernetes `Service` API, whereas this WG
75+
means to operate more at the `Gateway` and `HTTPRoute` level.
76+
77+
Use cases which have to interact with the model serving layer for networking
78+
(as described above) are generally out of scope for this WG. If some feature
79+
the WG is working on absolutely must cross this line, the effort MUST be brought
80+
to WG Serving and worked on as a joint effort with them.
81+
82+
[gie]:https://github.com/kubernetes-sigs/gateway-api-inference-extension
83+
84+
## Deliverables
85+
86+
* A compendium of AI-related networking definitions (e.g. "AI Gateway") and
87+
key use-cases for Kubernetes users.
88+
89+
* Provide a space for collaboration and experimentation to determine the most
90+
viable features and capabilities that Kubernetes should support. If there is
91+
strong consensus on any particular ideas, the WG will facilitate and
92+
coordinate the delivery of proposals in the appropriate areas.
93+
94+
## Stakeholders
95+
96+
* SIG Network
97+
98+
### Related WGs
99+
100+
* WG Serving - The domain of WG Serving is AI Workloads, which can be served by
101+
some of the networking support we want to add. When we have proposals that
102+
are strongly relevant to serving, we will loop them in so they can provide
103+
feedback.
104+
105+
## Roles and Organization Management
106+
107+
This working group adheres to the Roles and Organization Management outlined in
108+
[wg-governance] and opts in to updates and modifications to [wg-governance].
109+
110+
[wg-governance]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md
111+
112+
## Exit Criteria
113+
114+
The WG is done when its deliverables are complete, according to the defined
115+
scope and a list of key use cases and features agreed upon by the group.
116+
117+
Ideally we want the lifecycle of the WG to go something like this:
118+
119+
1. Determine definitions and key use cases for Kubernetes users and
120+
implementations, and document those.
121+
2. Determine a list of key features that Kubernetes needs to best support the
122+
defined use cases.
123+
3. For each feature in that list, make proposals which support it to the
124+
appropriate sub-projects OR propose new sub-projects if deemed necessary.
125+
4. Once the feature list is complete, leave behind some guidance and best
126+
practices for future implementations and then exit.
