| 1 | +# WG AI Gateway Charter |
| 2 | + |
| 3 | +This charter adheres to the conventions described in the [Kubernetes Charter |
| 4 | +README] and uses the Roles and Organization Management outlined in |
| 5 | +[wg-governance]. |
| 6 | + |
| 7 | +[wg-governance]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md |
| 8 | +[Kubernetes Charter README]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md |
| 9 | + |
| 10 | +## Scope |
| 11 | + |
| 12 | +The AI Gateway Working Group focuses on load-balancing, routing and related |
| 13 | +features that support networking for AI use cases. It also focuses on policies, |
| 14 | +filters, and extensions that support AI traffic management. |
| 15 | + |
| 16 | +This working group will define terms like "AI Gateway" within the context of |
| 17 | +Kubernetes and key use cases for users and implementations. It will propose |
| 18 | +deliverables that need to be adopted in order to manage traffic for AI Inference |
| 19 | +on Kubernetes. |
| 20 | + |
| 21 | +This comes at a time when there is a proliferation of "AI Gateways" being used |
| 22 | +for AI Inference, and a strong need for focus and collaboration to ensure |
| 23 | +standards around this space so that Kubernetes users get the features they need |
| 24 | +in a consistent way on the platform. |
| 25 | + |
| 26 | +### In Scope |
| 27 | + |
| 28 | +Overall guidance for the WG is to control scope as much as is feasible. The WG |
| 29 | +should avoid AI-specific functionality where it can, instead favoring the |
| 30 | +addition of provisions that help with AI networking and traffic management. In |
| 31 | +particular, the following is in scope: |
| 32 | + |
| 33 | +* Providing definitions for networking-related AI terms in a Kubernetes |
| 34 | + context. |
| 35 | + |
| 36 | +* Defining important AI networking use cases for Kubernetes users. |
| 37 | + |
| 38 | +* Determining which common features and capabilities in the "AI Gateway" space |
| 39 | + need to be covered by Kubernetes standards and APIs according to user and |
| 40 | + implementation needs. |
| 41 | + |
| 42 | +* Creating proposals for "AI Gateway" features and capabilities to the |
| 43 | + appropriate sub-projects. |
| 44 | + |
| 45 | +* Proposing new sub-projects if existing sub-projects are not sufficient. |
| 46 | + |
| 47 | +### Out of Scope |
| 48 | + |
| 49 | +* Developing whole "AI Gateway" solutions. This group will focus on |
| 50 | + enabling existing and new solutions to be more easily deployed and managed on |
| 51 | + Kubernetes, not adding any new production solutions maintained thereafter by |
| 52 | + upstream Kubernetes. |
| 53 | + |
| 54 | +* Any specific kind of hardware support is generally out of scope. |
| 55 | + |
| 56 | +* This group will not cover the entire spectrum of networking for AI. For |
| 57 | + instance, RDMA networks are generally out of scope. |
| 58 | + |
| 59 | +* Model serving and AI workloads are out of scope (see below for a caveat about |
| 60 | + this). |
| 61 | + |
| 62 | +### Additional Scope Distinctions |
| 63 | + |
| 64 | +There is a subtle distinction to be made when it comes to the scope of this WG |
| 65 | +for load-balancing and routing inference, particularly when dealing with inference |
| 66 | +_workloads_: When the use case includes local model serving on the cluster, and |
| 67 | +routing and load-balancing features _rely on information from the inference |
| 68 | +workloads_, this kind of routing falls under the scope of WG Serving. |
| 69 | + |
| 70 | +A good example of this is the [Gateway API Inference Extension (GIE)][gie]. |
| 71 | +This project came from WG Serving and specifically handles advanced routing and |
| 72 | +load-balancing for inference which is informed by metrics and capabilities being |
| 73 | +advertised by the model serving platform (e.g. vLLM). In this vein, the GIE is |
| 74 | +effectively an alternative to the Kubernetes `Service` API, whereas this WG |
| 75 | +means to operate more at the `Gateway` and `HTTPRoute` level. |
| 76 | + |
| 77 | +Use cases which have to interact with the model serving layer for networking |
| 78 | +(as described above) are generally out of scope for this WG. If some feature |
| 79 | +the WG is working on absolutely must cross this line, the effort MUST be brought |
| 80 | +to WG Serving and worked on as a joint effort with them. |
| 81 | + |
| 82 | +[gie]:https://github.com/kubernetes-sigs/gateway-api-inference-extension |
| 83 | + |
| 84 | +## Deliverables |
| 85 | + |
| 86 | +* A compendium of AI-related networking definitions (e.g. "AI Gateway") and |
| 87 | + key use cases for Kubernetes users. |
| 88 | + |
| 89 | +* A space for collaboration and experimentation to determine the most |
| 90 | + viable features and capabilities that Kubernetes should support. If there is |
| 91 | + strong consensus on any particular ideas, the WG will facilitate and |
| 92 | + coordinate the delivery of proposals in the appropriate areas. |
| 93 | + |
| 94 | +## Stakeholders |
| 95 | + |
| 96 | +* SIG Network |
| 97 | + |
| 98 | +### Related WGs |
| 99 | + |
| 100 | +* WG Serving - The domain of WG Serving is AI Workloads, which can be served by |
| 101 | + some of the networking support we want to add. When we have proposals that |
| 102 | + are strongly relevant to serving, we will loop them in so they can provide |
| 103 | + feedback. |
| 104 | + |
| 105 | +## Roles and Organization Management |
| 106 | + |
| 107 | +This working group adheres to the Roles and Organization Management outlined in |
| 108 | +[wg-governance] and opts in to updates and modifications to [wg-governance]. |
| 109 | + |
| 110 | +[wg-governance]:https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md |
| 111 | + |
| 112 | +## Exit Criteria |
| 113 | + |
| 114 | +The WG is done when its deliverables are complete, according to the defined |
| 115 | +scope and a list of key use cases and features agreed upon by the group. |
| 116 | + |
| 117 | +Ideally we want the lifecycle of the WG to go something like this: |
| 118 | + |
| 119 | +1. Determine definitions and key use cases for Kubernetes users and |
| 120 | + implementations, and document those. |
| 121 | +2. Determine a list of key features that Kubernetes needs to best support the |
| 122 | + defined use cases. |
| 123 | +3. For each feature in that list, make proposals that support it to the |
| 124 | + appropriate sub-projects OR propose new sub-projects if deemed necessary. |
| 125 | +4. Once the feature list is complete, leave behind some guidance and best |
| 126 | + practices for future implementations and then exit. |