# Egress Gateways

* Authors: @shaneutt
* Status: Proposed

# What?

Provide standards in Kubernetes for routing traffic to destinations outside of
the cluster.

# Why?

Applications are increasingly utilizing inference as a part of their logic.
This may be for chatbots, knowledge bases, or a variety of other use cases.
The inference workloads that support this may not always run on the same
cluster as the requesting workload: they may live on a separate cluster
because the organization centralizes them, or because they need to be placed
in a particular region. Furthermore, not all organizations run inference
workloads themselves; many rely on 3rd party cloud services instead. All of
this points to a need for standards for how Kubernetes workloads reach these
external inference sources, with the same AI Gateway security, control, and
management capabilities that are required for the ingress use case.

## User Stories

* As a cluster admin I need to provide inference to workloads on my cluster,
  but I run the inference workloads on a dedicated cluster so that I can
  manage them separately.

* As a cluster admin I need to provide inference to workloads on my cluster,
  but I do not run AI workloads on Kubernetes. I use a cloud service to run
  models (e.g. Vertex, Bedrock) and need workloads to have managed access to
  that service to perform inference.

* As a gateway admin I need to manage access tokens for 3rd party AI services
  so that workloads on the cluster can perform inference without needing to
  manage these secrets themselves, and so that I can manage access from all
  workloads in a uniform manner.

* As a developer of an application that requires inference as part of its
  function, I need my application to have access to external AI cloud services
  which offer specialized features only available from that provider.

* As a developer of an application that requires inference as part of its
  function, I need failover to 3rd party providers if local AI workloads are
  overwhelmed or in a failure state.

## Goals

* Define the standards for Gateways that route and manage traffic destined for
  resources outside of the cluster (illustrated by the sketch below).
* Define (or refine) the standards by which Gateways can manage tokens to
  enable access to backends that require authentication.
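
One way the first goal can be approximated today is an `HTTPRoute` whose
backend is a Service of type `ExternalName` pointing at the external endpoint.
This is only a sketch, not a proposed design: the resource names, namespace,
and hostname are hypothetical, a Gateway reserved for outbound traffic is
assumed, and support for `ExternalName` backends varies by implementation,
which is part of the gap this proposal is meant to address.

```yaml
# Illustrative only: names, namespace, and hostname are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: external-models
  namespace: ai-apps
spec:
  # No in-cluster endpoints; the Service aliases the external provider.
  type: ExternalName
  externalName: models.example.com
  ports:
    - port: 443
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: inference-egress
  namespace: ai-apps
spec:
  parentRefs:
    - name: egress-gateway   # assumed: a Gateway dedicated to outbound traffic
  hostnames:
    - "models.example.com"
  rules:
    - backendRefs:
        - name: external-models
          port: 443
```

The standards contemplated here would define this behavior consistently across
implementations, along with the token management needed for authenticated
backends.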

# How?

TODO: in later PRs.

> **This should be left blank until the "What?" and "Why?" are agreed upon,
> as defining "How?" the goals are accomplished is not important unless we can
> first agree on what the problem is, and why we want to solve it.**
>
> This section is fairly freeform, because (again) these proposals will
> eventually find their way into any number of different final proposal formats
> in other projects. However, the general guidance is to break things down into
> highly focused sections as much as possible to help make things easier to
> read and review. Long, unbroken walls of code and YAML in this document are
> not advisable as that may increase the time it takes to review.

# Relevant Links

* [Istio's implementation of Egress Gateways](https://istio.io/latest/docs/tasks/traffic-management/egress/egress-gateway/)
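
For context on existing practice, the Istio task linked above first registers
the external host with a `ServiceEntry`, and then routes traffic out through a
dedicated egress gateway pod using Istio's own `Gateway` and `VirtualService`
resources. A minimal sketch of that first step follows; the hostname is
illustrative, not taken from the Istio documentation.

```yaml
# Istio ServiceEntry declaring an external inference host (hostname is
# illustrative). The full Istio task also adds a Gateway and VirtualService
# to send this traffic through a dedicated egress gateway.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-inference
spec:
  hosts:
    - models.example.com
  ports:
    - number: 443
      name: tls
      protocol: TLS
  resolution: DNS
  location: MESH_EXTERNAL
```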