# Egress Gateways

* Authors: @shaneutt
* Status: Proposed

# What?

Provide standards in Kubernetes for routing traffic to destinations outside of
the cluster.

# Why?

Applications are increasingly utilizing inference as a part of their logic.
This may be for chatbots, knowledge bases, or a variety of other use cases.
The inference workloads that support this may not always run on the same
cluster as the requesting workload: they may live on a separate cluster
because the organization centralizes them, or because they need to be placed
in a particular region. Furthermore, not all organizations run inference
workloads themselves; many rely on 3rd party cloud services instead. All of
this points to a need for standards for how Kubernetes workloads reach these
external inference sources, with the same AI Gateway security, control, and
management capabilities that are required for the ingress use case.

## User Stories

* As a cluster admin I need to provide inference to workloads on my cluster,
  but I run the inference workloads on a dedicated cluster so that I can
  manage them separately.

* As a cluster admin I need to provide inference to workloads on my cluster,
  but I do not run AI workloads on Kubernetes. I use a cloud service to run
  models (e.g. Vertex, Bedrock) and need workloads to have managed access to
  that service to perform inference.

* As a gateway admin I need to manage access tokens for 3rd party AI services
  so that workloads on the cluster can perform inference without needing to
  manage these secrets themselves, and so that I can manage access from all
  workloads in a uniform manner.

* As a developer of an application that requires inference as part of its
  function, I need my application to have access to external AI cloud services
  which offer specialized features only available from that provider.

* As a developer of an application that requires inference as part of its
  function, I need failover to 3rd party providers if local AI workloads are
  overwhelmed or in a failure state.

## Goals

* Define the standards for Gateways that route and manage traffic destined for
  resources outside of the cluster (illustrated by the sketch below).
* Define (or refine) the standards by which Gateways can manage tokens to
  enable access to backends that require authentication.
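
One way the first goal can be approximated today is an `HTTPRoute` whose
backend is a Service of type `ExternalName` pointing at the external endpoint.
This is only a sketch, not a proposed design: the resource names, namespace,
and hostname are hypothetical, a Gateway reserved for outbound traffic is
assumed, and support for `ExternalName` backends varies by implementation,
which is part of the gap this proposal is meant to address.

```yaml
# Illustrative only: names, namespace, and hostname are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: external-models
  namespace: ai-apps
spec:
  # No in-cluster endpoints; the Service aliases the external provider.
  type: ExternalName
  externalName: models.example.com
  ports:
    - port: 443
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: inference-egress
  namespace: ai-apps
spec:
  parentRefs:
    - name: egress-gateway   # assumed: a Gateway dedicated to outbound traffic
  hostnames:
    - "models.example.com"
  rules:
    - backendRefs:
        - name: external-models
          port: 443
```

The standards contemplated here would define this behavior consistently across
implementations, along with the token management needed for authenticated
backends.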

# How?

TODO: in later PRs.

> **This should be left blank until the "What?" and "Why?" are agreed upon,
> as defining "How?" the goals are accomplished is not important unless we can
> first agree on what the problem is, and why we want to solve it.**
>
> This section is fairly freeform, because (again) these proposals will
> eventually find their way into any number of different final proposal formats
> in other projects. However, the general guidance is to break things down into
> highly focused sections as much as possible to help make things easier to
> read and review. Long, unbroken walls of code and YAML in this document are
> not advisable as that may increase the time it takes to review.

# Relevant Links

* [Istio's implementation of Egress Gateways](https://istio.io/latest/docs/tasks/traffic-management/egress/egress-gateway/)
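
For context on existing practice, the Istio task linked above first registers
the external host with a `ServiceEntry`, and then routes traffic out through a
dedicated egress gateway pod using Istio's own `Gateway` and `VirtualService`
resources. A minimal sketch of that first step follows; the hostname is
illustrative, not taken from the Istio documentation.

```yaml
# Istio ServiceEntry declaring an external inference host (hostname is
# illustrative). The full Istio task also adds a Gateway and VirtualService
# to send this traffic through a dedicated egress gateway.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-inference
spec:
  hosts:
    - models.example.com
  ports:
    - number: 443
      name: tls
      protocol: TLS
  resolution: DNS
  location: MESH_EXTERNAL
```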