Commit 319a488

docs: add initial egress gateway proposal
Signed-off-by: Shane Utt <[email protected]>
1 parent 08965ed commit 319a488

1 file changed

+107
-0
lines changed

proposals/10-egress-gateways.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# Egress Gateways

* Authors: @shaneutt
* Status: Proposed

# What?

Provide standards in Kubernetes to route traffic outside of the cluster.

# Why?

Applications are increasingly utilizing inference as a part of their logic.
This may be for chatbots, knowledge bases, or a variety of other potential use
cases. The inference workloads to support this may not always be on the same
cluster as the requesting workload. Inference workloads may be on a separate
cluster, as the organization centralizes them, or they may be placed in a
specific location for reasons of regionality. Even then, not all organizations
are going to run inference workloads themselves, and some will utilize
third-party cloud services. All of this points to a need to provide standards
for how Kubernetes workloads reach these external inference sources, and to
provide the same AI Gateway security, control, and management capabilities that
are required for the ingress use case.

## User Stories

* As a gateway admin I need to provide workloads within my cluster access to
  services outside of my cluster, in particular cloud and otherwise hosted
  services.

* As a gateway admin I need to manage access tokens for third-party AI services
  so that workloads on the cluster can perform inference without needing to
  manage these secrets themselves, and so that I can manage access from all
  workloads in a uniform manner.

* As a gateway admin providing access and token management for third-party AI
  cloud services to workloads, I need fail-over from one cloud provider to
  others when the primary cloud provider is overwhelmed or in a failure state.

* As a cluster admin I need to provide inference to workloads on my cluster,
  but I provide a dedicated cluster for this so that I can manage it
  separately.

* As a cluster admin I need to provide inference to workloads on my cluster,
  but I do not run AI workloads on Kubernetes. I use a cloud service to run
  models (e.g. Vertex, Bedrock) and need workloads to have managed access to
  that service to perform inference.

* As a developer of an application that requires inference as part of its
  function, I need my application to have access to external AI cloud services
  which offer specific, specialized features only offered by that provider.

* As a developer of an application that requires inference as part of its
  function, I need fail-over to third-party providers if local AI workloads are
  overwhelmed or in a failure state.

* As a platform operator I need to attribute outbound traffic per namespace or
  workload to enforce rate or API utilization limits.

* As a compliance engineer I need to guarantee that outbound traffic to
  third-party AI resources obeys regulatory restrictions such as region locks.
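
To make a few of these stories concrete, the following is a purely illustrative sketch of the behavior they describe: gateway-held tokens, provider fail-over, per-namespace attribution, and region locks. It is not a proposed design or API. Every name in it (`Provider`, `ProviderUnavailable`, `route_egress`, the `send` callback, the region strings) is a hypothetical stand-in chosen for this example, and the transport is faked so the snippet runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Provider:
    """Hypothetical external inference backend known to the gateway."""
    name: str
    region: str
    token: str  # credential held by the gateway, never by the workload


class ProviderUnavailable(Exception):
    """Raised by a transport when a provider is overwhelmed or failing."""


def route_egress(
    namespace: str,
    prompt: str,
    providers: list[Provider],
    allowed_regions: set[str],
    send: Callable[[Provider, dict], str],
    usage: dict[str, int],
) -> str:
    """Try providers in priority order, skipping disallowed regions."""
    # Attribute the outbound request to the calling namespace.
    usage[namespace] = usage.get(namespace, 0) + 1
    last_error: Optional[Exception] = None
    for provider in providers:
        if provider.region not in allowed_regions:
            continue  # region lock: never send traffic to this provider
        request = {
            # The gateway injects the token on behalf of the workload.
            "headers": {"Authorization": f"Bearer {provider.token}"},
            "body": prompt,
        }
        try:
            return send(provider, request)
        except ProviderUnavailable as err:
            last_error = err  # fail over to the next eligible provider
    raise RuntimeError("no eligible provider succeeded") from last_error


def fake_send(provider: Provider, request: dict) -> str:
    """Stand-in transport: the primary provider is always overloaded."""
    if provider.name == "primary":
        raise ProviderUnavailable("primary is overloaded")
    return f"{provider.name} handled: {request['body']}"


providers = [
    Provider("primary", "eu-west", "token-a"),
    Provider("blocked", "us-east", "token-b"),
    Provider("secondary", "eu-central", "token-c"),
]
usage: dict[str, int] = {}
result = route_egress(
    "team-a", "hello", providers, {"eu-west", "eu-central"}, fake_send, usage
)
```

Here the request from the hypothetical `team-a` namespace is first attempted against `primary`, skips `blocked` because its region is not allowed, and is served by `secondary`. The workload never sees any token. How such behavior would actually be expressed in Kubernetes resources is left to the "How?" section.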

## Goals

* Define the standards for Gateways that route and manage traffic destined for
  external resources outside of the cluster.
* Define (or refine) the standards by which token management for Gateways can
  be employed to enable access to backends that require auth.
* Foundationally, the standards for egress Gateways should be based on
  standards-based networking first, layering up to inference and agentic use
  cases.

# How?

TODO: in later PRs.

> **This should be left blank until the "What?" and "Why?" are agreed upon,
> as defining "How?" the goals are accomplished is not important unless we can
> first even agree on what the problem is, and why we want to solve it.**
>
> This section is fairly freeform, because (again) these proposals will
> eventually find their way into any number of different final proposal formats
> in other projects. However, the general guidance is to break things down into
> highly focused sections as much as possible to help make things easier to
> read and review. Long, unbroken walls of code and YAML in this document are
> not advisable as that may increase the time it takes to review.

# Additional Criteria

The following are things we need to resolve before we can consider this
proposal complete and ready to move out to other areas.

- [ ] We need to decide how the multi-cluster aspect of egress gateways
  interacts with the [GIE's multi-cluster proposal], if at all. This may end up
  with multiple different multi-cluster options for users, so we'll need to be
  clear about why there are multiple options, and what one solves over the
  other. SIG MC needs to be a part of this conversation.
- [ ] The Agentic Networking Subproject has a [proposal for external MCP/A2A]
  services, making them a stakeholder for egress gateways as well. We need to
  work with them to incorporate their user stories and requirements so that
  what we ultimately ship covers the combined use cases.

[GIE's multi-cluster proposal]: https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/1374-multi-cluster-inference
[proposal for external MCP/A2A]: https://docs.google.com/document/d/17kA-78gq25BgS2ElHMCd-zy__9clVL-GZQcHCm52854/edit?tab=t.0

# Relevant Links

* [Istio's implementation of Egress Gateways](https://istio.io/latest/docs/tasks/traffic-management/egress/egress-gateway/)
