docs/proposals/1374-mc-inference-gateways/README.md
Lines changed: 17 additions & 16 deletions
@@ -88,14 +88,14 @@ In the future, a more advanced implementation could allow Endpoint Pickers to p
 
 **Pros**:
 
-- Reuses existing MCS model
-- Simplest possible API model
-- “Export” configuration lives on InferencePool and clearly applies to the entire pool, not just EPP
-- Can clearly reference an InferencePool in other clusters without having one locally
+* Reuses existing MCS model
+* Simplest possible API model
+* “Export” configuration lives on InferencePool and clearly applies to the entire pool, not just EPP
+* Can clearly reference an InferencePool in other clusters without having one locally
 
 **Cons**:
 
-- Does not reuse MCS API (unclear if this is a con)
+* Does not reuse MCS API (unclear if this is a con)
 
 ## Alternative 1: MCS API for EPP
 
@@ -105,15 +105,16 @@ If we lean into the idea that the only thing a Gateway needs to know is the Endp
 
 **Pros**:
 
-- Reuses existing MCS infrastructure.
-- Likely relatively simple to implement.
+* Reuses existing MCS infrastructure.
+* Likely relatively simple to implement.
 
 **Cons**:
 
-- Referencing InferencePools in other clusters requires you to create an InferencePool locally.
-- Significantly more complex configuration (more YAML at least).
-- "FailOpen" mode becomes ~impossible if implementations don't actually have some model server endpoints to fall back to.
-- In this model, you don’t actually choose to export an InferencePool, you export the Endpoint Picker, that could lead to significant confusion.
+* Referencing InferencePools in other clusters requires you to create an InferencePool locally.
+* Significantly more complex configuration (more YAML at least).
+* "FailOpen" mode becomes ~impossible if implementations don't actually have some model server endpoints to fall back to.
+* In this model, you don’t actually choose to export an InferencePool, you export the Endpoint Picker, that could lead to significant confusion.
+* InferencePool is meant to be a replacement for a Service so it may seem counterintuitive for a user to create a Service to achieve multi-cluster inference.
 
 ## Alternative 2: New MCS API
 
@@ -155,11 +156,11 @@ Can we find a way to configure preferences for where a request should be routed?
 
 ### Prior Art
 
-- [GEP-1748: Gateway API Interaction with Multi-Cluster Services](https://gateway-api.sigs.k8s.io/geps/gep-1748/)
-- [Envoy Gateway with Multi-Cluster Services](https://gateway.envoyproxy.io/latest/tasks/traffic/multicluster-service/)
-- [Multicluster Service API](https://multicluster.sigs.k8s.io/concepts/multicluster-services-api/)
-- [Submariner](https://submariner.io/)
+* [GEP-1748: Gateway API Interaction with Multi-Cluster Services](https://gateway-api.sigs.k8s.io/geps/gep-1748/)
+* [Envoy Gateway with Multi-Cluster Services](https://gateway.envoyproxy.io/latest/tasks/traffic/multicluster-service/)
+* [Multicluster Service API](https://multicluster.sigs.k8s.io/concepts/multicluster-services-api/)
+* [Submariner](https://submariner.io/)
 
 ### References
 
-- [Original Doc for MultiCluster Inference Gateway](https://docs.google.com/document/d/1QGvG9ToaJ72vlCBdJe--hmrmLtgOV_ptJi9D58QMD2w/edit?tab=t.0#heading=h.q6xiq2fzcaia)
+* [Original Doc for MultiCluster Inference Gateway](https://docs.google.com/document/d/1QGvG9ToaJ72vlCBdJe--hmrmLtgOV_ptJi9D58QMD2w/edit?tab=t.0#heading=h.q6xiq2fzcaia)