Commit 095ff1a

Formatting updates.
1 parent d2e274f commit 095ff1a

1 file changed: +5 -5 lines changed

docs/proposals/1374-mc-inference-gateways/README.md

Lines changed: 5 additions & 5 deletions
@@ -8,7 +8,7 @@ Author(s): @robscott, @bexxmodd
 
 ## Summary
 
-Inference Gateways aim to provide efficient routing to LLM workloads running in Kubernetes. In practice, an Inference Gateway is a Gateway that conforms to the [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/). This Gateway supports a new type of backend - InferencePool. When routing to an [InferencePool](https://gateway-api-inference-extension.sigs.k8s.io/api-types/inferencepool/), the Gateway calls out to an “Endpoint Picker” referenced by the InferencePool to get instructions on which specific endpoint within the pool it should reference.
+Inference Gateways aim to provide efficient routing to LLM workloads running in Kubernetes. In practice, an Inference Gateway is a Gateway that conforms to the [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/concepts/conformance/). This Gateway supports a new type of backend - InferencePool. When routing to an [InferencePool](https://gateway-api-inference-extension.sigs.k8s.io/api-types/inferencepool/), the Gateway calls out to an “Endpoint Picker” referenced by the InferencePool to get instructions on which specific endpoint within the pool it should route the request to.
 
 ![Inference Architecture](images/gw-epp-ip.png)
 
@@ -18,13 +18,13 @@ Until now, Inference Gateways have been focused exclusively on routing to a sing
 
 ### Goals
 
-Enable Inference Gateways to route to backends in multiple clusters
-Follow a pattern that is familiar to users of Multi-Cluster Services (MCS) and/or Gateways
+* Enable Inference Gateways to route to backends in multiple clusters.
+* Follow a pattern that is familiar to users of [Multi-Cluster Services (MCS)](https://multicluster.sigs.k8s.io/concepts/multicluster-services-api/) and/or Gateways.
 
 ### Non-Goals
 
-Be overly prescriptive about implementation details - this should focus on the resulting UX and leave significant flexibility in how it is achieved
-L4 ClusterIP routing and/or automatic DNS naming - all traffic needs to flow through the Inference Gateway for this pattern to be useful (otherwise the Endpoint Picker itself would be bypassed)
+* Be overly prescriptive about implementation details - this should focus on the resulting UX and leave significant flexibility in how it is achieved.
+* L4 ClusterIP routing and/or automatic DNS naming - all traffic needs to flow through the Inference Gateway for this pattern to be useful (otherwise the Endpoint Picker itself would be bypassed).
 
 ## Proposal
 
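
For readers of this diff who are unfamiliar with the flow described in the Summary, the following is a minimal sketch of the idea: the Gateway asks the Endpoint Picker referenced by an InferencePool which endpoint should receive a request. It is not the Inference Extension API. Every name here (`InferencePool` as a dataclass, `least_queued_picker`, `route`), the queue-depth heuristic, and the membership check are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration only; not the real InferencePool resource.
Picker = Callable[[List[str], dict], str]


@dataclass
class InferencePool:
    name: str
    endpoints: List[str]      # candidate model-server addresses in the pool
    endpoint_picker: Picker   # stands in for the Endpoint Picker the pool references


def least_queued_picker(queue_depths: Dict[str, int]) -> Picker:
    """Build a toy picker that prefers the endpoint with the shortest queue (assumed heuristic)."""
    def pick(endpoints: List[str], request: dict) -> str:
        return min(endpoints, key=lambda ep: queue_depths.get(ep, 0))
    return pick


def route(pool: InferencePool, request: dict) -> str:
    """Gateway side: ask the pool's picker for an endpoint, then check it belongs to the pool."""
    chosen = pool.endpoint_picker(pool.endpoints, request)
    if chosen not in pool.endpoints:
        raise ValueError(f"picker returned {chosen!r}, which is not in pool {pool.name!r}")
    return chosen


pool = InferencePool(
    name="llm-pool",
    endpoints=["10.0.0.1:8000", "10.0.0.2:8000"],
    endpoint_picker=least_queued_picker({"10.0.0.1:8000": 4, "10.0.0.2:8000": 1}),
)
print(route(pool, {"model": "example-model"}))  # prints 10.0.0.2:8000
```

The sketch also suggests why the Non-Goals exclude L4 ClusterIP routing: a client that connected to an endpoint directly would never pass through `route`, so the picker's decision would be skipped.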
