Commit 7d184bb

Provisional: Proposal for Inference Extension (#3789)
Adding the provisional proposal for supporting the Gateway API Inference Extension. Also migrated some archived design docs.
1 parent ab9c023 commit 7d184bb

6 files changed: 21 additions, 0 deletions

File renamed without changes.
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# Enhancement Proposal-3716: Gateway API Inference Extension
+
+- Issue: https://github.com/nginx/nginx-gateway-fabric/issues/3716
+- Status: Provisional
+
+## Summary
+
+Enable NGINX Gateway Fabric to support the [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/), allowing dynamic routing to AI workloads. The goal for now is a basic implementation that meets the core functionality based on the API spec. There are likely many enhancements and improvements that can be made to this, but those should be considered after feedback around the usage and worth of this feature.
+
+## Goals
+
+- Define and implement the basic implementation to meet the API's core specifications.
+- Determine which resources (e.g. InferencePool) NGF needs to watch, and what configuration should be built based upon this.
+- Define the process in which NGF should integrate with the [Endpoint Picker](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/pkg/epp) (EPP).
+- Determine what NGINX needs to do in order to forward incoming traffic to an AI workload.
+
+## Non-Goals
+
+- Define new APIs.
+- Determine how to integrate with AI Gateway.
+- Any functionality beyond the core API specification.