From 6e543430ebb127e8e0f257d780405fa8e359f3cd Mon Sep 17 00:00:00 2001
From: Saylor Berman
Date: Mon, 25 Aug 2025 11:09:26 -0600
Subject: [PATCH 1/2] Provisional: Proposal for Inference Extension

Adding the provisional proposal for supporting the Gateway API Inference Extension. Also migrated some archived design docs.
---
 .../connect-response.png                      | Bin
 .../control-data-plane-separation/connect.png | Bin
 .../deployment-architecture.png               | Bin
 .../control-data-plane-separation/design.md   |   0
 .../download-config.png                       | Bin
 docs/proposals/gateway-inference-extension.md |  19 ++++++++++++++++++
 6 files changed, 19 insertions(+)
 rename design/{ => archive}/control-data-plane-separation/connect-response.png (100%)
 rename design/{ => archive}/control-data-plane-separation/connect.png (100%)
 rename design/{ => archive}/control-data-plane-separation/deployment-architecture.png (100%)
 rename design/{ => archive}/control-data-plane-separation/design.md (100%)
 rename design/{ => archive}/control-data-plane-separation/download-config.png (100%)
 create mode 100644 docs/proposals/gateway-inference-extension.md

diff --git a/design/control-data-plane-separation/connect-response.png b/design/archive/control-data-plane-separation/connect-response.png
similarity index 100%
rename from design/control-data-plane-separation/connect-response.png
rename to design/archive/control-data-plane-separation/connect-response.png
diff --git a/design/control-data-plane-separation/connect.png b/design/archive/control-data-plane-separation/connect.png
similarity index 100%
rename from design/control-data-plane-separation/connect.png
rename to design/archive/control-data-plane-separation/connect.png
diff --git a/design/control-data-plane-separation/deployment-architecture.png b/design/archive/control-data-plane-separation/deployment-architecture.png
similarity index 100%
rename from design/control-data-plane-separation/deployment-architecture.png
rename to design/archive/control-data-plane-separation/deployment-architecture.png
diff --git a/design/control-data-plane-separation/design.md b/design/archive/control-data-plane-separation/design.md
similarity index 100%
rename from design/control-data-plane-separation/design.md
rename to design/archive/control-data-plane-separation/design.md
diff --git a/design/control-data-plane-separation/download-config.png b/design/archive/control-data-plane-separation/download-config.png
similarity index 100%
rename from design/control-data-plane-separation/download-config.png
rename to design/archive/control-data-plane-separation/download-config.png
diff --git a/docs/proposals/gateway-inference-extension.md b/docs/proposals/gateway-inference-extension.md
new file mode 100644
index 0000000000..b180038676
--- /dev/null
+++ b/docs/proposals/gateway-inference-extension.md
@@ -0,0 +1,19 @@
+# Enhancement Proposal-3716: Gateway API Inference Extension
+
+- Issue: https://github.com/nginx/nginx-gateway-fabric/issues/3716
+- Status: Provisional
+
+## Summary
+
+Enable NGINX Gateway Fabric to support the [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/), allowing dynamic routing to AI workloads.
+
+## Goals
+
+- Determine which resources (e.g. InferencePool) NGF needs to watch, and what configuration should be built based upon this.
+- Define the process in which NGF should integrate with the [Endpoint Picker](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/pkg/epp) (EPP).
+- Determine what NGINX needs to do in order to forward incoming traffic to an AI workload.
+
+## Non-Goals
+
+- Define new APIs.
+- Determine how to integrate with AI Gateway (future).

From 78d3cd287a0c14d445283c388e6b9a9551f8e6ff Mon Sep 17 00:00:00 2001
From: Saylor Berman
Date: Mon, 25 Aug 2025 14:10:12 -0600
Subject: [PATCH 2/2] Add a bit more context

---
 docs/proposals/gateway-inference-extension.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/proposals/gateway-inference-extension.md b/docs/proposals/gateway-inference-extension.md
index b180038676..d3869a97f8 100644
--- a/docs/proposals/gateway-inference-extension.md
+++ b/docs/proposals/gateway-inference-extension.md
@@ -5,10 +5,11 @@
 
 ## Summary
 
-Enable NGINX Gateway Fabric to support the [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/), allowing dynamic routing to AI workloads.
+Enable NGINX Gateway Fabric to support the [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/), allowing dynamic routing to AI workloads. The goal for now is a basic implementation that meets the core functionality based on the API spec. There are likely many enhancements and improvements that can be made to this, but those should be considered after feedback around the usage and worth of this feature.
 
 ## Goals
 
+- Define and implement the basic implementation to meet the API's core specifications.
 - Determine which resources (e.g. InferencePool) NGF needs to watch, and what configuration should be built based upon this.
 - Define the process in which NGF should integrate with the [Endpoint Picker](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/pkg/epp) (EPP).
 - Determine what NGINX needs to do in order to forward incoming traffic to an AI workload.
@@ -16,4 +17,5 @@ Enable NGINX Gateway Fabric to support the [Gateway API Inference Extension](htt
 ## Non-Goals
 
 - Define new APIs.
-- Determine how to integrate with AI Gateway (future).
+- Determine how to integrate with AI Gateway.
+- Any functionality beyond the core API specification.