* Add agentgateway as an implementation
* Add conformance report passing all tests with v0.5.1
* Add to the implementation list. To avoid picking where in the list to
  add it, I sorted the list alphabetically, mirroring GW API
* Update a reference to our ext_proc server implementation. This doesn't
  need updating for every release, but the old link pointed to a very
  rough implementation that has since had many fixes.
* Address review comments
README.md (1 addition & 1 deletion)
@@ -60,7 +60,7 @@ For deeper insights and more advanced concepts, refer to our [proposals](/docs/p
 ## Technical Overview

-This extension upgrades an [ext-proc](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter) capable proxy or gateway - such as Envoy Gateway, kGateway, or the GKE Gateway - to become an **[inference gateway]** - supporting inference platform teams self-hosting Generative Models (with a current focus on large language models) on Kubernetes. This integration makes it easy to expose and control access to your local [OpenAI-compatible chat completion endpoints](https://platform.openai.com/docs/api-reference/chat) to other workloads on or off cluster, or to integrate your self-hosted models alongside model-as-a-service providers in a higher level **AI Gateway** like LiteLLM, Solo AI Gateway, or Apigee.
+This extension upgrades an [ext-proc](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter) capable proxy or gateway - such as Envoy Gateway, kgateway, or the GKE Gateway - to become an **[inference gateway]** - supporting inference platform teams self-hosting Generative Models (with a current focus on large language models) on Kubernetes. This integration makes it easy to expose and control access to your local [OpenAI-compatible chat completion endpoints](https://platform.openai.com/docs/api-reference/chat) to other workloads on or off cluster, or to integrate your self-hosted models alongside model-as-a-service providers in a higher level **AI Gateway** like LiteLLM, Solo AI Gateway, or Apigee.
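The OpenAI-compatible endpoint described above can be exercised directly once a gateway fronts the model server. A minimal sketch, assuming a hypothetical gateway address and model name (substitute values from your own deployment):

```shell
# Hypothetical gateway address and model name; substitute your own.
GATEWAY_HOST=${GATEWAY_HOST:-localhost:8081}
MODEL=${MODEL:-my-model}

# Standard OpenAI-compatible chat completion request body.
BODY=$(printf '{"model": "%s", "messages": [{"role": "user", "content": "Hello"}]}' "$MODEL")

# Send it through the gateway (requires a reachable gateway, so the
# request itself is left commented out here):
# curl -s "http://${GATEWAY_HOST}/v1/chat/completions" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"
echo "$BODY"
```

Any ext-proc capable data plane in front of this endpoint can then inspect and steer these requests per model.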
site-src/guides/implementers.md (1 addition & 1 deletion)
@@ -141,7 +141,7 @@ Supporting this broad range of extension capabilities (including for inference,
 Several implementations can be used as references:

 - A fully featured [reference implementation](https://github.com/envoyproxy/envoy/tree/main/source/extensions/filters/http/ext_proc) (C++) can be found in the Envoy GitHub repository.
-- A second implementation (Rust, non-Envoy) is available in [Agent Gateway](https://github.com/agentgateway/agentgateway/blob/v0.5.2/crates/proxy/src/ext_proc.rs).
+- A second implementation (Rust, non-Envoy) is available in [agentgateway](https://github.com/agentgateway/agentgateway/blob/v0.7.2/crates/agentgateway/src/http/ext_proc.rs).
site-src/guides/index.md (69 additions & 0 deletions)
@@ -244,6 +244,53 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
     kubectl get httproute llm-route -o yaml
     ```
+=== "Agentgateway"
+
+    [Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for inference routing. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane.
+[Agentgateway](https://agentgateway.dev/) is an open source Gateway API implementation focusing on AI use cases, including LLM consumption, LLM serving, agent-to-agent ([A2A](https://a2aproject.github.io/A2A/latest/)), and agent-to-tool ([MCP](https://modelcontextprotocol.io/introduction)). It is the first and only proxy designed specifically for the Kubernetes Gateway API, powered by a high performance and scalable Rust dataplane implementation.
+
+Agentgateway comes with native support for Gateway API Inference Extension, powered by the [Kgateway](https://kgateway.dev/) control plane.
+
+## Alibaba Cloud Container Service for Kubernetes
+
+[Alibaba Cloud Container Service for Kubernetes (ACK)][ack] is a managed Kubernetes platform
+offered by Alibaba Cloud. The implementation of the Gateway API in ACK is through the
+[ACK Gateway with Inference Extension][ack-gie] component, which introduces model-aware,
+GPU-efficient load balancing for AI workloads beyond basic HTTP routing.
+
+The ACK Gateway with Inference Extension implements the Gateway API Inference Extension
+and provides optimized routing for serving generative AI workloads,
+including weighted traffic splitting, mirroring, advanced routing, etc.
+See the docs for the [usage][ack-gie-usage].
+
+Progress towards supporting Gateway API Inference Extension is being tracked
+by [this Issue](https://github.com/AliyunContainerService/ack-gateway-api/issues/1).
-gateway that can run [independently](https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_3), as an [Istio waypoint](https://kgateway.dev/blog/extend-istio-ambient-kgateway-waypoint/),
-or within your [llm-d infrastructure](https://github.com/llm-d-incubation/llm-d-infra) to improve accelerator (GPU)
-utilization for AI inference workloads.
 ## Google Kubernetes Engine

 [Google Kubernetes Engine (GKE)][gke] is a managed Kubernetes platform offered
@@ -66,21 +86,10 @@ For service mesh users, Istio also fully supports east-west (including [GAMMA](h
 Gateway API Inference Extension support is being tracked by this [GitHub
+gateway that can run [independently](https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_3), as an [Istio waypoint](https://kgateway.dev/blog/extend-istio-ambient-kgateway-waypoint/),
+or within your [llm-d infrastructure](https://github.com/llm-d-incubation/llm-d-infra) to improve accelerator (GPU)
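The quickstart diff above verifies the route with `kubectl get httproute llm-route -o yaml`; route acceptance can also be checked more directly with a jsonpath query. A minimal sketch, assuming the quickstart's `llm-route` name and a cluster reachable via kubectl:

```shell
# Route name from the quickstart; adjust for your environment.
ROUTE=llm-route
# jsonpath query for the Accepted condition reported by the gateway.
QUERY='{.status.parents[0].conditions[?(@.type=="Accepted")].status}'

if command -v kubectl >/dev/null 2>&1; then
  # Prints "True" when the gateway has accepted the route.
  kubectl get httproute "$ROUTE" -o jsonpath="$QUERY"
else
  echo "kubectl not found; skipping check"
fi
```

The same pattern works for any HTTPRoute; only the route name changes.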