diff --git a/config/manifests/gateway/kubvernor/gateway.yaml b/config/manifests/gateway/kubvernor/gateway.yaml
new file mode 100644
index 000000000..b37277d80
--- /dev/null
+++ b/config/manifests/gateway/kubvernor/gateway.yaml
@@ -0,0 +1,10 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: kubvernor-inference-gateway
+spec:
+  gatewayClassName: kubvernor-inference-gateway
+  listeners:
+  - name: http
+    port: 80
+    protocol: HTTP
diff --git a/config/manifests/gateway/kubvernor/httproute.yaml b/config/manifests/gateway/kubvernor/httproute.yaml
new file mode 100644
index 000000000..ef358f932
--- /dev/null
+++ b/config/manifests/gateway/kubvernor/httproute.yaml
@@ -0,0 +1,20 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  name: llm-route
+spec:
+  parentRefs:
+  - group: gateway.networking.k8s.io
+    kind: Gateway
+    name: kubvernor-inference-gateway
+  rules:
+  - backendRefs:
+    - group: inference.networking.x-k8s.io
+      kind: InferencePool
+      name: vllm-llama3-8b-instruct
+    matches:
+    - path:
+        type: PathPrefix
+        value: /
+    timeouts:
+      request: 300s
diff --git a/site-src/guides/index.md b/site-src/guides/index.md
index a1b10ed85..d7c9d30e4 100644
--- a/site-src/guides/index.md
+++ b/site-src/guides/index.md
@@ -244,6 +244,42 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
          kubectl get httproute llm-route -o yaml
          ```
 
+=== "Kubvernor Rust API Gateway"
+
+      [Kubvernor Rust API Gateway](https://github.com/kubvernor/kubvernor) is a highly experimental project and is not ready for production, but it supports v0.5.1 of the Inference Extension spec.
+
+      1. Requirements
+         - Rust and Cargo installed
+
+      2. Run Kubvernor Rust API Gateway as documented in the [README](https://github.com/kubvernor/kubvernor/blob/main/README.md)
+
+
+      3. Deploy the Gateway
+
+         ```bash
+         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kubvernor/gateway.yaml
+         ```
+
+         Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
+         ```bash
+         $ kubectl get gateway kubvernor-inference-gateway
+         NAME                          CLASS                         ADDRESS   PROGRAMMED   AGE
+         kubvernor-inference-gateway   kubvernor-inference-gateway             True         22s
+         ```
+
+      4. Deploy the HTTPRoute
+
+         ```bash
+         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kubvernor/httproute.yaml
+         ```
+
+      5. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+
+         ```bash
+         kubectl get httproute llm-route -o yaml
+         ```
+
+
 ### Try it out
 
    Wait until the gateway is ready.
diff --git a/site-src/implementations/gateways.md b/site-src/implementations/gateways.md
index 950c0833e..466445024 100644
--- a/site-src/implementations/gateways.md
+++ b/site-src/implementations/gateways.md
@@ -7,12 +7,14 @@ This project has several implementations that are planned or in progress:
 * [Google Kubernetes Engine][3]
 * [Istio][4]
 * [Alibaba Cloud Container Service for Kubernetes][5]
+* [Kubvernor Rust API Gateway][6]
 
 [1]:#envoy-gateway
 [2]:#kgateway
 [3]:#google-kubernetes-engine
 [4]:#istio
 [5]:#alibaba-cloud-container-service-for-kubernetes
+[6]:#kubvernor-rust-api-gateway
 
 ## Envoy AI Gateway
 
@@ -85,4 +87,11 @@ by [this Issue](https://github.com/AliyunContainerService/ack-gateway-api/issues
 
 [ack]:https://www.alibabacloud.com/help/en/ack
 [ack-gie]:https://www.alibabacloud.com/help/en/ack/product-overview/ack-gateway-with-inference-extension
-[ack-gie-usage]:https://www.alibabacloud.com/help/en/ack/ack-managed-and-ack-dedicated/user-guide/intelligent-routing-and-traffic-management-with-ack-gateway-inference-extension
\ No newline at end of file
+[ack-gie-usage]:https://www.alibabacloud.com/help/en/ack/ack-managed-and-ack-dedicated/user-guide/intelligent-routing-and-traffic-management-with-ack-gateway-inference-extension
+
+## Kubvernor Rust API Gateway
+[Kubvernor Rust API Gateway][krg] is an open-source, highly experimental implementation of an API controller in the Rust programming language. Currently, Kubvernor supports Envoy Proxy. The project aims to be as generic as possible so that Kubvernor can be used to manage and deploy different gateways (Envoy, Nginx, HAProxy, etc.). Kubvernor Rust API Gateway implements Inference Extension v0.5.1.
+
+[krg]:https://github.com/kubvernor/kubvernor
+
+