diff --git a/site-src/guides/index.md b/site-src/guides/index.md
index 393d0b816..2c025c940 100644
--- a/site-src/guides/index.md
+++ b/site-src/guides/index.md
@@ -16,25 +16,25 @@
 ```bash
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to the set of Llama models
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/gpu-deployment.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/gpu-deployment.yaml
 ```

 --8<-- "site-src/_includes/model-server-cpu.md"

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/cpu-deployment.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/cpu-deployment.yaml
 ```

 --8<-- "site-src/_includes/model-server-sim.md"

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/sim-deployment.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/sim-deployment.yaml
 ```

 ### Install the Inference Extension CRDs

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/v1-manifests.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.2/v1-manifests.yaml
 ```

 ### Deploy the InferencePool and Endpoint Picker Extension

@@ -44,7 +44,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 Set the chart version and then select a tab to follow the provider-specific instructions.

 ```bash
- export IGW_CHART_VERSION=v1.0.1
+ export IGW_CHART_VERSION=v1.0.2
 ```

 --8<-- "site-src/_includes/epp.md"
@@ -62,7 +62,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 2. Deploy Inference Gateway:

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/gateway.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -75,7 +75,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 3. Deploy the HTTPRoute

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/httproute.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/httproute.yaml
 ```

 4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -167,7 +167,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 4. Deploy the Gateway

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/gateway.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -180,7 +180,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 5. Deploy the HTTPRoute

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/httproute.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/httproute.yaml
 ```

 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -214,7 +214,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 4. Deploy the Gateway

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/gateway.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -227,7 +227,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 5. Deploy the HTTPRoute

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/httproute.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/httproute.yaml
 ```

 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -241,7 +241,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 Deploy the sample InferenceObjective which allows you to specify priority of requests.

 ```bash
- kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/inferenceobjective.yaml
+ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/inferenceobjective.yaml
 ```

 --8<-- "site-src/_includes/test.md"
@@ -257,17 +257,17 @@ Deploy the sample InferenceObjective which allows you to specify priority of req

 ```bash
 helm uninstall vllm-llama3-8b-instruct
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/inferenceobjective.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/inferenceobjective.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
 kubectl delete secret hf-token --ignore-not-found
 ```

 1. Uninstall the Gateway API Inference Extension CRDs

 ```bash
- kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/manifests.yaml --ignore-not-found
+ kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.2/manifests.yaml --ignore-not-found
 ```

 1. Choose one of the following options to cleanup the Inference Gateway.
@@ -275,18 +275,18 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
 === "GKE"

 ```bash
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
 ```

 === "Istio"

 ```bash
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/istio/destination-rule.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/istio/destination-rule.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to clean up ALL Istio resources that were created in this quickstart guide.
@@ -306,8 +306,8 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
 === "Kgateway"

 ```bash
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
@@ -333,8 +333,8 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
 === "Agentgateway"

 ```bash
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
- kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
+ kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.