Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 28 additions & 28 deletions site-src/guides/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,25 +16,25 @@

```bash
kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to the set of Llama models
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/gpu-deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/gpu-deployment.yaml
```

--8<-- "site-src/_includes/model-server-cpu.md"

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/cpu-deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/cpu-deployment.yaml
```

--8<-- "site-src/_includes/model-server-sim.md"

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/sim-deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/sim-deployment.yaml
```

### Install the Inference Extension CRDs

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/v1-manifests.yaml
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.2/v1-manifests.yaml
```

### Deploy the InferencePool and Endpoint Picker Extension
Expand All @@ -44,7 +44,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
Set the chart version and then select a tab to follow the provider-specific instructions.

```bash
export IGW_CHART_VERSION=v1.0.1
export IGW_CHART_VERSION=v1.0.2
```

--8<-- "site-src/_includes/epp.md"
Expand All @@ -62,7 +62,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
2. Deploy Inference Gateway:

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/gateway.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/gateway.yaml
```

Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
Expand All @@ -75,7 +75,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
3. Deploy the HTTPRoute

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/httproute.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/httproute.yaml
```

4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
Expand Down Expand Up @@ -167,7 +167,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
4. Deploy the Gateway

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/gateway.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/gateway.yaml
```

Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
Expand All @@ -180,7 +180,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
5. Deploy the HTTPRoute

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/httproute.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/httproute.yaml
```

6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
Expand Down Expand Up @@ -214,7 +214,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
4. Deploy the Gateway

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/gateway.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/gateway.yaml
```

Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
Expand All @@ -227,7 +227,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
5. Deploy the HTTPRoute

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/httproute.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/httproute.yaml
```

6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
Expand All @@ -241,7 +241,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
Deploy the sample InferenceObjective which allows you to specify priority of requests.

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/inferenceobjective.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/inferenceobjective.yaml
```

--8<-- "site-src/_includes/test.md"
Expand All @@ -257,36 +257,36 @@ Deploy the sample InferenceObjective which allows you to specify priority of req

```bash
helm uninstall vllm-llama3-8b-instruct
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/inferenceobjective.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/inferenceobjective.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
kubectl delete secret hf-token --ignore-not-found
```

1. Uninstall the Gateway API Inference Extension CRDs

```bash
kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/manifests.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.2/manifests.yaml --ignore-not-found
```

1. Choose one of the following options to cleanup the Inference Gateway.

=== "GKE"

```bash
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
```

=== "Istio"

```bash
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/istio/destination-rule.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/istio/destination-rule.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
```

The following steps assume you would like to clean up ALL Istio resources that were created in this quickstart guide.
Expand All @@ -306,8 +306,8 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
=== "Kgateway"

```bash
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
```

The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
Expand All @@ -333,8 +333,8 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
=== "Agentgateway"

```bash
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.1/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
```

The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
Expand Down