
Commit 5aca2ea

Update guide to add steps to deploy healthcheck policy for gke
1 parent 3846265 commit 5aca2ea

1 file changed (+27 -16 lines)

site-src/guides/index.md

Lines changed: 27 additions & 16 deletions
@@ -80,6 +80,21 @@ A cluster with:
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
 ```

+### Deploy the InferencePool and Endpoint Picker Extension
+
+To install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` listening on port 8000, run the following command:
+
+```bash
+export GATEWAY_PROVIDER=none # See [README](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/charts/inferencepool/README.md#configuration) for valid configurations
+helm install vllm-llama3-8b-instruct \
+--set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+--set provider.name=$GATEWAY_PROVIDER \
+--version v0.3.0 \
+oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+```
+
+The Helm install automatically installs the endpoint picker and InferencePool, along with provider-specific resources.
+
 ### Deploy an Inference Gateway

 Choose one of the following options to deploy an Inference Gateway.
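Not part of the commit itself, but a quick way to sanity-check the new Helm-based install step; the release and resource names below are taken from the command in the diff, and the current namespace is assumed:

```bash
# Check the Helm release created by the install step above (current namespace assumed)
helm status vllm-llama3-8b-instruct

# The InferencePool CRD is provided by the manifests.yaml applied earlier in the guide
kubectl get inferencepool vllm-llama3-8b-instruct
```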
@@ -113,6 +128,18 @@ A cluster with:
 ```bash
 kubectl get httproute llm-route -o yaml
 ```
+
+5. Deploy the HealthCheckPolicy
+
+```bash
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/healthcheck.yaml
+```
+
+6. Confirm that the HealthCheckPolicy status conditions include `Attached=True`:
+
+```bash
+kubectl get healthcheckpolicy health-check-policy -o yaml
+```

 === "Istio"
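Reading the full YAML works for step 6, but the `Attached` condition can also be filtered out directly; a small convenience sketch, not part of the commit (the exact nesting of the status conditions depends on the GKE controller, so a plain text filter is used):

```bash
# Show only the Attached condition from the policy status
# (policy name from the steps above; status layout depends on the GKE controller)
kubectl get healthcheckpolicy health-check-policy -o yaml | grep -A 4 'type: Attached'
```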

@@ -267,22 +294,6 @@ A cluster with:
 kubectl get httproute llm-route -o yaml
 ```

-
-### Deploy the InferencePool and Endpoint Picker Extension
-
-Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label app: vllm-llama3-8b-instruct and listening on port 8000, you can run the following command:
-
-```bash
-export GATEWAY_PROVIDER=none # See [README](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/charts/inferencepool/README.md#configuration) for valid configurations
-helm install vllm-llama3-8b-instruct \
---set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
---set provider.name=$GATEWAY_PROVIDER \
---version v0.3.0 \
-oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
-```
-
-The Helm install automatically installs the endpoint-picker, inferencepool along with provider specific resources.
-
 ### Deploy InferenceObjective (Optional)

 Deploy the sample InferenceObjective which allows you to specify priority of requests.
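For the InferenceObjective step referenced in the closing context lines, a minimal verification sketch, not part of this diff; the sample's resource names are not shown here, so the listing is kept generic:

```bash
# List any InferenceObjectives created from the sample manifest and inspect their spec
kubectl get inferenceobjectives -o yaml
```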
