This quickstart guide is intended for engineers familiar with Kubernetes and model serving.
1. Enable the Gateway API and configure proxy-only subnets when necessary. See [Deploy Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploying-gateways) for detailed instructions.
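If your Gateway uses a regional load balancer, the proxy-only subnet from step 1 can be created with a `gcloud` command along these lines. This is a sketch: the subnet name, `REGION`, `NETWORK`, and IP range are placeholders you must adjust for your environment; the linked guide is authoritative.

```bash
# Create a proxy-only subnet for regional managed load balancers
# (subnet name, region, network, and range are illustrative placeholders)
gcloud compute networks subnets create proxy-only-subnet \
    --purpose=REGIONAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region=REGION \
    --network=NETWORK \
    --range=10.129.0.0/23
```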
2. Deploy Gateway and HealthCheckPolicy resources:
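As a sketch of what the step 2 resources might look like — the gateway name, gateway class, and health-check path below are illustrative assumptions, not values taken from this guide:

```yaml
# Hypothetical Gateway; the name and gatewayClassName are assumptions
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
spec:
  gatewayClassName: gke-l7-regional-external-managed
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
# Hypothetical HealthCheckPolicy targeting the InferencePool;
# the requestPath is an assumption about the model server's health endpoint
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: health-check-policy
spec:
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health
  targetRef:
    group: inference.networking.x-k8s.io
    kind: InferencePool
    name: vllm-llama3-8b-instruct
```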
3. To install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with the label `app: vllm-llama3-8b-instruct` and listening on port 8000, run the following command:

4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:

   ```bash
   kubectl get httproute llm-route -o yaml
   ```

5. Because the default connection timeout may be insufficient for most inference workloads, configure a timeout appropriate for your intended use case.
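The install command referenced in step 3 is not shown here; as an illustrative sketch only, based on the upstream Gateway API Inference Extension Helm chart (the chart location and values keys are assumptions — consult the project's documentation for the exact command):

```bash
# Hypothetical Helm install of an InferencePool; chart URL and
# values keys are assumptions, not confirmed by this guide
helm install vllm-llama3-8b-instruct \
  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
  --set provider.name=gke \
  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```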
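One way to raise the connection timeout from step 5 is the standard Gateway API `timeouts` field on an HTTPRoute rule. The route and backend names below mirror the `llm-route` and `vllm-llama3-8b-instruct` names used earlier; the parent gateway name and the 300-second value are illustrative assumptions:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
    - name: inference-gateway  # assumption: your Gateway's actual name
  rules:
    - backendRefs:
        - group: inference.networking.x-k8s.io
          kind: InferencePool
          name: vllm-llama3-8b-instruct
      timeouts:
        request: "300s"  # illustrative value; size to your workload
```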