Commit ade46bd: Address review comments and restructure docs
1 parent 84516bd

File tree: 1 file changed (+33, -1 lines)

site-src/guides/index.md

Lines changed: 33 additions & 1 deletion
@@ -75,6 +75,7 @@ A cluster with:
    kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
    ```
 
+<<<<<<< HEAD
 ### Deploy InferenceModel
 
 Deploy the sample InferenceModel which is configured to forward traffic to the `food-review-1` [LoRA adapter](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server.
@@ -93,6 +94,8 @@ A cluster with:
    oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
    ```
 
+=======
+>>>>>>> 8e5ae69 (Address review comments and restructure docs)
 ### Deploy an Inference Gateway
 
 Choose one of the following options to deploy an Inference Gateway.
@@ -115,8 +118,19 @@ A cluster with:
    NAME                CLASS               ADDRESS        PROGRAMMED   AGE
    inference-gateway   inference-gateway   <MY_ADDRESS>   True         22s
    ```
+3. Deploy the HTTPRoute
 
-3. To install an InferencePool named vllm-llama3-8b-instruct that selects from endpoints with label app: vllm-llama3-8b-instruct and listening on port 8000, you can run the following command:
+   ```bash
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
+   ```
+
+4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+
+   ```bash
+   kubectl get httproute llm-route -o yaml
+   ```
+
+5. To install an InferencePool named vllm-llama3-8b-instruct that selects from endpoints with label app: vllm-llama3-8b-instruct and listening on port 8000, you can run the following command:
 
 ```bash
 helm install vllm-llama3-8b-instruct \
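The confirmation step above (`kubectl get httproute llm-route -o yaml`) can also be scripted. A minimal sketch, run here against a canned status snippet rather than a live cluster; the grep-based parsing and the `route ready` message are illustrative, not part of the guide:

```shell
# Sample of the status.conditions block a healthy HTTPRoute reports
# (in practice: kubectl get httproute llm-route -o yaml).
status_yaml='conditions:
- type: Accepted
  status: "True"
- type: ResolvedRefs
  status: "True"'

# Count a condition as satisfied when the line after its "type:" says status: "True".
accepted=$(printf '%s\n' "$status_yaml" | grep -A1 'type: Accepted' | grep -c 'status: "True"')
resolved=$(printf '%s\n' "$status_yaml" | grep -A1 'type: ResolvedRefs' | grep -c 'status: "True"')

if [ "$accepted" -eq 1 ] && [ "$resolved" -eq 1 ]; then
  echo "route ready"
fi
```

A real check would pipe live `kubectl` output through the same filters instead of the canned snippet.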
@@ -128,6 +142,12 @@ A cluster with:
 
 The Helm install automatically installs the endpoint-picker and inferencepool, along with the health check policy.
 
+6. Given that the default connection timeout may be insufficient for most inference workloads, it is recommended to configure a timeout appropriate for your intended use case.
+
+   ```bash
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gcp-backend-policy.yaml
+   ```
+
 === "Istio"
 
 Please note that this feature is currently in an experimental phase and is not intended for production use.
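The `gcp-backend-policy.yaml` manifest behind the timeout step is not shown in this diff. As a rough sketch of what such a policy plausibly contains, the following applies a GCPBackendPolicy inline; the field names follow GKE's `GCPBackendPolicy` CRD and the `targetRef` group, policy name, and timeout value are assumptions to verify against the referenced manifest before use:

```shell
# Sketch only: apiVersion/kind follow GKE's GCPBackendPolicy CRD, but the
# targetRef group, names, and timeout value are assumptions, not copied from
# the referenced gcp-backend-policy.yaml. Check the real manifest first.
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: inferencepool-backend-policy
spec:
  targetRef:
    group: inference.networking.x-k8s.io
    kind: InferencePool
    name: vllm-llama3-8b-instruct
  default:
    timeoutSec: 300   # raised well above the default to cover long generations
EOF
```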
@@ -234,6 +254,7 @@ A cluster with:
    kubectl get httproute llm-route -o yaml
    ```
 
+<<<<<<< HEAD
 === "Agentgateway"
 
 [Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for inference routing. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane.
@@ -280,6 +301,17 @@ A cluster with:
    ```bash
    kubectl get httproute llm-route -o yaml
    ```
+=======
+
+### Deploy InferenceObjective (Optional)
+
+Deploy the sample InferenceObjective which is configured to forward traffic to the `food-review-1` [LoRA adapter](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server.
+
+```bash
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml
+```
+
+>>>>>>> 8e5ae69 (Address review comments and restructure docs)
 
 ### Try it out
 
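The diff ends at the "Try it out" heading, whose body is not shown. A hedged sketch of what that step typically looks like for this guide; the gateway name, port, prompt, and endpoint are assumptions to check against the guide itself, and only the request body is exercised locally here:

```shell
# Hypothetical request body targeting the food-review-1 LoRA adapter deployed above.
# Model name comes from the sample manifests; the prompt is illustrative.
body='{"model": "food-review-1", "prompt": "Write as if you were a critic: San Francisco", "max_tokens": 100}'

# Against a live cluster you would resolve the gateway address and POST the body:
#   IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')
#   curl -i "http://${IP}:80/v1/completions" -H 'Content-Type: application/json' -d "$body"

# Locally, confirm the body routes to the adapter the HTTPRoute exposes.
echo "$body" | grep -c '"model": "food-review-1"'
```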