Commit fe5883f

Add Install Gateway section in Getting Started guide
- Move instructions from the Deploy an Inference Gateway section describing installation of Gateway API CRDs and provider specific GWs

Signed-off-by: Dharaneeshwaran Ravichandran <[email protected]>
1 parent 123ad68 commit fe5883f

1 file changed: +91 -82 lines changed


site-src/guides/index.md

Lines changed: 91 additions & 82 deletions
@@ -83,6 +83,84 @@ Tooling:
8383
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
8484
```
8585

86+
### Install the Gateway
87+
88+
1. Requirements
89+
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
90+
91+
Choose one of the following options to install Gateway.
92+
93+
=== "GKE"
94+
95+
1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
96+
97+
See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway) for detailed instructions.
98+
99+
=== "Istio"
100+
101+
Please note that this feature is currently in an experimental phase and is not intended for production use.
102+
The implementation and user experience are subject to changes as we continue to iterate on this project.
103+
104+
1. Install Istio
105+
106+
```
107+
TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
108+
# on Linux
109+
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
110+
tar -xvf istioctl-$TAG-linux-amd64.tar.gz
111+
# on macOS
112+
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
113+
tar -xvf istioctl-$TAG-osx.tar.gz
114+
# on Windows
115+
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
116+
unzip istioctl-$TAG-win.zip
117+
118+
./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
119+
```
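
The per-OS download commands above differ only in the archive name, so they could be folded into one helper. The following is a minimal sketch and not part of the guide: the function name `istioctl_archive` is hypothetical, and it assumes `uname -s` reports `Linux`, `Darwin`, or a `MINGW`/`MSYS`/`CYGWIN` string on Windows shells. The archive names are the ones used in the commands above.

```bash
# Hypothetical helper: map an OS name (as printed by `uname -s`) to the
# istioctl archive name used in the download commands above.
istioctl_archive() {
  tag="$1"
  os="${2:-$(uname -s)}"   # second argument is optional; defaults to this host
  case "$os" in
    Linux)                echo "istioctl-${tag}-linux-amd64.tar.gz" ;;
    Darwin)               echo "istioctl-${tag}-osx.tar.gz" ;;
    MINGW*|MSYS*|CYGWIN*) echo "istioctl-${tag}-win.zip" ;;
    *) echo "unsupported OS: ${os}" >&2; return 1 ;;
  esac
}
```

With `TAG` set as above, the download might then look like `wget "https://storage.googleapis.com/istio-build/dev/$TAG/$(istioctl_archive "$TAG")"`.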
120+
121+
=== "Kgateway"
122+
123+
[Kgateway](https://kgateway.dev/) added Inference Gateway support as a **technical preview** in the
124+
[v2.0.0 release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.0.0). InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
125+
126+
1. Requirements
127+
128+
- [Helm](https://helm.sh/docs/intro/install/) installed.
129+
130+
2. Set the Kgateway version and install the Kgateway CRDs.
131+
132+
```bash
133+
KGTW_VERSION=v2.1.0-main
134+
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
135+
```
136+
137+
3. Install Kgateway
138+
139+
```bash
140+
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
141+
```
142+
143+
=== "Agentgateway"
144+
145+
[Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for Inference Gateway. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane. InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
146+
147+
1. Requirements
148+
149+
- [Helm](https://helm.sh/docs/intro/install/) installed.
150+
151+
2. Set the Kgateway version and install the Kgateway CRDs.
152+
153+
```bash
154+
KGTW_VERSION=v2.1.0-main
155+
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
156+
```
157+
158+
3. Install Kgateway
159+
160+
```bash
161+
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentGateway.enabled=true
162+
```
163+
86164
### Deploy the InferencePool and Endpoint Picker Extension
87165

88166
Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, inferencepool along with provider specific resources.
@@ -135,17 +213,13 @@ Tooling:
135213
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
136214
```
137215

138-
### Deploy an Inference Gateway
216+
### Deploy the Inference Gateway
139217

140218
Choose one of the following options to deploy an Inference Gateway.
141219

142220
=== "GKE"
143221

144-
1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
145-
See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
146-
for detailed instructions.
147-
148-
2. Deploy Inference Gateway:
222+
1. Deploy the Inference Gateway:
149223

150224
```bash
151225
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
@@ -158,45 +232,21 @@ Tooling:
158232
NAME CLASS ADDRESS PROGRAMMED AGE
159233
inference-gateway inference-gateway <MY_ADDRESS> True 22s
160234
```
161-
3. Deploy the HTTPRoute
235+
2. Deploy the HTTPRoute
162236

163237
```bash
164238
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
165239
```
166240

167-
4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
241+
3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
168242

169243
```bash
170244
kubectl get httproute llm-route -o yaml
171245
```
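
Reading the full YAML works, but the two conditions can also be checked directly. The following is a minimal sketch, assuming the route is named `llm-route` as above and that its conditions are reported under `.status.parents[*].conditions`. The helper name `route_ready` is hypothetical.

```bash
# Hypothetical helper: succeeds only if stdin contains both
# "Accepted=True" and "ResolvedRefs=True" (one "type=status" pair per line).
route_ready() {
  conditions="$(cat)"
  printf '%s\n' "$conditions" | grep -qx 'Accepted=True' &&
    printf '%s\n' "$conditions" | grep -qx 'ResolvedRefs=True'
}

# Against a live cluster (not run here), print each condition as type=status
# and pipe the result into the helper:
# kubectl get httproute llm-route \
#   -o jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status}{"\n"}{end}' \
#   | route_ready && echo "HTTPRoute is ready"
```

The same check should apply to the other providers in this guide, since each tab inspects the same `llm-route` resource.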
172246

173247
=== "Istio"
174248

175-
Please note that this feature is currently in an experimental phase and is not intended for production use.
176-
The implementation and user experience are subject to changes as we continue to iterate on this project.
177-
178-
1. Requirements
179-
180-
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
181-
182-
2. Install Istio
183-
184-
```
185-
TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
186-
# on Linux
187-
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
188-
tar -xvf istioctl-$TAG-linux-amd64.tar.gz
189-
# on macOS
190-
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
191-
tar -xvf istioctl-$TAG-osx.tar.gz
192-
# on Windows
193-
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
194-
unzip istioctl-$TAG-win.zip
195-
196-
./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
197-
```
198-
199-
3. Deploy Gateway
249+
1. Deploy the Inference Gateway
200250

201251
```bash
202252
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
@@ -209,42 +259,21 @@ Tooling:
209259
inference-gateway inference-gateway <MY_ADDRESS> True 22s
210260
```
211261

212-
4. Deploy the HTTPRoute
262+
2. Deploy the HTTPRoute
213263

214264
```bash
215265
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
216266
```
217267

218-
5. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
268+
3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
219269

220270
```bash
221271
kubectl get httproute llm-route -o yaml
222272
```
223273

224274
=== "Kgateway"
225275

226-
[Kgateway](https://kgateway.dev/) added Inference Gateway support as a **technical preview** in the
227-
[v2.0.0 release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.0.0). InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
228-
229-
1. Requirements
230-
231-
- [Helm](https://helm.sh/docs/intro/install/) installed.
232-
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
233-
234-
2. Set the Kgateway version and install the Kgateway CRDs.
235-
236-
```bash
237-
KGTW_VERSION=v2.1.0-main
238-
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
239-
```
240-
241-
3. Install Kgateway
242-
243-
```bash
244-
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
245-
```
246-
247-
4. Deploy the Gateway
276+
1. Deploy the Inference Gateway
248277

249278
```bash
250279
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
@@ -257,41 +286,21 @@ Tooling:
257286
inference-gateway kgateway <MY_ADDRESS> True 22s
258287
```
259288

260-
5. Deploy the HTTPRoute
289+
2. Deploy the HTTPRoute
261290

262291
```bash
263292
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
264293
```
265294

266-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
295+
3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
267296

268297
```bash
269298
kubectl get httproute llm-route -o yaml
270299
```
271300

272301
=== "Agentgateway"
273302

274-
[Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for Inference Gateway. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as it's control plane. InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
275-
276-
1. Requirements
277-
278-
- [Helm](https://helm.sh/docs/intro/install/) installed.
279-
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
280-
281-
2. Set the Kgateway version and install the Kgateway CRDs.
282-
283-
```bash
284-
KGTW_VERSION=v2.1.0-main
285-
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
286-
```
287-
288-
3. Install Kgateway
289-
290-
```bash
291-
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentGateway.enabled=true
292-
```
293-
294-
4. Deploy the Gateway
303+
1. Deploy the Inference Gateway
295304

296305
```bash
297306
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml
@@ -304,13 +313,13 @@ Tooling:
304313
inference-gateway agentgateway <MY_ADDRESS> True 22s
305314
```
306315

307-
5. Deploy the HTTPRoute
316+
2. Deploy the HTTPRoute
308317

309318
```bash
310319
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml
311320
```
312321

313-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
322+
3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
314323

315324
```bash
316325
kubectl get httproute llm-route -o yaml
