Commit 526aa57

srampal and ahg-g authored
Fix some markdown formatting errors (#1609)
* Fix some markdown formatting errors

* Update site-src/guides/index.md

Co-authored-by: Abdullah Gharaibeh <[email protected]>

---------

Co-authored-by: Abdullah Gharaibeh <[email protected]>
1 parent deccda0 commit 526aa57

3 files changed: 14 additions & 14 deletions


mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ theme:
   favicon: images/favicon-64.png
   features:
     - content.code.annotate
+    - content.code.copy
     - search.highlight
     - navigation.tabs
     - navigation.top
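The `content.code.copy` flag added here is a stock Material for MkDocs feature that renders a copy-to-clipboard button on code blocks. As a rough sketch (not part of this commit, and assuming `mkdocs-material` is the only docs dependency needed; the repo may pin its docs tooling differently), the change could be previewed locally like this:

```bash
# Assumption: mkdocs-material alone is enough to build this site locally;
# the project may instead use a pinned requirements file or helper script.
pip install mkdocs-material

# Serve the docs from the repo root (where mkdocs.yml lives) and open http://127.0.0.1:8000.
# With content.code.copy enabled, each code block should now show a copy button.
mkdocs serve
```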

site-src/guides/index.md

Lines changed: 5 additions & 6 deletions
@@ -318,19 +318,14 @@ Tooling:
 kubectl get httproute llm-route -o yaml
 ```

-### Deploy the Body Based Router Extension (Optional)
-
-This guide shows how to get started with serving only 1 base model type per L7 URL path. If in addition, you wish to exercise model-aware routing such that more than 1 base model is served at the same L7 url path, that requires use of the (optional) Body Based Routing (BBR) extension which is described in a following section of the guide, namely the [`Serving Multiple GenAI Models`](serve-multiple-genai-models.md) section.
-
 ### Deploy InferenceObjective (Optional)

-Deploy the sample InferenceObjective which allows you to specify priority of requests.
+Deploy the sample InferenceObjective which allows you to specify priority of requests.

 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml
 ```

-
 ### Try it out

 Wait until the gateway is ready.
@@ -347,6 +342,10 @@ Tooling:
 }'
 ```

+### Deploy the Body Based Router Extension (Optional)
+
+This guide has shown how to get started with serving a single base model type per L7 URL path. If after this exercise, you wish to continue on to exercise model-aware routing such that more than 1 base model is served at the same L7 url path, that requires use of the (optional) Body Based Routing (BBR) extension which is described in a separate section of the documentation, namely the [`Serving Multiple GenAI Models`](serve-multiple-genai-models.md) section. If you wish to exercise that function, then retain the setup you have deployed so far from this guide and move on to the additional steps described in [that guide](serve-multiple-genai-models.md) or else move on to the following section to cleanup your setup.
+
 ### Cleanup

 The following instructions assume you would like to cleanup ALL resources that were created in this quickstart guide.
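The second hunk above only shows the tail of the guide's "Try it out" request (`}'` and the closing fence). For orientation, here is a hedged sketch of what such an OpenAI-style completions call through the gateway typically looks like; the Gateway resource name, port, model, and prompt below are placeholders for illustration, not values taken from this commit:

```bash
# Placeholders: the Gateway name, port, model, and prompt are assumptions for this sketch.
IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')
PORT=80

# Send a completions request through the gateway to the single base model served in the quickstart.
curl -i ${IP}:${PORT}/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "prompt": "Write as if you were a critic: San Francisco",
    "max_tokens": 100,
    "temperature": 0
  }'
```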

site-src/guides/serve-multiple-genai-models.md

Lines changed: 8 additions & 8 deletions
@@ -83,7 +83,7 @@ We also want to use an InferencePool and EndPoint Picker for this second model i
 oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
 ```

-After executing this, very that you see two InferencePools and two EPP pods, one per base model type, running without errors, using the CLIs `kubectl get inferencepools` and `kubectl get pods`.
+After executing this, verify that you see two InferencePools and two EPP pods, one per base model type, running without errors, using the CLIs `kubectl get inferencepools` and `kubectl get pods`.

 ### Configure HTTPRoute

@@ -100,7 +100,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 ```

 ```yaml
----
+---
 apiVersion: gateway.networking.k8s.io/v1
 kind: HTTPRoute
 metadata:
@@ -121,11 +121,12 @@
         value: /
       headers:
       - type: Exact
+        #Body-Based routing(https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) is being used to copy the model name from the request body to the header.
         name: X-Gateway-Model-Name # (1)!
         value: 'meta-llama/Llama-3.1-8B-Instruct'
     timeouts:
       request: 300s
----
+---
 apiVersion: gateway.networking.k8s.io/v1
 kind: HTTPRoute
 metadata:
@@ -146,14 +147,15 @@
         value: /
       headers:
       - type: Exact
-        name: X-Gateway-Model-Name # (2)!
+        #Body-Based routing(https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) is being used to copy the model name from the request body to the header.
+        name: X-Gateway-Model-Name
         value: 'microsoft/Phi-4-mini-instruct'
     timeouts:
       request: 300s
----
+---
 ```

-Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True` for both routes:
+Before testing the setup, confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True` for both routes using the following commands.

 ```bash
 kubectl get httproute llm-llama-route -o yaml
@@ -163,8 +165,6 @@ kubectl get httproute llm-llama-route -o yaml
 kubectl get httproute llm-phi4-route -o yaml
 ```

-[BBR](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) is being used to copy the model name from the request body to the header with key `X-Gateway-Model-Name`. The header can then be used in the `HTTPRoute` to route requests to different `InferencePool` instances.
-
 ## Try it out

 1. Get the gateway IP:
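The net effect of this diff is that the BBR explanation now sits as inline comments next to the `X-Gateway-Model-Name` header matches instead of as a trailing paragraph. As a hedged illustration of what those matches buy you (the `IP`/`PORT` values are placeholders and the prompts are invented for this sketch), two requests that differ only in the `model` field should land on different InferencePools, because BBR copies that field into the `X-Gateway-Model-Name` header before the two HTTPRoutes are evaluated:

```bash
# Placeholders: IP and PORT must come from your Gateway; prompts are invented for this sketch.
# BBR copies the "model" value from the JSON body into the X-Gateway-Model-Name header,
# so each request matches a different HTTPRoute and therefore a different InferencePool.
curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "prompt": "Say hello from Llama",
  "max_tokens": 20
}'

curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
  "model": "microsoft/Phi-4-mini-instruct",
  "prompt": "Say hello from Phi",
  "max_tokens": 20
}'
```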
