Merged
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -13,6 +13,7 @@ theme:
favicon: images/favicon-64.png
features:
- content.code.annotate
- content.code.copy
- search.highlight
- navigation.tabs
- navigation.top
11 changes: 5 additions & 6 deletions site-src/guides/index.md
@@ -319,19 +319,14 @@ Tooling:
kubectl get httproute llm-route -o yaml
```

### Deploy the Body Based Router Extension (Optional)

This guide shows how to get started with serving only one base model type per L7 URL path. If you also wish to exercise model-aware routing, where more than one base model is served at the same L7 URL path, you will need the optional Body Based Routing (BBR) extension, described in a following section of the guide, namely [`Serving Multiple GenAI Models`](serve-multiple-genai-models.md).

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml
```
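For reference, the manifest applied above defines an `InferenceObjective` of roughly the following shape. The field names and values shown here are illustrative assumptions, not a copy of the manifest; consult the manifest itself for the authoritative definition:

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha2   # illustrative API version
kind: InferenceObjective
metadata:
  name: example-objective            # illustrative name
spec:
  priority: 10                       # higher values indicate higher request priority
  poolRef:
    name: vllm-llama3-8b-instruct    # illustrative InferencePool reference
```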


### Try it out

Wait until the gateway is ready.
@@ -348,6 +343,10 @@ Tooling:
}'
```

### Deploy the Body Based Router Extension (Optional)

This guide has shown how to get started with serving a single base model type per L7 URL path. If you wish to continue on to model-aware routing, where more than one base model is served at the same L7 URL path, you will need the optional Body Based Routing (BBR) extension, described in the [`Serving Multiple GenAI Models`](serve-multiple-genai-models.md) section of the documentation. To try it, retain the setup you have deployed so far and continue with the additional steps in that guide; otherwise, move on to the following section to clean up your setup.
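In essence, BBR inspects the JSON request body and copies the model name into a header that routes can match on. The following Python sketch is illustrative only, not the extension's actual code, and the function name is hypothetical:

```python
import json

def add_model_header(body: bytes, headers: dict) -> dict:
    """Copy the "model" field from a JSON request body into the
    X-Gateway-Model-Name header, mimicking what BBR does for real
    traffic. Illustrative sketch only."""
    payload = json.loads(body)
    model = payload.get("model")
    if model is not None:
        # Return a new dict rather than mutating the caller's headers.
        headers = {**headers, "X-Gateway-Model-Name": model}
    return headers

# An OpenAI-style completion request body:
body = b'{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "hi"}'
print(add_model_header(body, {}))
# → {'X-Gateway-Model-Name': 'meta-llama/Llama-3.1-8B-Instruct'}
```

The resulting header is what the `HTTPRoute` resources in the multi-model guide match on to pick an `InferencePool`.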

### Cleanup

The following instructions assume you would like to cleanup ALL resources that were created in this quickstart guide.
16 changes: 8 additions & 8 deletions site-src/guides/serve-multiple-genai-models.md
@@ -83,7 +83,7 @@ We also want to use an InferencePool and EndPoint Picker for this second model i
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```

After executing this, verify that you see two InferencePools and two EPP pods, one per base model type, running without errors, using `kubectl get inferencepools` and `kubectl get pods`.

### Configure HTTPRoute

Expand All @@ -100,7 +100,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
```

```yaml
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
@@ -121,11 +121,12 @@ spec:
value: /
headers:
- type: Exact
# Body-Based routing (https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) is used to copy the model name from the request body to the header.
name: X-Gateway-Model-Name # (1)!
value: 'meta-llama/Llama-3.1-8B-Instruct'
timeouts:
request: 300s
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
@@ -146,14 +147,15 @@ spec:
value: /
headers:
- type: Exact
# Body-Based routing (https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) is used to copy the model name from the request body to the header.
name: X-Gateway-Model-Name
value: 'microsoft/Phi-4-mini-instruct'
timeouts:
request: 300s
---
```

Before testing the setup, confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True` for both routes:

```bash
kubectl get httproute llm-llama-route -o yaml
@@ -163,8 +165,6 @@
kubectl get httproute llm-phi4-route -o yaml
```
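In that output, look under `status.parents[].conditions`: each parent should report both condition types with `status: "True"`. A small Python sketch of that check, run against a hypothetical status fragment rather than a live cluster (the function name is an assumption):

```python
def route_is_ready(status: dict) -> bool:
    """Return True if every parent in an HTTPRoute status reports
    Accepted=True and ResolvedRefs=True. The dict mirrors the shape of
    `kubectl get httproute <name> -o json` output (illustrative only)."""
    parents = status.get("parents", [])
    for parent in parents:
        conds = {c["type"]: c["status"] for c in parent.get("conditions", [])}
        if conds.get("Accepted") != "True" or conds.get("ResolvedRefs") != "True":
            return False
    # A route with no parents has not been attached to any Gateway yet.
    return bool(parents)

# Hypothetical fragment of a healthy HTTPRoute status:
status = {"parents": [{"conditions": [
    {"type": "Accepted", "status": "True"},
    {"type": "ResolvedRefs", "status": "True"},
]}]}
print(route_is_ready(status))
# → True
```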

[BBR](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) is being used to copy the model name from the request body to the header with key `X-Gateway-Model-Name`. The header can then be used in the `HTTPRoute` to route requests to different `InferencePool` instances.

## Try it out

1. Get the gateway IP: