diff --git a/site-src/guides/serve-multiple-genai-models.md b/site-src/guides/serve-multiple-genai-models.md
index a2e4e51d5..1d90767d0 100644
--- a/site-src/guides/serve-multiple-genai-models.md
+++ b/site-src/guides/serve-multiple-genai-models.md
@@ -12,7 +12,6 @@ The following diagram illustrates how an Inference Gateway routes requests to di
 
 The model name is extracted by [Body-Based routing](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/bbr/README.md) (BBR) from the request body to the header. The header is then matched to dispatch requests to different `InferencePool` (and their EPPs) instances.
 
-![Serving multiple generative AI models](../images/serve-mul-gen-AI-models.png)
 
 ### Deploy Body-Based Routing
 
diff --git a/site-src/images/inference-overview.svg b/site-src/images/inference-overview.svg
index a82c09e26..8524ebbea 100644
--- a/site-src/images/inference-overview.svg
+++ b/site-src/images/inference-overview.svg
@@ -1 +1 @@
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/site-src/images/serve-LoRA-adapters.png b/site-src/images/serve-LoRA-adapters.png
deleted file mode 100644
index e33dc708a..000000000
Binary files a/site-src/images/serve-LoRA-adapters.png and /dev/null differ
diff --git a/site-src/images/serve-mul-gen-AI-models.png b/site-src/images/serve-mul-gen-AI-models.png
deleted file mode 100644
index 957a054f1..000000000
Binary files a/site-src/images/serve-mul-gen-AI-models.png and /dev/null differ
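
For context on the routing pattern the edited guide describes: once BBR copies the model name from the request body into a request header, a standard Gateway API `HTTPRoute` can match on that header and forward each model to its own `InferencePool` (and therefore its own EPP). The sketch below is illustrative only: the Gateway name, pool names, model values, and the `InferencePool` API group are assumptions that depend on your deployment and installed CRD version, and the header name `X-Gateway-Model-Name` is taken from the BBR README rather than from this change.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: routes-to-llms            # hypothetical route name
spec:
  parentRefs:
  - name: inference-gateway       # assumed Gateway name
  rules:
  # Requests whose body named the Llama model: BBR has copied that name into the
  # X-Gateway-Model-Name header, which is matched here.
  - matches:
    - path:
        type: PathPrefix
        value: /
      headers:
      - type: Exact
        name: X-Gateway-Model-Name          # header written by BBR (per its README)
        value: meta-llama/Llama-3.1-8B-Instruct
    backendRefs:
    - group: inference.networking.k8s.io    # older releases use inference.networking.x-k8s.io
      kind: InferencePool
      name: llama3-pool                     # assumed pool name
  # Requests for a second model are dispatched to a different InferencePool and EPP.
  - matches:
    - path:
        type: PathPrefix
        value: /
      headers:
      - type: Exact
        name: X-Gateway-Model-Name
        value: google/gemma-3-27b-it
    backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: gemma3-pool                     # assumed pool name
```

The key design point this illustrates is that per-model dispatch stays entirely in ordinary Gateway API header matching; BBR's only job is to surface the model name from the request body as a header so the route rules can see it.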