From dcc7972eb1837e5a0f9ee604bb53b2dd5355a2f6 Mon Sep 17 00:00:00 2001
From: Mike Birnstiehl
Date: Mon, 20 Oct 2025 09:05:30 -0500
Subject: [PATCH] Add support note to local LLM docs

---
 solutions/observability/connect-to-own-local-llm.md   | 9 +++++++--
 solutions/observability/observability-ai-assistant.md | 4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/solutions/observability/connect-to-own-local-llm.md b/solutions/observability/connect-to-own-local-llm.md
index 9c8a5feb82..4c591339c8 100644
--- a/solutions/observability/connect-to-own-local-llm.md
+++ b/solutions/observability/connect-to-own-local-llm.md
@@ -11,12 +11,17 @@ products:
 
 # Connect to your own local LLM
 
+:::{important}
+Elastic doesn’t support the setup and configuration of local LLMs. The example provided is for reference only.
+Before using a local LLM, evaluate its performance according to the [LLM performance matrix](./llm-performance-matrix.md#evaluate-your-own-model).
+:::
+
 This page provides instructions for setting up a connector to a large language model (LLM) of your choice using LM Studio. This allows you to use your chosen model within the {{obs-ai-assistant}}.
 You’ll first need to set up LM Studio, then download and deploy a model via LM studio and finally configure the connector in your Elastic deployment.
 
 ::::{note}
 If your Elastic deployment is not on the same network, you must configure an Nginx reverse proxy to authenticate with Elastic. Refer to [Configure your reverse proxy](https://www.elastic.co/docs/solutions/security/ai/connect-to-own-local-llm#_configure_your_reverse_proxy) for more detailed instructions.
-You do not have to set up a proxy if LM Studio is running locally, or on the same network as your Elastic deployment. 
+You do not have to set up a proxy if LM Studio is running locally, or on the same network as your Elastic deployment.
 ::::
 
 ::::{note}
@@ -85,7 +90,7 @@ Once you’ve downloaded a model, use the following commands in your CLI:
 4. Load a model: `lms load llama-3.3-70b-instruct --context-length 64000 --gpu max`.
 
 ::::{important}
-When loading a model, use the `--context-length` flag with a context window of 64,000 or higher. 
+When loading a model, use the `--context-length` flag with a context window of 64,000 or higher.
 Optionally, you can set how much to offload to the GPU by using the `--gpu` flag. `--gpu max` will offload all layers to GPU.
 ::::
 
diff --git a/solutions/observability/observability-ai-assistant.md b/solutions/observability/observability-ai-assistant.md
index a4affc57cd..b7bc731f35 100644
--- a/solutions/observability/observability-ai-assistant.md
+++ b/solutions/observability/observability-ai-assistant.md
@@ -102,6 +102,10 @@ While the {{obs-ai-assistant}} is compatible with many different models, refer t
 :::
 
 ### Connect to a custom local LLM
+```{applies_to}
+serverless: ga
+stack: ga 9.2
+```
 
 [Connect to LM Studio](/solutions/observability/connect-to-own-local-llm.md) to use a custom LLM deployed and managed by you.
 
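Outside the patch itself, an optional sanity check before configuring the connector is to confirm that the LM Studio server is reachable and the model is loaded. The sketch below is illustrative only: it assumes LM Studio's OpenAI-compatible local server at its default address (http://localhost:1234) and the model name used in the docs; adjust both to match your environment.

```bash
# Optional check: list the models LM Studio currently has loaded.
# Assumes the LM Studio server is running on its default port (1234).
curl http://localhost:1234/v1/models

# Optional check: send a minimal chat completion to confirm the model responds
# before creating the connector.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Reply with OK."}],
        "max_tokens": 10
      }'
```

If both calls succeed, the connector can point at the same /v1/chat/completions endpoint (through the reverse proxy if your Elastic deployment is not on the same network).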