From dcc7972eb1837e5a0f9ee604bb53b2dd5355a2f6 Mon Sep 17 00:00:00 2001
From: Mike Birnstiehl
Date: Mon, 20 Oct 2025 09:05:30 -0500
Subject: [PATCH] Add support note to local LLM docs

---
 solutions/observability/connect-to-own-local-llm.md   | 9 +++++++--
 solutions/observability/observability-ai-assistant.md | 4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/solutions/observability/connect-to-own-local-llm.md b/solutions/observability/connect-to-own-local-llm.md
index 9c8a5feb82..4c591339c8 100644
--- a/solutions/observability/connect-to-own-local-llm.md
+++ b/solutions/observability/connect-to-own-local-llm.md
@@ -11,12 +11,17 @@ products:
 
 # Connect to your own local LLM
 
+:::{important}
+Elastic doesn’t support the setup and configuration of local LLMs. The example provided is for reference only.
+Before using a local LLM, evaluate its performance according to the [LLM performance matrix](./llm-performance-matrix.md#evaluate-your-own-model).
+:::
+
 This page provides instructions for setting up a connector to a large language model (LLM) of your choice using LM Studio. This allows you to use your chosen model within the {{obs-ai-assistant}}.
 You’ll first need to set up LM Studio, then download and deploy a model via LM studio and finally configure the connector in your Elastic deployment.
 
 ::::{note}
 If your Elastic deployment is not on the same network, you must configure an Nginx reverse proxy to authenticate with Elastic. Refer to [Configure your reverse proxy](https://www.elastic.co/docs/solutions/security/ai/connect-to-own-local-llm#_configure_your_reverse_proxy) for more detailed instructions.
-You do not have to set up a proxy if LM Studio is running locally, or on the same network as your Elastic deployment. 
+You do not have to set up a proxy if LM Studio is running locally, or on the same network as your Elastic deployment.
 ::::
 
 ::::{note}
@@ -85,7 +90,7 @@ Once you’ve downloaded a model, use the following commands in your CLI:
 4. Load a model: `lms load llama-3.3-70b-instruct --context-length 64000 --gpu max`.
 
 ::::{important}
-When loading a model, use the `--context-length` flag with a context window of 64,000 or higher. 
+When loading a model, use the `--context-length` flag with a context window of 64,000 or higher.
 Optionally, you can set how much to offload to the GPU by using the `--gpu` flag. `--gpu max` will offload all layers to GPU.
 ::::
 
diff --git a/solutions/observability/observability-ai-assistant.md b/solutions/observability/observability-ai-assistant.md
index a4affc57cd..b7bc731f35 100644
--- a/solutions/observability/observability-ai-assistant.md
+++ b/solutions/observability/observability-ai-assistant.md
@@ -102,6 +102,10 @@ While the {{obs-ai-assistant}} is compatible with many different models, refer t
 :::
 
 ### Connect to a custom local LLM
+```{applies_to}
+serverless: ga
+stack: ga 9.2
+```
 
 [Connect to LM Studio](/solutions/observability/connect-to-own-local-llm.md) to use a custom LLM deployed and managed by you.
 
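Outside the patch itself, an optional sanity check before configuring the connector is to confirm that the LM Studio server is reachable and the model is loaded. The sketch below is illustrative only: it assumes LM Studio's OpenAI-compatible local server at its default address (http://localhost:1234) and the model name used in the docs; adjust both to match your environment.

```bash
# Optional check: list the models LM Studio currently has loaded.
# Assumes the LM Studio server is running on its default port (1234).
curl http://localhost:1234/v1/models

# Optional check: send a minimal chat completion to confirm the model responds
# before creating the connector.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Reply with OK."}],
        "max_tokens": 10
      }'
```

If both calls succeed, the connector can point at the same /v1/chat/completions endpoint (through the reverse proxy if your Elastic deployment is not on the same network).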