
Commit dcc7972

Add support note to local LLM docs
1 parent 3477459 commit dcc7972

File tree

- solutions/observability/connect-to-own-local-llm.md
- solutions/observability/observability-ai-assistant.md

2 files changed: +11 -2 lines changed

solutions/observability/connect-to-own-local-llm.md

Lines changed: 7 additions & 2 deletions
@@ -11,12 +11,17 @@ products:
 
 # Connect to your own local LLM
 
+:::{important}
+Elastic doesn’t support the setup and configuration of local LLMs. The example provided is for reference only.
+Before using a local LLM, evaluate its performance according to the [LLM performance matrix](./llm-performance-matrix.md#evaluate-your-own-model).
+:::
+
 This page provides instructions for setting up a connector to a large language model (LLM) of your choice using LM Studio. This allows you to use your chosen model within the {{obs-ai-assistant}}. You’ll first need to set up LM Studio, then download and deploy a model via LM studio and finally configure the connector in your Elastic deployment.
 
 ::::{note}
 If your Elastic deployment is not on the same network, you must configure an Nginx reverse proxy to authenticate with Elastic. Refer to [Configure your reverse proxy](https://www.elastic.co/docs/solutions/security/ai/connect-to-own-local-llm#_configure_your_reverse_proxy) for more detailed instructions.
 
-You do not have to set up a proxy if LM Studio is running locally, or on the same network as your Elastic deployment.
+You do not have to set up a proxy if LM Studio is running locally, or on the same network as your Elastic deployment.
 ::::
 
 ::::{note}
@@ -85,7 +90,7 @@ Once you’ve downloaded a model, use the following commands in your CLI:
 4. Load a model: `lms load llama-3.3-70b-instruct --context-length 64000 --gpu max`.
 
 ::::{important}
-When loading a model, use the `--context-length` flag with a context window of 64,000 or higher.
+When loading a model, use the `--context-length` flag with a context window of 64,000 or higher.
 Optionally, you can set how much to offload to the GPU by using the `--gpu` flag. `--gpu max` will offload all layers to GPU.
 ::::
 
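For context on the commands referenced in the hunk above, here is a minimal shell sketch of loading the model and checking that LM Studio's local server is reachable. The `lms load` command and its flags come from the diff; `lms server start`, the default port 1234, and the `curl` check are assumptions about a typical LM Studio setup, not part of this commit.

```bash
# Load the model with the flags discussed above: a context window of 64,000
# or higher, with all layers offloaded to the GPU.
lms load llama-3.3-70b-instruct --context-length 64000 --gpu max

# Assumption: start LM Studio's OpenAI-compatible local server
# (it typically listens on port 1234; adjust if yours differs).
lms server start

# Assumption: quick reachability check against the endpoint the connector will call.
curl http://localhost:1234/v1/models
```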
solutions/observability/observability-ai-assistant.md

Lines changed: 4 additions & 0 deletions
@@ -102,6 +102,10 @@ While the {{obs-ai-assistant}} is compatible with many different models, refer t
 :::
 
 ### Connect to a custom local LLM
+```{applies_to}
+serverless: ga
+stack: ga 9.2
+```
 
 [Connect to LM Studio](/solutions/observability/connect-to-own-local-llm.md) to use a custom LLM deployed and managed by you.
 
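The linked page covers the connector setup itself; as a rough illustration, the request below shows the kind of OpenAI-compatible chat completion call such a connector ends up making against the local server. The host, port, and model name are assumptions carried over from the LM Studio example above, not values taken from this commit.

```bash
# Hypothetical smoke test: send an OpenAI-compatible chat completion request
# to the local LM Studio server. Adjust host, port, and model name to your setup.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Reply with OK if you can read this."}]
      }'
```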
0 commit comments
