
Commit 144fead

mergify[bot], benironside, and github-actions[bot] authored
[8.16] [ESS] [Serverless] Updates BYO LLM page (backport #6326) (#6364)
* [ESS] [Serverless] Updates BYO LLM page (#6326)

* updates BYO LLM page
* fix link error
* fixes broken serverless link
* troubleshoot
* fixes another broken serverless link
* updates images and video
* Update docs/AI-for-security/connect-to-byo.asciidoc

Co-authored-by: Nastasha Solomon <[email protected]>

* Update docs/serverless/AI-for-security/connect-to-byo-llm.asciidoc
* Update docs/AI-for-security/connect-to-byo.asciidoc

---------

Co-authored-by: Nastasha Solomon <[email protected]>
(cherry picked from commit 900040b)

# Conflicts:
# docs/serverless/AI-for-security/connect-to-byo-llm.asciidoc
# docs/serverless/AI-for-security/images/lms-cli-welcome.png
# docs/serverless/AI-for-security/images/lms-model-select.png
# docs/serverless/AI-for-security/images/lms-ps-command.png
# docs/serverless/AI-for-security/images/lms-studio-model-loaded-msg.png
# docs/serverless/AI-for-security/llm-connector-guides.asciidoc

* Delete docs/serverless directory and its contents

---------

Co-authored-by: Benjamin Ironside Goldstein <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 8d034bc commit 144fead

File tree

5 files changed: +20 -16 lines changed

docs/AI-for-security/connect-to-byo.asciidoc

Lines changed: 20 additions & 16 deletions
@@ -10,7 +10,7 @@ This page provides instructions for setting up a connector to a large language m
 
 This example uses a single server hosted in GCP to run the following components:
 
-* LM Studio with the https://mistral.ai/technology/#models[Mixtral-8x7b] model
+* LM Studio with the https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407[Mistral-Nemo-Instruct-2407] model
 * A reverse proxy using Nginx to authenticate to Elastic Cloud
 
 image::images/lms-studio-arch-diagram.png[Architecture diagram for this guide]
@@ -20,7 +20,7 @@ NOTE: For testing, you can use alternatives to Nginx such as https://learn.micro
 [discrete]
 == Configure your reverse proxy
 
-NOTE: If your Elastic instance is on the same host as LM Studio, you can skip this step.
+NOTE: If your Elastic instance is on the same host as LM Studio, you can skip this step. Also, check out our https://www.elastic.co/blog/herding-llama-3-1-with-elastic-and-lm-studio[blog post] that walks through the whole process of setting up a single-host implementation.
 
 You need to set up a reverse proxy to enable communication between LM Studio and Elastic. For more complete instructions, refer to a guide such as https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-as-a-reverse-proxy-on-ubuntu-22-04[this one].
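
After editing the configuration, it helps to validate it before applying it. A minimal sketch using standard Nginx tooling (not part of the changed docs), assuming a systemd-managed Nginx on Ubuntu as in the guide linked above:

[source,sh]
--------------------------------------------------
# Check the edited configuration for syntax errors before applying it
sudo nginx -t

# Reload Nginx so the reverse proxy picks up the new server block
sudo systemctl reload nginx
--------------------------------------------------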
@@ -74,7 +74,14 @@ server {
 }
 --------------------------------------------------
 
-IMPORTANT: If using the example configuration file above, you must replace several values: Replace `<secret token>` with your actual token, and keep it safe since you'll need it to set up the {elastic-sec} connector. Replace `<yourdomainname.com>` with your actual domain name. Update the `proxy_pass` value at the bottom of the configuration if you decide to change the port number in LM Studio to something other than 1234.
+[IMPORTANT]
+====
+If using the example configuration file above, you must replace several values:
+
+* Replace `<secret token>` with your actual token, and keep it safe since you'll need it to set up the {elastic-sec} connector.
+* Replace `<yourdomainname.com>` with your actual domain name.
+* Update the `proxy_pass` value at the bottom of the configuration if you decide to change the port number in LM Studio to something other than 1234.
+====
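
With the placeholder values filled in, one way to confirm the proxy and token work end to end is to call the OpenAI-compatible API that LM Studio exposes behind it. A minimal sketch, assuming the server block checks an `Authorization` bearer header against `<secret token>` and forwards requests to LM Studio's `/v1` endpoints:

[source,sh]
--------------------------------------------------
# Use the same domain and token values substituted into the Nginx config
curl -s https://<yourdomainname.com>/v1/models \
  -H "Authorization: Bearer <secret token>"

# A JSON model list means the proxy, token check, and LM Studio server all work;
# a 401/403 response suggests the token check in the server block is failing
--------------------------------------------------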
 
 [discrete]
 === (Optional) Set up performance monitoring for your reverse proxy
@@ -85,23 +92,20 @@ You can use Elastic's {integrations-docs}/nginx[Nginx integration] to monitor pe
 
 First, install https://lmstudio.ai/[LM Studio]. LM Studio supports the OpenAI SDK, which makes it compatible with Elastic's OpenAI connector, allowing you to connect to any model available in the LM Studio marketplace.
 
-One current limitation of LM Studio is that when it is installed on a server, you must launch the application using its GUI before doing so using the CLI. For example, by using Chrome RDP with an https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine[X Window System]. After you've opened the application the first time using the GUI, you can start it by using `sudo lms server start` in the CLI.
+You must launch the application using its GUI before doing so using the CLI. For example, use Chrome RDP with an https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine[X Window System]. After you've opened the application the first time using the GUI, you can start it by using `sudo lms server start` in the CLI.
 
 Once you've launched LM Studio:
 
 1. Go to LM Studio's Search window.
-2. Search for an LLM (for example, `Mixtral-8x7B-instruct`). Your chosen model must include `instruct` in its name in order to work with Elastic.
-3. Filter your search for "Compatibility Guess" to optimize results for your hardware. Results will be color coded:
-* Green means "Full GPU offload possible", which yields the best results.
-* Blue means "Partial GPU offload possible", which may work.
-* Red for "Likely too large for this machine", which typically will not work.
+2. Search for an LLM (for example, `Mistral-Nemo-Instruct-2407`). Your chosen model must include `instruct` in its name in order to work with Elastic.
+3. After you find a model, view download options and select a recommended version (green). For best performance, select one with the thumbs-up icon that indicates good performance on your hardware.
 4. Download one or more models.
 
 IMPORTANT: For security reasons, before downloading a model, verify that it is from a trusted source. It can be helpful to review community feedback on the model (for example using a site like Hugging Face).
 
 image::images/lms-model-select.png[The LM Studio model selection interface]
 
-In this example we used https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF[`TheBloke/Mixtral-8x7B-Instruct-v0.1.Q3_K_M.gguf`]. It has 46.7B total parameters, a 32,000 token context window, and uses GGUF https://huggingface.co/docs/transformers/main/en/quantization/overview[quantization]. For more information about model names and format information, refer to the following table.
+In this example we used https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407[`mistralai/Mistral-Nemo-Instruct-2407`]. It has 12B total parameters, a 128,000 token context window, and uses GGUF https://huggingface.co/docs/transformers/main/en/quantization/overview[quantization]. For more information about model names and format information, refer to the following table.
 
 [cols="1,1,1,1", options="header"]
 |===
@@ -124,18 +128,18 @@ After downloading a model, load it in LM Studio using the GUI or LM Studio's htt
 [discrete]
 === Option 1: load a model using the CLI (Recommended)
 
-It is a best practice to download models from the marketplace using the GUI, and then load or unload them using the CLI. The GUI allows you to search for models, whereas the CLI only allows you to import specific paths, but the CLI provides a good interface for loading and unloading.
+It is a best practice to download models from the marketplace using the GUI, and then load or unload them using the CLI. The GUI allows you to search for models, whereas the CLI allows you to use `lms get` to search for models. The CLI provides a good interface for loading and unloading.
 
-Use the following commands in your CLI:
+Once you've downloaded a model, use the following commands in your CLI:
 
 1. Verify LM Studio is installed: `lms`
 2. Check LM Studio's status: `lms status`
 3. List all downloaded models: `lms ls`
-4. Load a model: `lms load`
+4. Load a model: `lms load`.
 
 image::images/lms-cli-welcome.png[The CLI interface during execution of initial LM Studio commands]
 
-After the model loads, you should see a `Model loaded successfully` message in the CLI.
+After the model loads, you should see a `Model loaded successfully` message in the CLI.
 
 image::images/lms-studio-model-loaded-msg.png[The CLI message that appears after a model loads]
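
Putting the `sudo lms server start` step and the commands above together, a typical first session looks roughly like the following. A minimal sketch: the exact `lms load` behavior (interactive selection versus a required model path) may vary by LM Studio version, and the final check assumes LM Studio's default port of 1234:

[source,sh]
--------------------------------------------------
lms                    # verify the CLI is installed
sudo lms server start  # start the server (after the first GUI launch)
lms status             # confirm the server is running
lms ls                 # list downloaded models
lms load               # pick a downloaded model to load

# Optionally confirm the loaded model over the local OpenAI-compatible API
curl -s http://localhost:1234/v1/models
--------------------------------------------------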
@@ -156,8 +160,8 @@ Refer to the following video to see how to load a model using LM Studio's GUI. Y
 <img
 style="width: 100%; margin: auto; display: block;"
 class="vidyard-player-embed"
-src="https://play.vidyard.com/FMx2wxGQhquWPVhGQgjkyM.jpg"
-data-uuid="FMx2wxGQhquWPVhGQgjkyM"
+src="https://play.vidyard.com/c4AxH8d9tWMnwNp5J6bcfX.jpg"
+data-uuid="c4AxH8d9tWMnwNp5J6bcfX"
 data-v="4"
 data-type="inline"
 />
