docs/AI-for-security/connect-to-byo.asciidoc
20 additions, 16 deletions
@@ -10,7 +10,7 @@ This page provides instructions for setting up a connector to a large language m
This example uses a single server hosted in GCP to run the following components:
-* LM Studio with the https://mistral.ai/technology/#models[Mixtral-8x7b] model
+* LM Studio with the https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407[Mistral-Nemo-Instruct-2407] model
* A reverse proxy using Nginx to authenticate to Elastic Cloud
image::images/lms-studio-arch-diagram.png[Architecture diagram for this guide]
@@ -20,7 +20,7 @@ NOTE: For testing, you can use alternatives to Nginx such as https://learn.micro
[discrete]
== Configure your reverse proxy
-NOTE: If your Elastic instance is on the same host as LM Studio, you can skip this step.
+NOTE: If your Elastic instance is on the same host as LM Studio, you can skip this step. Also, check out our https://www.elastic.co/blog/herding-llama-3-1-with-elastic-and-lm-studio[blog post] that walks through the whole process of setting up a single-host implementation.
You need to set up a reverse proxy to enable communication between LM Studio and Elastic. For more complete instructions, refer to a guide such as https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-as-a-reverse-proxy-on-ubuntu-22-04[this one].
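The example configuration file this page refers to is not shown in this diff hunk. As a rough sketch only, an Nginx configuration that enforces a shared secret token and forwards traffic to LM Studio's default port might look like the following (the plain-HTTP listener, token check, and domain are illustrative placeholders; a real deployment would also terminate TLS):

[source,nginx]
----
server {
    listen 80;
    server_name <yourdomainname.com>;

    location / {
        # Reject any request that does not present the shared secret token
        if ($http_authorization != "Bearer <secret token>") {
            return 401;
        }
        # Forward authenticated requests to LM Studio's local server
        proxy_pass http://localhost:1234/;
    }
}
----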
-IMPORTANT: If using the example configuration file above, you must replace several values: Replace `<secret token>` with your actual token, and keep it safe since you'll need it to set up the {elastic-sec} connector. Replace `<yourdomainname.com>` with your actual domain name. Update the `proxy_pass` value at the bottom of the configuration if you decide to change the port number in LM Studio to something other than 1234.
+[IMPORTANT]
+====
+If using the example configuration file above, you must replace several values:
+
+* Replace `<secret token>` with your actual token, and keep it safe since you'll need it to set up the {elastic-sec} connector.
+* Replace `<yourdomainname.com>` with your actual domain name.
+* Update the `proxy_pass` value at the bottom of the configuration if you decide to change the port number in LM Studio to something other than 1234.
+====
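Once the proxy is in place, a quick way to confirm the token check works is to query LM Studio's OpenAI-compatible model-listing endpoint through the proxy. This is a hypothetical smoke test; substitute your real token and domain:

[source,sh]
----
# Should return a JSON list of models when the token matches,
# and a 401 from Nginx when it does not.
curl -H "Authorization: Bearer <secret token>" \
  https://<yourdomainname.com>/v1/models
----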
[discrete]
=== (Optional) Set up performance monitoring for your reverse proxy
@@ -85,23 +92,20 @@ You can use Elastic's {integrations-docs}/nginx[Nginx integration] to monitor pe
First, install https://lmstudio.ai/[LM Studio]. LM Studio supports the OpenAI SDK, which makes it compatible with Elastic's OpenAI connector, allowing you to connect to any model available in the LM Studio marketplace.
-One current limitation of LM Studio is that when it is installed on a server, you must launch the application using its GUI before doing so using the CLI. For example, by using Chrome RDP with an https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine[X Window System]. After you've opened the application the first time using the GUI, you can start it by using `sudo lms server start` in the CLI.
+You must launch the application using its GUI before doing so using the CLI. For example, use Chrome RDP with an https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine[X Window System]. After you've opened the application for the first time using the GUI, you can start it with `sudo lms server start` in the CLI.
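For reference, after that first GUI launch, subsequent headless starts use the command mentioned above:

[source,sh]
----
# Start the LM Studio server from the CLI (after the initial GUI launch)
sudo lms server start
----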
Once you've launched LM Studio:
1. Go to LM Studio's Search window.
-2. Search for an LLM (for example, `Mixtral-8x7B-instruct`). Your chosen model must include `instruct` in its name in order to work with Elastic.
-3. Filter your search for "Compatibility Guess" to optimize results for your hardware. Results will be color coded:
-* Green means "Full GPU offload possible", which yields the best results.
-* Blue means "Partial GPU offload possible", which may work.
-* Red for "Likely too large for this machine", which typically will not work.
+2. Search for an LLM (for example, `Mistral-Nemo-Instruct-2407`). Your chosen model must include `instruct` in its name in order to work with Elastic.
+3. After you find a model, view download options and select a recommended version (green). For best performance, select one with the thumbs-up icon that indicates good performance on your hardware.
4. Download one or more models.
IMPORTANT: For security reasons, before downloading a model, verify that it is from a trusted source. It can be helpful to review community feedback on the model (for example using a site like Hugging Face).
image::images/lms-model-select.png[The LM Studio model selection interface]
-In this example we used https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF[`TheBloke/Mixtral-8x7B-Instruct-v0.1.Q3_K_M.gguf`]. It has 46.7B total parameters, a 32,000 token context window, and uses GGUF https://huggingface.co/docs/transformers/main/en/quantization/overview[quantization]. For more information about model names and formats, refer to the following table.
+In this example we used https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407[`mistralai/Mistral-Nemo-Instruct-2407`]. It has 12B total parameters, a 128,000 token context window, and uses GGUF https://huggingface.co/docs/transformers/main/en/quantization/overview[quantization]. For more information about model names and formats, refer to the following table.
[cols="1,1,1,1", options="header"]
|===
@@ -124,18 +128,18 @@ After downloading a model, load it in LM Studio using the GUI or LM Studio's htt
[discrete]
=== Option 1: load a model using the CLI (Recommended)
-It is a best practice to download models from the marketplace using the GUI, and then load or unload them using the CLI. The GUI allows you to search for models, whereas the CLI only allows you to import specific paths, but the CLI provides a good interface for loading and unloading.
+It is a best practice to download models from the marketplace using the GUI, and then load or unload them using the CLI. The GUI allows you to search for models, and the CLI allows you to search using `lms get`. The CLI provides a good interface for loading and unloading.
-Use the following commands in your CLI:
+Once you've downloaded a model, use the following commands in your CLI (shown together in the sketch after this list):
1. Verify LM Studio is installed: `lms`
2. Check LM Studio's status: `lms status`
3. List all downloaded models: `lms ls`
-4. Load a model: `lms load`
+4. Load a model: `lms load`.
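Taken together, the sequence might look like this (the interactive picker behavior of a bare `lms load` is our assumption; you can also pass a model identifier directly):

[source,sh]
----
lms          # 1. Verify LM Studio is installed
lms status   # 2. Check LM Studio's status
lms ls       # 3. List all downloaded models
lms load     # 4. Load a model (opens an interactive picker)
----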
image::images/lms-cli-welcome.png[The CLI interface during execution of initial LM Studio commands]
After the model loads, you should see a `Model loaded successfully` message in the CLI.
image::images/lms-studio-model-loaded-msg.png[The CLI message that appears after a model loads]
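With a model loaded, you can sanity-check the OpenAI-compatible endpoint mentioned earlier before wiring up the {elastic-sec} connector. A hypothetical local smoke test follows; the model identifier is a placeholder, so use the name that `lms ls` reports:

[source,sh]
----
# Ask the locally loaded model for a completion via LM Studio's
# OpenAI-compatible REST API (default port 1234).
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-nemo-instruct-2407",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
----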
@@ -156,8 +160,8 @@ Refer to the following video to see how to load a model using LM Studio's GUI. Y