solutions/observability/connect-to-own-local-llm.md
+8 −8 lines changed
@@ -44,7 +44,7 @@ After you’ve opened the application for the first time using the GUI, you can
Once you’ve launched LM Studio:
1. Go to LM Studio’s Discover window.
- 2. Search for an LLM (for example, `Mistral-Nemo-Instruct-2407`). Your chosen model must include `instruct` in its name in order to work with Elastic.
+ 2. Search for an LLM (for example, `Llama 3.3`). Your chosen model must include `instruct` in its name (specified in the download options) in order to work with Elastic.
3. When selecting a model, models published by verified authors are recommended (indicated by the purple verification badge next to the model name).
4. After you find a model, view download options and select a recommended option (green). For best performance, select one with the thumbs-up icon that indicates good performance on your hardware.
5. Download one or more models.
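If you prefer to stay in the terminal, recent LM Studio releases also bundle a download command in the `lms` CLI. This is a sketch, and assumes your LM Studio version includes `lms get`; the model key is an example, so use the key shown for your chosen model in the Discover window:

```bash
# Download a model from the command line instead of the Discover window.
# Assumes a recent LM Studio build that ships the `lms get` command;
# the model key below is an example.
lms get llama-3.3-70b-instruct
```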
@@ -57,7 +57,7 @@ For security reasons, before downloading a model, verify that it is from a trust
:alt: The LM Studio model selection interface with download options
:::
- This [`mistralai/mistral-nemo-instruct-2407`](https://lmstudio.ai/models/mistralai/mistral-nemo-instruct-2407) model used in this example has 12B total parameters, a 128,000 token context window, and uses GGUF [quantization](https://huggingface.co/docs/transformers/main/en/quantization/overview). For more information about model names and formats, refer to the following table.
+ The [`llama-3.3-70b-instruct`](https://lmstudio.ai/models/meta/llama-3.3-70b) model used in this example has 70B total parameters, a 128,000-token context window, and uses GGUF [quantization](https://huggingface.co/docs/transformers/main/en/quantization/overview). For more information about model names and formats, refer to the following table.
| Model Name | Parameter Size | Tokens/Context Window | Quantization Format |
| --- | --- | --- | --- |
@@ -79,7 +79,7 @@ Once you’ve downloaded a model, use the following commands in your CLI:
1. Verify LM Studio is installed: `lms`
2. Check LM Studio’s status: `lms status`
3. List all downloaded models: `lms ls`
- 4. Load a model: `lms load mistralai/mistral-nemo-instruct-2407 --context-length 64000`.
+ 4. Load a model: `lms load llama-3.3-70b-instruct --context-length 64000`.
When loading a model, use the `--context-length` flag with a context window of 64,000 or higher; see the consolidated sketch after this list.
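Put together, a typical session with the commands above might look like this (the model key is an example; substitute the identifier that `lms ls` prints for your download):

```bash
# Confirm the CLI is installed and check LM Studio's status
lms
lms status

# List downloaded models to find the exact model key
lms ls

# Load the model with a 64,000-token context window (example key)
lms load llama-3.3-70b-instruct --context-length 64000
```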
@@ -114,14 +114,14 @@ Once the model is downloaded, it will appear in the "My Models" window in LM Studio
1. Navigate to the Developer window.
2. Click on the "Start server" toggle on the top left. Once the server is started, you'll see the address and port of the server. The port defaults to 1234.
- 3. Click on "Select a model to load" and pick the model `Mistral Nemo Instruct 2407` from the dropdown menu.
+ 3. Click on "Select a model to load" and pick the model `Llama 3.3 70B Instruct` from the dropdown menu.
4. Navigate to the "Load" tab on the right side of the LM Studio window and adjust the context window to 64,000. Reload the model to apply the changes. (A quick way to verify the server is sketched below.)
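To verify the server is up before wiring it into Elastic, you can query LM Studio's OpenAI-compatible API from a terminal. This sketch assumes the default port 1234 on the same machine:

```bash
# Lists the models the server currently exposes; the model you loaded
# should appear in the returned JSON
curl http://localhost:1234/v1/models
```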
::::{note}
To allow other devices on the same network to access the server, turn on "Serve on Local Network" in Settings.
@@ -134,16 +134,16 @@ Finally, configure the connector:
3. Name your connector to help keep track of the model version you are using.
4. Under **Select an OpenAI provider**, select **Other (OpenAI Compatible Service)**.
5. Under **URL**, enter the host's IP address and port, followed by `/v1/chat/completions`. (If you have a reverse proxy set up, enter the domain name specified in your Nginx configuration file followed by `/v1/chat/completions`.)
- 6. Under **Default model**, enter `mistralai/mistral-nemo-instruct-2407`.
+ 6. Under **Default model**, enter `llama-3.3-70b-instruct`.
7. Under **API key**, enter any value. (If you have a reverse proxy set up, enter the secret token specified in your Nginx configuration file.)
Setup is now complete. You can use the model you’ve loaded in LM Studio to power Elastic’s generative AI features.
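As a final end-to-end check, you can send the same kind of request the connector will issue. The host, port, and model name below are the example values used in this guide; the `Authorization` header only matters if a reverse proxy enforces a secret token:

```bash
# Send a minimal chat completion to the OpenAI-compatible endpoint.
# Replace the host, port, model, and token with your own values.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer anything" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Reply with the single word: ready"}]
  }'
```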
::::{note}
- While local (open-weight) LLMs offer greater privacy and control, they generally do not match the raw performance and advanced reasoning capabilities of proprietary models by LLM providers mentioned in [here](/solutions/observability/observability-ai-assistant.md)
+ While local (open-weight) LLMs offer greater privacy and control, they generally do not match the raw performance and advanced reasoning capabilities of the proprietary models from the LLM providers mentioned [here](/solutions/observability/observability-ai-assistant.md#obs-ai-set-up).