solutions/security/ai/connect-to-vLLM.md (16 additions, 2 deletions)
@@ -91,7 +91,7 @@ vllm/vllm-openai:v0.9.1 \
`--tensor-parallel-size 2`: This value should match the number of available GPUs (in this case, 2). This is critical for performance on multi-GPU systems.
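For context, this flag appears at the end of the `docker run` invocation referenced in the hunk header above. A minimal sketch of where it sits (the port, container name, and model ID here are placeholders, not the documented values):

```sh
docker run --gpus all \
  -p 8000:8000 \
  --name vllm \
  vllm/vllm-openai:v0.9.1 \
  --model <your-model-id> \
  --tensor-parallel-size 2   # set to the number of GPUs on the host
```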
=====
3. Verify the container's status by running the `docker ps -a` command. The output should show the value you specified for the `--name` parameter.
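For example, assuming the container was started with `--name vllm` (an assumption; substitute the name you chose):

```sh
docker ps -a --filter "name=vllm" --format "table {{.Names}}\t{{.Status}}"
```

A status beginning with `Up` indicates the container is running.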
## Step 3: Expose the API with a reverse proxy
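The proxy configuration itself falls outside this hunk, but a reverse proxy of this kind might look like the following sketch (the server name, upstream port, and token check are assumptions, not the documented configuration):

```nginx
server {
    listen 443 ssl;
    server_name vllm.example.com;
    # ssl_certificate and ssl_certificate_key directives go here

    location / {
        # Reject requests that do not carry the expected secret token
        if ($http_authorization != "Bearer YOUR_SECRET_TOKEN") {
            return 401;
        }
        proxy_pass http://localhost:8000;
    }
}
```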
@@ -144,8 +144,22 @@ Finally, create the connector within your Elastic deployment to link it to your
* For **API key**, enter the secret token you created in Step 1 and specified in your Nginx configuration file.
* If your chosen model supports tool use, then turn on **Enable native function calling**.
7. Click **Save**.
8. Finally, open the **AI Assistant for Security** page using the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
    * On the **Conversations** tab, turn off **Streaming**.
    * If your model supports tool use, then on the **System prompts** page, create a new system prompt with a variation of the following prompt, to prevent your model from returning tool calls in AI Assistant conversations:

    ```
    You are a model running under OpenAI-compatible tool calling mode.

    Rules:
    1. When you want to invoke a tool, never describe the call in text.
    2. Always return the invocation in the `tool_calls` field.
    3. The `content` field must remain empty for any assistant message that performs a tool call.
    4. Only use tool calls defined in the "tools" parameter.
    ```
Setup is now complete. The model served by your vLLM container can now power Elastic's generative AI features.
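To illustrate what the system prompt above enforces, here is a minimal Python sketch of a well-formed OpenAI-compatible tool-call message. The payload shape follows the OpenAI chat completions schema; the function name and arguments are hypothetical examples, not part of Elastic's API:

```python
import json

# Hypothetical assistant message from an OpenAI-compatible /v1/chat/completions
# response when the model performs a tool call: `content` stays empty and the
# invocation lives entirely in `tool_calls`.
assistant_message = {
    "role": "assistant",
    "content": None,  # rule 3: no prose alongside a tool call
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {
                # The function name must be one defined in the request's
                # "tools" parameter (rule 4); this name is made up.
                "name": "get_alert_summary",
                "arguments": json.dumps({"severity": "high"}),
            },
        }
    ],
}

def is_valid_tool_call_message(msg: dict) -> bool:
    """Check the two properties the system prompt enforces:
    the call is in `tool_calls` (rule 2) and `content` is empty (rule 3)."""
    has_calls = bool(msg.get("tool_calls"))
    empty_content = not msg.get("content")
    return has_calls and empty_content

print(is_valid_tool_call_message(assistant_message))  # → True
```

A message that instead describes the call in its `content` text (the behavior the prompt forbids) would fail this check.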
:::{note}
To run a different model, stop the current container and run a new one with an updated `--model` parameter.
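For example (the container name and model ID are placeholders):

```sh
docker stop vllm && docker rm vllm
docker run --gpus all -p 8000:8000 --name vllm \
  vllm/vllm-openai:v0.9.1 \
  --model <new-model-id> \
  --tensor-parallel-size 2
```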