
Commit 080b837

Add LLM sections
Signed-off-by: Michael Yuan <[email protected]>
1 parent d844e3b commit 080b837

File tree

3 files changed (+256, -3 lines)


doc/docs/config/llm-tools.md

Lines changed: 150 additions & 0 deletions
@@ -4,4 +4,154 @@ sidebar_position: 4
# LLM with tools

The [responses API](https://platform.openai.com/docs/api-reference/responses) was pioneered by OpenAI to support advanced LLM features such as tool use and the code interpreter. It is recommended to use the LLM provider's built-in tools with the responses API.

## A simple example

The following `config.toml` example shows how to use OpenAI's responses API. Since it is a stateful API, the EchoKit server only needs to send the last user query. The LLM provider (OpenAI in this case) manages the conversation history.

```toml
[llm]
llm_chat_url = "https://api.openai.com/v1/responses"
api_key = "sk_ABCD"
model = "gpt-5-nano"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.

"""
```
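
Because the responses API is stateful, a follow-up request can reference the previous response instead of resending the history. Below is a hedged sketch of the kind of JSON payload involved; the `previous_response_id` value is made up for illustration.

```json
{
  "model": "gpt-5-nano",
  "previous_response_id": "resp_abc123",
  "input": "And what about the weather tomorrow?"
}
```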
## Web search

With the OpenAI provider, you can use the built-in `web_search_preview` tool. OpenAI will first determine whether the current user query requires a web search. If so, it performs the search first, and then uses the LLM to generate the response based on the search results. The search results are also included in the LLM history for subsequent user queries.

Below is an example for OpenAI. It adds an extra tool called `web_search_preview` and instructs the LLM to use it. The actual implementation of the `web_search_preview` tool is provided by OpenAI itself.

```toml
[llm]
llm_chat_url = "https://api.openai.com/v1/responses"
api_key = "sk_ABCD"
model = "gpt-5-nano"

[[llm.extra.tools]]
type = "web_search_preview"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.
- Use the web_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
Other providers offer similar web search tools that you can pass in `[[llm.extra.tools]]`. Below is an example for xAI Grok's responses API. As the commented-out line shows, it also supports search filters. Grok additionally provides an `x_search` tool to specifically search for posts on x.com.

```toml
[llm]
llm_chat_url = "https://api.x.ai/v1/responses"
api_key = "xai_ABCD"
model = "grok-4-1-fast-non-reasoning"

[[llm.extra.tools]]
type = "web_search"
# filters = { "allowed_domains" = ["wikipedia.org"] }

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.
- Use the web_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
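
The `x_search` tool could be enabled the same way. This is only a hypothetical sketch, assuming `x_search` takes a `type` field like the other built-in tools; check xAI's documentation for the fields it actually accepts.

```toml
# Hypothetical sketch: enable Grok's x_search tool for searching posts on x.com.
# The exact set of supported fields is an assumption.
[[llm.extra.tools]]
type = "x_search"
```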
Below is an example of Groq's responses API. Again, the name of the built-in search tool is different: it is called `browser_search` in Groq.

```toml
[llm]
llm_chat_url = "https://api.groq.com/openai/v1/chat/responses"
api_key = "gsk_ABCD"
model = "openai/gpt-oss-20b"

[[llm.extra.tools]]
type = "browser_search"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.
- Use the browser_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
## MCP tools

The web search functionalities are supported as built-in LLM tools. In fact, you can add your own tools through custom [MCP servers](https://modelcontextprotocol.io/).

Here is an example `config.toml` for xAI Grok to use a custom MCP server. All the tools in that MCP server will be added to the LLM request. When the LLM returns a tool call for any of these tools, Grok will call the MCP server to execute the tool call. The tool call results are then sent back to the LLM, and Grok will generate a response based on those results.

```toml
[llm]
llm_chat_url = "https://api.x.ai/v1/responses"
api_key = "xai_ABCD"
model = "grok-4-1-fast-non-reasoning"

[[llm.extra.tools]]
type = "mcp"
server_url = "https://mcp.deepwiki.com/mcp"
server_label = "deepwiki"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.

"""
```
## Next step

The `[[llm.extra.tools]]` mechanism can only call MCP servers that are accessible to the cloud provider. For the EchoKit server, we sometimes want to call tools that are local to the edge server, such as home automation APIs. That requires support for local MCP servers.

doc/docs/config/llm.md

Lines changed: 103 additions & 0 deletions
@@ -4,4 +4,107 @@ sidebar_position: 3
# LLM services

The EchoKit server uses LLM services to generate responses to user queries. Most popular LLM services support OpenAI's `/v1/chat/completions` API.

## Simple example

The following example configures the EchoKit server to use the OpenAI LLM service.

```toml
[llm]
llm_chat_url = "https://api.openai.com/v1/chat/completions"
api_key = "sk_ABCD"
model = "gpt-5-nano"
history = 5
```

Since the `/v1/chat/completions` API is stateless, the EchoKit server must send the complete chat history together with the new user query in every request. The `history` parameter specifies how many conversation turns to include.
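To illustrate what the stateless API looks like on the wire, here is a hypothetical `/v1/chat/completions` payload that carries earlier conversation turns along with the new user query. The conversation content is made up for illustration.

```json
{
  "model": "gpt-5-nano",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the 2022 World Cup?"},
    {"role": "assistant", "content": "Argentina won the 2022 World Cup."},
    {"role": "user", "content": "Who was the team captain?"}
  ]
}
```
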
## System prompt

A key feature of modern LLMs is the system prompt, which instructs the LLM how to respond to the user query. You can put the speaking style and background knowledge into the system prompt. You can also specify the available tools (e.g., web search) and actions in the system prompt.

In `config.toml`, you give the LLM system prompt in the `[[llm.sys_prompts]]` field.

```toml
[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.

"""
```
## Dynamic prompt

The EchoKit server can dynamically load web-based content into the system prompt. It is a great way to update the system prompt without having to restart the server itself. In the following example, the server loads the content from the URL and replaces the `{{ url }}` placeholder with it.

```toml
[[llm.sys_prompts]]
role = "system"
content = """
{{ https://raw.githubusercontent.com/alabulei1/echokit-dynamic-prompt/refs/heads/main/prompt.txt }}
"""
```

The dynamic system prompt is reloaded only after the EchoKit device restarts, triggered by one of the following events:

* Power-on
* Pressing the RST button

It remains unchanged after normal interruptions or network reconnections.
## Qwen web search

Some LLM service providers support additional JSON structures in the request to accomplish additional tasks. A good example is Qwen's web search function. When the additional JSON data is passed in, the Alibaba Cloud provider will use the LLM to decide whether it needs a web search to answer the user query. If so, it will perform the web search and have the LLM generate a response based on the search results.

In the following example, we pass the `enable_search = true` parameter with each LLM request. We also tell the LLM, in the system prompt, to use the search tool when needed.

```toml
[llm]
llm_chat_url = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"
api_key = "sk-API-KEY"
model = "qwen-plus"
history = 5

[llm.extra]
enable_search = true

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely.

- NEVER use bullet points
- NEVER use tables
- Answer in complete sentences as if you are in a conversation.
- Use the web_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
You can pass any JSON parameter supported by the LLM API provider in the `[llm.extra]` field.
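
For example, provider-specific sampling parameters could be forwarded the same way. This is only a sketch; the parameter names here are assumptions, and which fields are honored depends entirely on your provider's API.

```toml
# Hypothetical sketch: extra JSON fields sent with every LLM request.
# Whether a given field is accepted depends on the LLM provider.
[llm.extra]
temperature = 0.7
max_tokens = 512
```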
## Next step

While the stateless `/v1/chat/completions` API is widely supported, OpenAI and many providers in the ecosystem have shifted their focus to the new stateful `/v1/responses` API. The new responses API makes it easier to support tools, including web searches, in LLM applications.

doc/docs/config/mcp.md

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ sidebar_position: 5
  # Tool calls and actions

  EchoKit supports **MCP (Model Context Protocol)**, which allows LLMs to call external tools and actions.
- With actions, you can extend EchoKit beyond conversation—for example:
+ With tools, you can extend EchoKit beyond voice conversations, such as:

  - Manage your Google Calendar
  - Send emails
@@ -21,7 +21,7 @@ Before adding MCP tools, make sure that you
  - Have access to an **MCP server**. You can use a public MCP server or run one locally on your machine. The MCP server can be either:
    - **SSE MCP server**, or
    - **HTTP streamable MCP server**
- - Use an LLM model that is capable of **tool use** (or, **tool calling**)
+ - Use an LLM model that is capable of **tool use** (or **tool calling**). You can use either the `/v1/chat/completions` or the `/v1/responses` API.

  ## 2. Add MCP servers to EchoKit
@@ -60,7 +60,7 @@ Once your MCP server is set and the configuration updated, restart EchoKit follo
  EchoKit will now be able to call external actions via MCP.

- ✅ With this setup, your EchoKit is no longer just a voice assistant—it can interact with external systems and become a **programmable AI agent**.
+ ✅ With this setup, your EchoKit is no longer just a chatbot — it can interact with external systems and become a **programmable AI agent**.