
Commit 080b837

Add LLM sections
Signed-off-by: Michael Yuan <[email protected]>
1 parent d844e3b commit 080b837

File tree

3 files changed (+256, -3 lines)


doc/docs/config/llm-tools.md

Lines changed: 150 additions & 0 deletions
@@ -4,4 +4,154 @@ sidebar_position: 4
# LLM with tools

The [responses API](https://platform.openai.com/docs/api-reference/responses) was pioneered by OpenAI to support advanced LLM features such as tool use and the code interpreter. It is recommended to use the LLM provider's built-in tools with the responses API.

## A simple example

The following `config.toml` example shows how to use OpenAI's responses API. Since it is a stateful API, the EchoKit server only needs to send the last user query. The LLM provider (OpenAI in this case) manages the conversation history.

```toml
[llm]
llm_chat_url = "https://api.openai.com/v1/responses"
api_key = "sk_ABCD"
model = "gpt-5-nano"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.

"""
```
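
Because the responses API is stateful, a follow-up request can reference the previous response instead of resending the history. Below is a hedged sketch of the kind of JSON payload involved; the `previous_response_id` value is made up for illustration.

```json
{
  "model": "gpt-5-nano",
  "previous_response_id": "resp_abc123",
  "input": "And what about the weather tomorrow?"
}
```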
## Web search

With the OpenAI provider, you can use the built-in `web_search_preview` tool. OpenAI will first determine whether the current user query requires a web search. If so, it performs the search first, and then uses the LLM to generate the response based on the search results. The search results are also included in the LLM history for subsequent user queries.

Below is an example for OpenAI. It adds an extra tool called `web_search_preview` and instructs the LLM to use it. The actual implementation of the `web_search_preview` tool is provided by OpenAI itself.

```toml
[llm]
llm_chat_url = "https://api.openai.com/v1/responses"
api_key = "sk_ABCD"
model = "gpt-5-nano"

[[llm.extra.tools]]
type = "web_search_preview"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.
- Use the web_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
Other providers offer similar web search tools that you can pass in `[[llm.extra.tools]]`. Below is an example for xAI Grok's responses API. As the commented-out line shows, it also supports search filters. Grok additionally provides an `x_search` tool to specifically search for posts on x.com.

```toml
[llm]
llm_chat_url = "https://api.x.ai/v1/responses"
api_key = "xai_ABCD"
model = "grok-4-1-fast-non-reasoning"

[[llm.extra.tools]]
type = "web_search"
# filters = { "allowed_domains" = ["wikipedia.org"] }

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.
- Use the web_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
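
The `x_search` tool could be enabled the same way. This is only a hypothetical sketch, assuming `x_search` takes a `type` field like the other built-in tools; check xAI's documentation for the fields it actually accepts.

```toml
# Hypothetical sketch: enable Grok's x_search tool for searching posts on x.com.
# The exact set of supported fields is an assumption.
[[llm.extra.tools]]
type = "x_search"
```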
Below is an example of Groq's responses API. Again, the name of the built-in search tool is different: it is called `browser_search` in Groq.

```toml
[llm]
llm_chat_url = "https://api.groq.com/openai/v1/chat/responses"
api_key = "gsk_ABCD"
model = "openai/gpt-oss-20b"

[[llm.extra.tools]]
type = "browser_search"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.
- Use the browser_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
## MCP tools

The web search functionalities are supported as built-in LLM tools. In fact, you can add your own tools through custom [MCP servers](https://modelcontextprotocol.io/).

Here is an example `config.toml` for xAI Grok to use a custom MCP server. All the tools in that MCP server will be added to the LLM request. When the LLM returns a tool call for any of these tools, Grok will call the MCP server to execute the tool call. The tool call results are then sent back to the LLM, and Grok will generate a response based on those results.

```toml
[llm]
llm_chat_url = "https://api.x.ai/v1/responses"
api_key = "xai_ABCD"
model = "grok-4-1-fast-non-reasoning"

[[llm.extra.tools]]
type = "mcp"
server_url = "https://mcp.deepwiki.com/mcp"
server_label = "deepwiki"

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.

"""
```
## Next step

The `[[llm.extra.tools]]` mechanism can only call MCP servers that are accessible to the cloud provider. For the EchoKit server, we sometimes want to call tools that are local to the edge server, such as home automation APIs. That requires support for local MCP servers.

doc/docs/config/llm.md

Lines changed: 103 additions & 0 deletions
@@ -4,4 +4,107 @@ sidebar_position: 3
# LLM services

The EchoKit server uses LLM services to generate responses to user queries. Most popular LLM services support OpenAI's `/v1/chat/completions` API.

## Simple example

The following example configures the EchoKit server to use the OpenAI LLM service.

```toml
[llm]
llm_chat_url = "https://api.openai.com/v1/chat/completions"
api_key = "sk_ABCD"
model = "gpt-5-nano"
history = 5
```

Since the `/v1/chat/completions` API is stateless, the EchoKit server must send the complete chat history together with the new user query in every request. The `history` parameter specifies how many conversation turns to include.
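To illustrate what the stateless API looks like on the wire, here is a hypothetical `/v1/chat/completions` payload that carries earlier conversation turns along with the new user query. The conversation content is made up for illustration.

```json
{
  "model": "gpt-5-nano",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the 2022 World Cup?"},
    {"role": "assistant", "content": "Argentina won the 2022 World Cup."},
    {"role": "user", "content": "Who was the team captain?"}
  ]
}
```
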
## System prompt

A key feature of modern LLMs is the system prompt, which instructs the LLM how to respond to the user query. You can put the speaking style and background knowledge into the system prompt. You can also specify the available tools (e.g., web search) and actions in the system prompt.

In `config.toml`, you give the LLM system prompt in the `[[llm.sys_prompts]]` field.

```toml
[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely. Always answer in English.

- NEVER use bullet points
- NEVER use tables
- Answer in complete English sentences as if you are in a conversation.

"""
```
## Dynamic prompt

The EchoKit server can dynamically load web-based content into the system prompt. It is a great way to update the system prompt without having to restart the server itself. In the following example, the server loads the content from the URL and replaces the `{{ url }}` placeholder with it.

```toml
[[llm.sys_prompts]]
role = "system"
content = """
{{ https://raw.githubusercontent.com/alabulei1/echokit-dynamic-prompt/refs/heads/main/prompt.txt }}
"""
```

The dynamic system prompt is reloaded only after the EchoKit device restarts, triggered by one of the following events:

* Power-on
* Pressing the RST button

It remains unchanged after normal interruptions or network reconnections.
## Qwen web search

Some LLM service providers support additional JSON structures in the request to accomplish additional tasks. A good example is Qwen's web search function. When the additional JSON data is passed in, the Alibaba Cloud provider will use the LLM to decide whether it needs a web search to answer the user query. If so, it will perform the web search and have the LLM generate a response based on the search results.

In the following example, we pass the `enable_search = true` parameter with each LLM request. We also tell the LLM, in the system prompt, to use the search tool when needed.

```toml
[llm]
llm_chat_url = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"
api_key = "sk-API-KEY"
model = "qwen-plus"
history = 5

[llm.extra]
enable_search = true

[[llm.sys_prompts]]
role = "system"
content = """
You are a helpful assistant. Answer truthfully and concisely.

- NEVER use bullet points
- NEVER use tables
- Answer in complete sentences as if you are in a conversation.
- Use the web_search tool if you need information about current events such as news, political figures, stock prices, and crypto prices.

"""
```
You can pass any JSON parameter supported by the LLM API provider in the `[llm.extra]` field.
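
For example, provider-specific sampling parameters could be forwarded the same way. This is only a sketch; the parameter names here are assumptions, and which fields are honored depends entirely on your provider's API.

```toml
# Hypothetical sketch: extra JSON fields sent with every LLM request.
# Whether a given field is accepted depends on the LLM provider.
[llm.extra]
temperature = 0.7
max_tokens = 512
```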
## Next step

While the stateless `/v1/chat/completions` API is widely supported, OpenAI and many providers in the ecosystem have shifted their focus to the new stateful `/v1/responses` API. The new responses API makes it easier to support tools, including web searches, in LLM applications.

doc/docs/config/mcp.md

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ sidebar_position: 5
  # Tool calls and actions

  EchoKit supports **MCP (Model Context Protocol)**, which allows LLMs to call external tools and actions.
- With actions, you can extend EchoKit beyond conversation—for example:
+ With tools, you can extend EchoKit beyond voice conversations, such as:

  - Manage your Google Calendar
  - Send emails
@@ -21,7 +21,7 @@ Before adding MCP tools, make sure that you
  - Have access to an **MCP server**. You can use a public MCP server or run one locally on your machine. The MCP server can be either:
    - **SSE MCP server**, or
    - **HTTP streamable MCP server**
- - Use an LLM model that is capable of **tool use** (or, **tool calling**)
+ - Use an LLM model that is capable of **tool use** (or **tool calling**). You can use either the `/v1/chat/completions` or the `/v1/responses` API.

  ## 2. Add MCP servers to EchoKit
@@ -60,7 +60,7 @@ Once your MCP server is set and the configuration updated, restart EchoKit follo
  EchoKit will now be able to call external actions via MCP.

- ✅ With this setup, your EchoKit is no longer just a voice assistant—it can interact with external systems and become a **programmable AI agent**.
+ ✅ With this setup, your EchoKit is no longer just a chatbot — it can interact with external systems and become a **programmable AI agent**.