Commit 1ff499b

feat: add mcp server inference
1 parent 0ecabf2 commit 1ff499b

14 files changed: +351 -26 lines changed

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -6,4 +6,5 @@
 - The project uses `uv`, `ruff` and `mypy`
 - Run commands should be prefixed with `uv`: `uv run ...`
 - Use `asyncio` features, if such is needed
+- Prefer early returns
 - Absolutely no useless comments! Every class and method does not need to be documented (unless it is legitimetly complex or "lib-ish")

README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ make chat
 make dev
 ```
 
-Additional MCP servers are configured in `agent-chat-cli.config.yaml` and prompts added within the `prompts` folder.
+Additional MCP servers are configured in `agent-chat-cli.config.yaml` and prompts added within the `prompts` folder. By default, MCP servers are loaded dynamically via inference; set `mcp_server_inference: false` to load all servers at startup.
 
 ## Development

agent-chat-cli.config.yaml

Lines changed: 3 additions & 0 deletions
@@ -9,6 +9,9 @@ model: haiku
 # Enable streaming responses
 include_partial_messages: true
 
+# Enable dynamic MCP server inference
+mcp_server_inference: true
+
 # Named agents with custom configurations
 # agents:
 # sample_agent:

src/agent_chat_cli/app.py

Lines changed: 4 additions & 4 deletions
@@ -8,10 +8,10 @@
 from agent_chat_cli.components.chat_history import ChatHistory, MessagePosted
 from agent_chat_cli.components.thinking_indicator import ThinkingIndicator
 from agent_chat_cli.components.user_input import UserInput
-from agent_chat_cli.utils import AgentLoop
-from agent_chat_cli.utils.message_bus import MessageBus
+from agent_chat_cli.system.agent_loop import AgentLoop
+from agent_chat_cli.system.message_bus import MessageBus
+from agent_chat_cli.system.actions import Actions
 from agent_chat_cli.utils.logger import setup_logging
-from agent_chat_cli.utils.actions import Actions
 
 from dotenv import load_dotenv
 
@@ -20,7 +20,7 @@
 
 
 class AgentChatCLIApp(App):
-    CSS_PATH = "utils/styles.tcss"
+    CSS_PATH = "system/styles.tcss"
 
     BINDINGS = [
         Binding("ctrl+c", "quit", "Quit", show=False, priority=True),

src/agent_chat_cli/components/user_input.py

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 from agent_chat_cli.components.chat_history import MessagePosted
 from agent_chat_cli.components.thinking_indicator import ThinkingIndicator
 from agent_chat_cli.components.messages import Message
-from agent_chat_cli.utils.actions import Actions
+from agent_chat_cli.system.actions import Actions
 from agent_chat_cli.utils.enums import ControlCommand
 
 

src/agent_chat_cli/docs/architecture.md

Lines changed: 72 additions & 7 deletions
@@ -16,24 +16,34 @@ Textual widgets responsible for UI rendering:
 - **UserInput**: Handles user text input and submission
 - **ThinkingIndicator**: Shows when agent is processing
 
-### Utils Layer
+### System Layer
 
-#### Agent Loop (`agent_loop.py`)
+#### Agent Loop (`system/agent_loop.py`)
 Manages the conversation loop with Claude SDK:
 - Maintains async queue for user queries
 - Handles streaming responses
 - Parses SDK messages into structured AgentMessage objects
 - Emits AgentMessageType events (STREAM_EVENT, ASSISTANT, RESULT)
 - Manages session persistence via session_id
+- Supports dynamic MCP server inference and loading
+
+#### MCP Server Inference (`system/mcp_inference.py`)
+Intelligently determines which MCP servers are needed for each query:
+- Uses a persistent Haiku client for fast inference (~1-3s after initial boot)
+- Analyzes user queries to infer required servers
+- Maintains a cached set of inferred servers across conversation
+- Returns only newly needed servers to minimize reconnections
+- Can be disabled via `mcp_server_inference: false` config option
 
-#### Message Bus (`message_bus.py`)
+#### Message Bus (`system/message_bus.py`)
 Routes agent messages to appropriate UI components:
 - Handles streaming text updates
 - Mounts tool use messages
 - Controls thinking indicator state
 - Manages scroll-to-bottom behavior
+- Displays system messages (e.g., MCP server connection notifications)
 
-#### Actions (`actions.py`)
+#### Actions (`system/actions.py`)
 Centralizes all user-initiated actions and controls:
 - **quit()**: Exits the application
 - **query(user_input)**: Sends user query to agent loop queue
@@ -46,15 +56,20 @@ Actions are triggered via:
 - Keybindings in app.py (ESC → action_interrupt, Ctrl+N → action_new)
 - Text commands in user_input.py ("exit", "clear")
 
-#### Config (`config.py`)
+### Utils Layer
+
+#### Config (`utils/config.py`)
 Loads and validates YAML configuration:
 - Filters disabled MCP servers
 - Loads prompts from files
 - Expands environment variables
 - Combines system prompt with MCP server prompts
+- Provides `get_sdk_config()` to filter app-specific config before passing to SDK
 
 ## Data Flow
 
+### Standard Query Flow (with MCP Inference enabled)
+
 ```
 User Input
 
@@ -64,7 +79,16 @@ MessagePosted event → ChatHistory (immediate UI update)
 
 Actions.query(user_input) → AgentLoop.query_queue.put()
 
-Claude SDK (streaming response)
+AgentLoop: MCP Server Inference (if enabled)
+
+infer_mcp_servers(user_message) → Haiku query
+
+If new servers needed:
+- Post SYSTEM message ("Connecting to [servers]...")
+- Disconnect client
+- Reconnect with new servers (preserving session_id)
+
+Claude SDK (streaming response with connected MCP tools)
 
 AgentLoop._handle_message
 
@@ -73,9 +97,26 @@ AgentMessage (typed message) → MessageBus.handle_agent_message
 Match on AgentMessageType:
 - STREAM_EVENT → Update streaming message widget
 - ASSISTANT → Mount tool use widgets
+- SYSTEM → Display system notification
 - RESULT → Reset thinking indicator
 ```
 
+### Query Flow (with MCP Inference disabled)
+
+```
+User Input
+
+UserInput.on_input_submitted
+
+MessagePosted event → ChatHistory (immediate UI update)
+
+Actions.query(user_input) → AgentLoop.query_queue.put()
+
+Claude SDK (all servers pre-connected at startup)
+
+[Same as above from _handle_message onwards]
+```
+
 ### Control Commands Flow
 ```
 User Action (ESC, Ctrl+N, "clear", "exit")
@@ -138,14 +179,38 @@ class Message:
 Configuration is loaded from `agent-chat-cli.config.yaml`:
 - **system_prompt**: Base system prompt (supports file paths)
 - **model**: Claude model to use
-- **include_partial_messages**: Enable streaming
+- **include_partial_messages**: Enable streaming responses (default: true)
+- **mcp_server_inference**: Enable dynamic MCP server inference (default: true)
+  - When `true`: App boots instantly without MCP servers, connects only when needed
+  - When `false`: All enabled MCP servers load at startup (traditional behavior)
 - **mcp_servers**: MCP server configurations (filtered by enabled flag)
 - **agents**: Named agent configurations
 - **disallowed_tools**: Tool filtering
 - **permission_mode**: Permission handling mode
 
 MCP server prompts are automatically appended to the system prompt.
 
+### MCP Server Inference
+
+When `mcp_server_inference: true` (default):
+
+1. **Fast Boot**: App starts without connecting to any MCP servers
+2. **Smart Detection**: Before each query, Haiku analyzes which servers are needed
+3. **Dynamic Loading**: Only connects to newly required servers
+4. **Session Preservation**: Maintains conversation history when reconnecting with new servers
+5. **Performance**: ~1-3s inference latency after initial boot (first query ~8-12s)
+
+Example config:
+```yaml
+mcp_server_inference: true # or false to disable
+
+mcp_servers:
+  github:
+    description: "Search code, PRs, issues"
+    enabled: true
+# ... rest of config
+```
+
 ## User Commands
 
 ### Text Commands
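
The architecture notes in this diff say the inference step "maintains a cached set of inferred servers" and "returns only newly needed servers". `system/mcp_inference.py` itself is not shown here, so the following is only a minimal sketch of how that bookkeeping might work. The class name `InferenceCache` and its methods are hypothetical and are not taken from the commit.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceCache:
    """Hypothetical helper: remembers which MCP servers were already selected."""

    inferred: set[str] = field(default_factory=set)

    def merge(self, requested: list[str], available: set[str]) -> list[str]:
        """Record the requested servers and return only those not seen before."""
        # Drop names that are not configured (for example, a model hallucination).
        valid = [name for name in requested if name in available]
        # Keep only servers that are not already in the cache.
        new = [name for name in valid if name not in self.inferred]
        # dict.fromkeys removes duplicates while preserving first-seen order.
        new = list(dict.fromkeys(new))
        self.inferred.update(new)
        return new

    def reset(self) -> None:
        """Forget everything, as a new conversation would."""
        self.inferred.clear()
```

Under these assumptions, a second query that asks for an already connected server plus one new server yields only the new one, so the caller reconnects only when the returned list is non-empty.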
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-from agent_chat_cli.utils.agent_loop import AgentLoop
+from agent_chat_cli.system.agent_loop import AgentLoop
 from agent_chat_cli.utils.enums import ControlCommand
 from agent_chat_cli.components.chat_history import ChatHistory
 from agent_chat_cli.components.thinking_indicator import ThinkingIndicator
Lines changed: 75 additions & 8 deletions
@@ -13,8 +13,14 @@
     ToolUseBlock,
 )
 
-from agent_chat_cli.utils.config import load_config
+from agent_chat_cli.utils.config import (
+    load_config,
+    get_available_servers,
+    get_sdk_config,
+)
 from agent_chat_cli.utils.enums import AgentMessageType, ContentType, ControlCommand
+from agent_chat_cli.system.mcp_inference import infer_mcp_servers
+from agent_chat_cli.utils.logger import log_json
 
 
 @dataclass
@@ -31,34 +37,93 @@ def __init__(
     ) -> None:
         self.config = load_config()
         self.session_id = session_id
+        self.available_servers = get_available_servers()
+        self.inferred_servers: set[str] = set()
 
-        config_dict = self.config.model_dump()
-        if session_id:
-            config_dict["resume"] = session_id
-
-        self.client = ClaudeSDKClient(options=ClaudeAgentOptions(**config_dict))
+        self.client: ClaudeSDKClient
 
         self.on_message = on_message
         self.query_queue: asyncio.Queue[str | ControlCommand] = asyncio.Queue()
 
         self._running = False
         self.interrupting = False
 
-    async def start(self) -> None:
+    async def _initialize_client(self, mcp_servers: dict) -> None:
+        sdk_config = get_sdk_config(self.config)
+        sdk_config["mcp_servers"] = mcp_servers
+
+        if self.session_id:
+            sdk_config["resume"] = self.session_id
+
+        self.client = ClaudeSDKClient(options=ClaudeAgentOptions(**sdk_config))
+
         await self.client.connect()
 
+    async def start(self) -> None:
+        if self.config.mcp_server_inference:
+            await self._initialize_client(mcp_servers={})
+        else:
+            mcp_servers = {
+                name: config.model_dump()
+                for name, config in self.available_servers.items()
+            }
+
+            await self._initialize_client(mcp_servers=mcp_servers)
+
         self._running = True
 
         while self._running:
             user_input = await self.query_queue.get()
 
             if isinstance(user_input, ControlCommand):
                 if user_input == ControlCommand.NEW_CONVERSATION:
+                    self.inferred_servers.clear()
+
                     await self.client.disconnect()
-                    await self.client.connect()
+
+                    if self.config.mcp_server_inference:
+                        await self._initialize_client(mcp_servers={})
+                    else:
+                        mcp_servers = {
+                            name: config.model_dump()
+                            for name, config in self.available_servers.items()
+                        }
+
+                        await self._initialize_client(mcp_servers=mcp_servers)
                 continue
 
+            if self.config.mcp_server_inference:
+                inference_result = await infer_mcp_servers(
+                    user_message=user_input,
+                    available_servers=self.available_servers,
+                    inferred_servers=self.inferred_servers,
+                    session_id=self.session_id,
+                )
+
+                if inference_result["new_servers"]:
+                    server_list = ", ".join(inference_result["new_servers"])
+
+                    await self.on_message(
+                        AgentMessage(
+                            type=AgentMessageType.SYSTEM,
+                            data=f"Connecting to {server_list}...",
+                        )
+                    )
+
+                    await asyncio.sleep(0.1)
+
+                    await self.client.disconnect()
+
+                    mcp_servers = {
+                        name: config.model_dump()
+                        for name, config in inference_result["selected_servers"].items()
+                    }
+
+                    await self._initialize_client(mcp_servers=mcp_servers)
+
             self.interrupting = False
+
+            # Send query
             await self.client.query(user_input)
 
             async for message in self.client.receive_response():
@@ -71,6 +136,8 @@ async def start(self) -> None:
 
     async def _handle_message(self, message: Any) -> None:
         if isinstance(message, SystemMessage):
+            log_json(message.data)
+
             if message.subtype == AgentMessageType.INIT.value and message.data.get(
                 "session_id"
             ):
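
In the agent loop diff above, the "empty servers when inference is on, all servers otherwise" branch appears twice (at startup and on `NEW_CONVERSATION`), and `_initialize_client` layers `mcp_servers` and `resume` on top of the filtered config. Below is a rough sketch of that logic as pure functions over plain dicts, which might make it easier to test in isolation. The names `APP_ONLY_KEYS`, `build_sdk_options` and `startup_servers` are hypothetical. The real `get_sdk_config()` is not shown in this diff and may filter different keys.

```python
from typing import Any, Optional

# Assumption: keys the app understands but the SDK options object does not.
APP_ONLY_KEYS = {"mcp_server_inference"}


def build_sdk_options(
    config: dict[str, Any],
    mcp_servers: dict[str, Any],
    session_id: Optional[str],
) -> dict[str, Any]:
    """Mirror the steps in _initialize_client, but without touching the SDK."""
    # Strip app-specific settings before handing the dict to the SDK.
    options = {key: value for key, value in config.items() if key not in APP_ONLY_KEYS}
    options["mcp_servers"] = mcp_servers
    # Resuming an existing session keeps conversation history across reconnects.
    if session_id:
        options["resume"] = session_id
    return options


def startup_servers(inference_enabled: bool, available: dict[str, Any]) -> dict[str, Any]:
    """Choose which servers to connect at boot or after a new conversation."""
    # Early return: with inference on, start with no servers and add them on demand.
    if inference_enabled:
        return {}
    return dict(available)
```

With helpers like these, both call sites in `start()` could reduce to a single call such as `startup_servers(self.config.mcp_server_inference, servers)`, assuming `servers` is the already dumped server dict.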
