Commit 9434b1d
feat(agent): iterative tool-loop + loop-limit (#126)
* feat(agent): iterative tool-loop + loop-limit (`ll`) (default 3). Execute model-requested tool calls iteratively; configurable via `loop-limit` (`ll`) to avoid infinite loops.
* feat: add `loop-limit` to the autocomplete dict, now alphabetically ordered
* chore(ux): improve some messages
* docs(agent): feature Agent Mode in the main README.md
* fix: add missing available-tools argument to the follow-up chat params
* docs(agent): add Agent Mode quick demo made with asciicast
* chore(release): bump version to 0.22.0
1 parent d33a9e6 commit 9434b1d

File tree

9 files changed: +160 −43 lines changed

README.md

Lines changed: 35 additions & 15 deletions
@@ -32,6 +32,8 @@
 - [Usage](#usage)
 - [Command-line Arguments](#command-line-arguments)
 - [Usage Examples](#usage-examples)
+- [How Tool Calls Work](#how-tool-calls-work)
+- **NEW** [Agent Mode](#agent-mode)
 - [Interactive Commands](#interactive-commands)
 - [Tool and Server Selection](#tool-and-server-selection)
 - [Model Selection](#model-selection)
@@ -56,6 +58,7 @@ MCP Client for Ollama (`ollmcp`) is a modern, interactive terminal application (
 
 ## Features
 
+- 🤖 **Agent Mode**: Iterative tool execution when models request multiple tool calls, with a configurable loop limit to prevent infinite loops
 - 🌐 **Multi-Server Support**: Connect to multiple MCP servers simultaneously
 - 🚀 **Multiple Transport Types**: Supports STDIO, SSE, and Streamable HTTP server connections
 - ☁️ **Ollama Cloud Support**: Works seamlessly with Ollama Cloud models for tool calling, enabling access to powerful cloud-hosted models while using local MCP tools
@@ -244,24 +247,24 @@ During chat, use these commands:
 
 | Command | Shortcut | Description |
 |------------------|------------------|-----------------------------------------------------|
+| `clear` | `cc` | Clear conversation history and context |
+| `cls` | `clear-screen` | Clear the terminal screen |
+| `context` | `c` | Toggle context retention |
+| `context-info` | `ci` | Display context statistics |
 | `help` | `h` | Display help and available commands |
-| `tools` | `t` | Open the tool selection interface |
+| `human-in-loop` | `hil` | Toggle Human-in-the-Loop confirmations for tool execution |
+| `load-config` | `lc` | Load tool and model configuration from a file |
+| `loop-limit` | `ll` | Set maximum iterative tool-loop iterations (Agent Mode). Default: 3 |
 | `model` | `m` | List and select a different Ollama model |
 | `model-config` | `mc` | Configure advanced model parameters and system prompt|
-| `context` | `c` | Toggle context retention |
-| `thinking-mode` | `tm` | Toggle thinking mode (e.g., gpt-oss, deepseek-r1, qwen3) |
+| `quit`, `exit`, `bye` | `q` or `Ctrl+D` | Exit the client |
+| `reload-servers` | `rs` | Reload all MCP servers with current configuration |
+| `reset-config` | `rc` | Reset configuration to defaults (all tools enabled) |
+| `save-config` | `sc` | Save current tool and model configuration to a file |
+| `show-metrics` | `sm` | Toggle performance metrics display |
 | `show-thinking` | `st` | Toggle thinking text visibility |
 | `show-tool-execution` | `ste` | Toggle tool execution display visibility |
-| `show-metrics` | `sm` | Toggle performance metrics display |
-| `human-in-loop` | `hil` | Toggle Human-in-the-Loop confirmations for tool execution |
-| `clear` | `cc` | Clear conversation history and context |
-| `context-info` | `ci` | Display context statistics |
-| `cls` | `clear-screen` | Clear the terminal screen |
-| `save-config` | `sc` | Save current tool and model configuration to a file |
-| `load-config` | `lc` | Load tool and model configuration from a file |
-| `reset-config` | `rc` | Reset configuration to defaults (all tools enabled) |
-| `reload-servers` | `rs` | Reload all MCP servers with current configuration |
-| `quit`, `exit`, `bye` | `q` or `Ctrl+D` | Exit the client |
+| `tools` | `t` | Open the tool selection interface |
 
 
 ### Tool and Server Selection
@@ -621,13 +624,30 @@ For more information about Ollama Cloud, visit the [Ollama Cloud documentation](
 1. The client sends your query to Ollama with a list of available tools
 2. If Ollama decides to use a tool, the client:
    - Displays the tool execution with formatted arguments and syntax highlighting
-   - **NEW**: Shows a Human-in-the-Loop confirmation prompt (if enabled) allowing you to review and approve the tool call
+   - Shows a Human-in-the-Loop confirmation prompt (if enabled) allowing you to review and approve the tool call
    - Extracts the tool name and arguments from the model response
    - Calls the appropriate MCP server with these arguments (only if approved or HIL is disabled)
    - Shows the tool response in a structured, easy-to-read format
-   - Sends the tool result back to Ollama for final processing
+   - Sends the tool result back to Ollama
+   - If in Agent Mode, repeats the process if the model requests more tool calls
+3. Finally, the client:
    - Displays the model's final response incorporating the tool results
 
+### Agent Mode
+
+Some models may request multiple tool calls in a single conversation. The client supports an **Agent Mode** that allows for iterative tool execution:
+- When the model requests a tool call, the client executes it and sends the result back to the model
+- This process repeats until the model provides a final answer or reaches the configured loop limit
+- You can set the maximum number of iterations using the `loop-limit` (`ll`) command
+- The default loop limit is `3` to prevent infinite loops
+
+> [!NOTE]
+> If you want to prevent using Agent Mode, simply set the loop limit to `1`.
+
+#### Agent Mode Quick Demo:
+
+[![asciicast](https://asciinema.org/a/476qpEamCX9TFQt4jNEXIgHxS.svg)](https://asciinema.org/a/476qpEamCX9TFQt4jNEXIgHxS)
+
 ## Where Can I Find More MCP Servers?
 
 You can explore a collection of MCP servers in the official [MCP Servers repository](https://github.com/modelcontextprotocol/servers).
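The Agent Mode flow described in the README changes above is easy to picture as a small loop. Below is a minimal Python sketch of that pattern; `chat` and `call_tool` are hypothetical stand-ins for the Ollama chat call and an MCP tool invocation, not this client's actual API — only the loop-limit behaviour mirrors what the commit implements.

```python
# A minimal sketch of the Agent Mode loop, assuming caller-supplied `chat`
# and `call_tool` callables. These names are illustrative, not the client's API.

def agent_loop(chat, call_tool, messages, tools, loop_limit=3):
    """Run model-requested tool calls iteratively, up to loop_limit rounds."""
    response = chat(messages=messages, tools=tools)
    loops = 0
    while response.tool_calls:
        if loops >= loop_limit:
            print(f"Loop limit ({loop_limit}) reached; skipping additional tool calls.")
            break
        loops += 1
        for call in response.tool_calls:
            result = call_tool(call.name, call.arguments)  # executed by the MCP server
            messages.append({"role": "tool", "name": call.name, "content": result})
        # Tools must be passed again, or the model cannot request another round
        response = chat(messages=messages, tools=tools)
    return response.content
```

With `loop_limit=1`, the first round of tool calls still executes, but any further requests are skipped — which is why the README note suggests a limit of `1` to effectively disable Agent Mode.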

cli-package/pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 [project]
 name = "ollmcp"
-version = "0.21.0"
+version = "0.22.0"
 description = "CLI for MCP Client for Ollama - An easy-to-use command for interacting with Ollama through MCP"
 readme = "README.md"
 requires-python = ">=3.10"
@@ -9,7 +9,7 @@ authors = [
     {name = "Jonathan Löwenstern"}
 ]
 dependencies = [
-    "mcp-client-for-ollama==0.21.0"
+    "mcp-client-for-ollama==0.22.0"
 ]
 
 [project.scripts]

mcp_client_for_ollama/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 """MCP Client for Ollama package."""
 
-__version__ = "0.21.0"
+__version__ = "0.22.0"

mcp_client_for_ollama/client.py

Lines changed: 90 additions & 6 deletions
@@ -69,6 +69,8 @@ def __init__(self, model: str = DEFAULT_MODEL, host: str = DEFAULT_OLLAMA_HOST):
         self.show_tool_execution = True # By default, show tool execution displays
         # Metrics display settings
         self.show_metrics = False # By default, don't show metrics after each query
+        # Agent mode settings
+        self.loop_limit = 3 # Maximum follow-up tool loops per query
         self.default_configuration_status = False # Track if default configuration was loaded successfully
 
         # Store server connection parameters for reloading
@@ -259,7 +261,8 @@ async def process_query(self, query: str) -> str:
         }
 
         # Add thinking parameter if thinking mode is enabled and model supports it
-        if await self.supports_thinking_mode():
+        supports_thinking = await self.supports_thinking_mode()
+        if supports_thinking:
             chat_params["think"] = self.thinking_mode
 
         # Initial Ollama API call with the query and available tools
@@ -286,9 +289,26 @@ async def process_query(self, query: str) -> str:
         # Update actual token count from metrics if available
         if metrics and metrics.get('eval_count'):
             self.actual_token_count += metrics['eval_count']
-        # Check if there are any tool calls in the response
-        if len(tool_calls) > 0 and self.tool_manager.get_enabled_tool_objects():
-            for tool in tool_calls:
+
+        enabled_tools = self.tool_manager.get_enabled_tool_objects()
+
+        loop_count = 0
+        pending_tool_calls = tool_calls
+
+        # Keep looping while the model requests tools and we have capacity
+        while pending_tool_calls and enabled_tools:
+            if loop_count >= self.loop_limit:
+                self.console.print(Panel(
+                    f"[yellow]Your current loop limit is set to [bold]{self.loop_limit}[/bold] and has been reached. Skipping additional tool calls.[/yellow]\n"
+                    f"You will probably want to increase this limit if your model requires more tool interactions to complete tasks.\n"
+                    f"You can change the loop limit with the [bold cyan]loop-limit[/bold cyan] command.",
+                    title="[bold]Loop Limit Reached[/bold]", border_style="yellow", expand=False
+                ))
+                break
+
+            loop_count += 1
+
+            for tool in pending_tool_calls:
                 tool_name = tool.function.name
                 tool_args = tool.function.arguments
 
@@ -338,27 +358,39 @@ async def process_query(self, query: str) -> str:
                 "model": model,
                 "messages": messages,
                 "stream": True,
+                "tools": available_tools,
                 "options": model_options
             }
 
             # Add thinking parameter if thinking mode is enabled and model supports it
-            if await self.supports_thinking_mode():
+            if supports_thinking:
                 chat_params_followup["think"] = self.thinking_mode
 
             stream = await self.ollama.chat(**chat_params_followup)
 
             # Process the streaming response with thinking mode support
-            response_text, _, followup_metrics = await self.streaming_manager.process_streaming_response(
+            followup_response, pending_tool_calls, followup_metrics = await self.streaming_manager.process_streaming_response(
                 stream,
                 thinking_mode=self.thinking_mode,
                 show_thinking=self.show_thinking,
                 show_metrics=self.show_metrics
             )
 
+            messages.append({
+                "role": "assistant",
+                "content": followup_response,
+                "tool_calls": pending_tool_calls
+            })
+
             # Update actual token count from followup metrics if available
             if followup_metrics and followup_metrics.get('eval_count'):
                 self.actual_token_count += followup_metrics['eval_count']
 
+            if followup_response:
+                response_text = followup_response
+
+            enabled_tools = self.tool_manager.get_enabled_tool_objects()
+
         if not response_text:
             self.console.print("[red]No content response received.[/red]")
             response_text = ""
@@ -458,6 +490,10 @@ async def chat_loop(self):
                     await self.toggle_show_thinking()
                     continue
 
+                if query.lower() in ['loop-limit', 'll']:
+                    await self.set_loop_limit()
+                    continue
+
                 if query.lower() in ['show-tool-execution', 'ste']:
                     self.toggle_show_tool_execution()
                     continue
@@ -565,6 +601,9 @@ def print_help(self):
            "• Type [bold]show-thinking[/bold] or [bold]st[/bold] to toggle thinking text visibility\n"
            "• Type [bold]show-metrics[/bold] or [bold]sm[/bold] to toggle performance metrics display\n\n"
 
+           "[bold cyan]Agent Mode:[/bold cyan] [bold magenta](New!)[/bold magenta]\n"
+           "• Type [bold]loop-limit[/bold] or [bold]ll[/bold] to set the maximum tool loop iterations\n\n"
+
            "[bold cyan]MCP Servers and Tools:[/bold cyan]\n"
            "• Type [bold]tools[/bold] or [bold]t[/bold] to configure tools\n"
            "• Type [bold]show-tool-execution[/bold] or [bold]ste[/bold] to toggle tool execution display\n"
@@ -671,6 +710,28 @@ def toggle_show_metrics(self):
         else:
             self.console.print("[cyan]🔇 Performance metrics will be hidden for a cleaner output.[/cyan]")
 
+    async def set_loop_limit(self):
+        """Configure the maximum number of follow-up tool loops per query."""
+        user_input = await self.get_user_input(f"Loop limit (current: {self.loop_limit})")
+
+        if user_input is None:
+            return
+
+        value = user_input.strip()
+
+        if not value:
+            self.console.print("[yellow]Loop limit unchanged.[/yellow]")
+            return
+
+        try:
+            new_limit = int(value)
+            if new_limit < 1:
+                raise ValueError
+            self.loop_limit = new_limit
+            self.console.print(f"[green]🤖 Agent loop limit set to {self.loop_limit}![/green]")
+        except ValueError:
+            self.console.print("[red]Invalid loop limit. Please enter a positive integer.[/red]")
+
     def clear_context(self):
         """Clear conversation history and token count"""
         original_history_length = len(self.chat_history)
@@ -695,6 +756,7 @@ def display_context_stats(self):
             f"{thinking_status}"
             f"Tool execution display: [{'green' if self.show_tool_execution else 'red'}]{'Enabled' if self.show_tool_execution else 'Disabled'}[/{'green' if self.show_tool_execution else 'red'}]\n"
             f"Performance metrics: [{'green' if self.show_metrics else 'red'}]{'Enabled' if self.show_metrics else 'Disabled'}[/{'green' if self.show_metrics else 'red'}]\n"
+            f"Agent loop limit: [cyan]{self.loop_limit}[/cyan]\n"
             f"Human-in-the-Loop confirmations: [{'green' if self.hil_manager.is_enabled() else 'red'}]{'Enabled' if self.hil_manager.is_enabled() else 'Disabled'}[/{'green' if self.hil_manager.is_enabled() else 'red'}]\n"
             f"Conversation entries: {history_count}\n"
             f"Total tokens generated: {self.actual_token_count:,}",
@@ -730,6 +792,9 @@ def save_configuration(self, config_name=None):
                 "thinkingMode": self.thinking_mode,
                 "showThinking": self.show_thinking
             },
+            "agentSettings": {
+                "loopLimit": self.loop_limit
+            },
             "modelConfig": self.model_config_manager.get_config(),
             "displaySettings": {
                 "showToolExecution": self.show_tool_execution,
@@ -787,6 +852,14 @@ def load_configuration(self, config_name=None):
             if "showThinking" in config_data["modelSettings"]:
                 self.show_thinking = config_data["modelSettings"]["showThinking"]
 
+        if "agentSettings" in config_data:
+            if "loopLimit" in config_data["agentSettings"]:
+                try:
+                    loop_limit = int(config_data["agentSettings"]["loopLimit"])
+                    self.loop_limit = max(1, loop_limit)
+                except (TypeError, ValueError):
+                    pass
+
         # Load model configuration if specified
         if "modelConfig" in config_data:
             self.model_config_manager.set_config(config_data["modelConfig"])
@@ -833,6 +906,17 @@ def reset_configuration(self):
             # Default show thinking to True if not specified
             self.show_thinking = True
 
+        if "agentSettings" in config_data:
+            if "loopLimit" in config_data["agentSettings"]:
+                try:
+                    self.loop_limit = max(1, int(config_data["agentSettings"]["loopLimit"]))
+                except (TypeError, ValueError):
+                    self.loop_limit = 3
+            else:
+                self.loop_limit = 3
+        else:
+            self.loop_limit = 3
+
         # Reset display settings from the default configuration
         if "displaySettings" in config_data:
             if "showToolExecution" in config_data["displaySettings"]:

mcp_client_for_ollama/config/defaults.py

Lines changed: 3 additions & 0 deletions
@@ -23,6 +23,9 @@ def default_config() -> dict:
         "thinkingMode": True,
         "showThinking": False
     },
+    "agentSettings": {
+        "loopLimit": 3
+    },
     "modelConfig": {
         "system_prompt": "",
         "num_keep": None,

mcp_client_for_ollama/config/manager.py

Lines changed: 10 additions & 1 deletion
@@ -150,7 +150,8 @@ def reset_configuration(self) -> Dict[str, Any]:
             "• All tools enabled\n"
             "• Context retention enabled\n"
             "• Thinking mode enabled\n"
-            "• Thinking text hidden",
+            "• Thinking text hidden\n"
+            "• Agent loop limit reset",
             title="Config Reset", border_style="green", expand=False
         ))
 
@@ -211,6 +212,14 @@ def _validate_config(self, config_data: Dict[str, Any]) -> Dict[str, Any]:
             if "showThinking" in config_data["modelSettings"]:
                 validated["modelSettings"]["showThinking"] = bool(config_data["modelSettings"]["showThinking"])
 
+        if "agentSettings" in config_data and isinstance(config_data["agentSettings"], dict):
+            if "loopLimit" in config_data["agentSettings"]:
+                try:
+                    loop_limit = int(config_data["agentSettings"]["loopLimit"])
+                    validated["agentSettings"]["loopLimit"] = max(1, loop_limit)
+                except (TypeError, ValueError):
+                    pass
+
         if "modelConfig" in config_data and isinstance(config_data["modelConfig"], dict):
             model_config = config_data["modelConfig"]
             if "system_prompt" in model_config:

mcp_client_for_ollama/utils/constants.py

Lines changed: 17 additions & 16 deletions
@@ -27,26 +27,27 @@
 
 # Interactive commands and their descriptions for autocomplete
 INTERACTIVE_COMMANDS = {
-    'tools': 'Configure available tools',
-    'help': 'Show help information',
-    'model': 'Select Ollama model',
-    'model-config': 'Configure model parameters',
-    'context': 'Toggle context retention',
-    'thinking-mode': 'Toggle thinking mode',
-    'show-thinking': 'Toggle thinking visibility',
-    'show-tool-execution': 'Toggle tool execution display',
-    'show-metrics': 'Toggle performance metrics display',
+    'bye': 'Exit the application',
+    'clear-screen': 'Clear terminal screen',
     'clear': 'Clear conversation context',
     'context-info': 'Show context information',
-    'clear-screen': 'Clear terminal screen',
-    'save-config': 'Save current configuration',
-    'load-config': 'Load saved configuration',
-    'reset-config': 'Reset to default config',
-    'reload-servers': 'Reload MCP servers',
+    'context': 'Toggle context retention',
+    'exit': 'Exit the application',
+    'help': 'Show help information',
     'human-in-the-loop': 'Toggle HIL confirmations',
+    'load-config': 'Load saved configuration',
+    'loop-limit': 'Set agent max loop limit',
+    'model-config': 'Configure model parameters',
+    'model': 'Select Ollama model',
     'quit': 'Exit the application',
-    'exit': 'Exit the application',
-    'bye': 'Exit the application'
+    'reload-servers': 'Reload MCP servers',
+    'reset-config': 'Reset to default config',
+    'save-config': 'Save current configuration',
+    'show-metrics': 'Toggle performance metrics display',
+    'show-thinking': 'Toggle thinking visibility',
+    'show-tool-execution': 'Toggle tool execution display',
+    'thinking-mode': 'Toggle thinking mode',
+    'tools': 'Configure available tools'
 }
 
 # Default completion menu style (used by prompt_toolkit in interactive mode)

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "mcp-client-for-ollama"
-version = "0.21.0"
+version = "0.22.0"
 description = "MCP Client for Ollama - A client for connecting to Model Context Protocol servers using Ollama"
 readme = "README.md"
 requires-python = ">=3.10"

uv.lock

Lines changed: 1 addition & 1 deletion
(Generated lockfile; diff not rendered.)
