Commit 1ee8bf0

#143 tools filtering logic
1 parent c66b752 commit 1ee8bf0

6 files changed: +306 -1 lines changed

docs/en/framework/agents.md

Lines changed: 38 additions & 0 deletions
@@ -319,6 +319,44 @@ class ResearchSGRAgent(SGRAgent):
 !!! Tip "State Machine for Tool Management... Or Something More"
     For more complex tool management logic, you can use a more serious state engine. This will allow you to explicitly define agent states and transition rules, simplifying the management of available tools at each stage of work.
 
+### Tools Filtering
+
+The framework provides a tools filtering mechanism that automatically selects only the tools relevant to the user prompt. This helps reduce context size and save tokens when working with large toolkits.
+
+**How it works:**
+
+- **System tools** (tools with `isSystemTool=True`) are always included regardless of the prompt
+- **Other tools** are kept when at least one of the following signals matches:
+    - **BM25 ranking**: Lexical (term-based) relevance scoring over tool names and descriptions
+    - **Regex matching**: Whole-word keyword matching between the prompt and tool metadata
+
+**Enabling tools filtering:**
+
+```yaml
+agents:
+  my_agent:
+    base_class: "SGRAgent"
+    execution:
+      tools_filtering: true  # Enable intelligent tools filtering
+    tools:
+      - "web_search_tool"
+      - "extract_page_content_tool"
+      - "create_report_tool"
+      # ... many more tools
+```
+
+**Example:**
+
+When a user asks "Search for information about Python", the filtering system will:
+1. Always include system tools (e.g., `ReasoningTool`, `ClarificationTool`)
+2. Include `WebSearchTool` because the prompt contains the keyword "search"
+3. Exclude tools whose names and descriptions do not match the prompt
+
+This reduces the number of tools passed to the LLM, which saves prompt tokens and can reduce latency.
+
+!!! Note "Default behavior"
+    Tools filtering is disabled by default (`tools_filtering: false`). Enable it when you have many tools (more than 20) and want to optimize context usage.
+
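The selection rule documented above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the framework's implementation: `ToolStub`, `word_set`, and `keep_tool` are hypothetical names, and only the keyword-overlap half of the rule is shown (the BM25 half is left out). One detail worth noticing is that the `\b\w+\b` pattern treats an underscore as a word character, so a name such as `web_search_tool` stays a single token and can match the prompt word "search" only through the description.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolStub:
    """Hypothetical stand-in for a tool class; not part of the framework."""

    tool_name: str
    description: str
    is_system: bool = False


def word_set(text: str) -> set[str]:
    # Same pattern as the commit: \w includes "_", so snake_case names stay whole.
    return set(re.findall(r"\b\w+\b", text.lower()))


def keep_tool(tool: ToolStub, prompt: str) -> bool:
    # System tools bypass filtering; other tools need at least one shared word.
    if tool.is_system:
        return True
    return bool(word_set(prompt) & word_set(f"{tool.tool_name} {tool.description}"))


tools = [
    ToolStub("reasoning_tool", "plan the next step", is_system=True),
    ToolStub("web_search_tool", "search the web for pages"),
    ToolStub("create_report_tool", "write the final report"),
]
prompt = "Search for information about Python"
kept = [t.tool_name for t in tools if keep_tool(t, prompt)]
print(kept)
```

Because no stop words are removed, a common word such as "for" counts as a match just like "search" does, so longer prompts tend to keep more tools.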
 ### Example 2: Data Analysis Agent
 
 ```python

docs/ru/framework/agents.md

Lines changed: 38 additions & 0 deletions
@@ -319,6 +319,44 @@ class ResearchSGRAgent(SGRAgent):
 !!! Tip "State Machine for Tool Management... Or Something More"
     For more complex tool management logic, you can use a more serious state engine. This lets you explicitly define the agent's states and the transition rules between them, which simplifies managing the tools available at each stage of work.
 
+### Tools Filtering
+
+The framework provides a tools filtering mechanism that automatically selects only the tools relevant to the user prompt. This helps reduce context size and save tokens when working with large toolkits.
+
+**How it works:**
+
+- **System tools** (tools with `isSystemTool=True`) are always included regardless of the prompt
+- **Other tools** are filtered using a combination of:
+    - **BM25 ranking**: Lexical (term-based) relevance scoring over tool names and descriptions
+    - **Regex matching**: Keyword matching between the prompt and tool metadata
+
+**Enabling tools filtering:**
+
+```yaml
+agents:
+  my_agent:
+    base_class: "SGRAgent"
+    execution:
+      tools_filtering: true  # Enable intelligent tools filtering
+    tools:
+      - "web_search_tool"
+      - "extract_page_content_tool"
+      - "create_report_tool"
+      # ... many more tools
+```
+
+**Example:**
+
+When a user asks "Найди информацию о Python" ("Find information about Python"), the filtering system:
+1. Always includes system tools (e.g., `ReasoningTool`, `ClarificationTool`)
+2. Includes `WebSearchTool` because the prompt contains the keyword "найди" ("find")
+3. Excludes irrelevant tools that do not match the prompt
+
+This reduces the number of tools passed to the LLM, saving tokens and improving performance.
+
+!!! Note "Default behavior"
+    Tools filtering is disabled by default (`tools_filtering: false`). Enable it when you have many tools (more than 20) and want to optimize context usage.
 ### Example 2: Data Analysis Agent
 
 ```python

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -46,6 +46,8 @@ dependencies = [
     "uvicorn>=0.35.0",
     "fastmcp>=2.12.4",
     "jambo>=0.1.3.post2",
+    # Tools filtering
+    "rank-bm25>=0.2.2",
 ]
 
 [project.urls]

sgr_agent_core/agent_definition.py

Lines changed: 3 additions & 0 deletions
@@ -146,6 +146,9 @@ class ExecutionConfig(BaseModel, extra="allow"):
         default="logs", description="Directory for saving bot logs. Set to None or empty string to disable logging."
     )
     reports_dir: str = Field(default="reports", description="Directory for saving reports")
+    tools_filtering: bool = Field(
+        default=False, description="Enable intelligent tools filtering based on user prompt using Regex and BM25"
+    )
 
 
 class AgentConfig(BaseModel, extra="allow"):

sgr_agent_core/base_agent.py

Lines changed: 77 additions & 1 deletion
@@ -2,13 +2,15 @@
 import json
 import logging
 import os
+import re
 import traceback
 import uuid
 from datetime import datetime
 from typing import Type
 
 from openai import AsyncOpenAI, pydantic_function_tool
 from openai.types.chat import ChatCompletionFunctionToolParam, ChatCompletionMessageParam
+from rank_bm25 import BM25Okapi
 
 from sgr_agent_core.agent_definition import AgentConfig
 from sgr_agent_core.models import AgentContext, AgentStatesEnum
@@ -159,6 +161,78 @@ async def _prepare_context(self) -> list[dict]:
             *self.conversation,
         ]
 
+    async def _filter_tools_by_prompt(self) -> list[Type[BaseTool]]:
+        """Filter tools based on user prompt using Regex and BM25 search.
+
+        System tools (isSystemTool=True) are always included.
+        Other tools are filtered based on relevance to the user prompt.
+
+        Returns:
+            List of filtered tool classes
+        """
+        if not self.config.execution.tools_filtering:
+            return list(self.toolkit)
+
+        # Extract user prompt text from task_messages
+        prompt_text = ""
+        for message in self.task_messages:
+            if isinstance(message.get("content"), str):
+                prompt_text += " " + message["content"]
+            elif isinstance(message.get("content"), list):
+                for content_item in message.get("content", []):
+                    if isinstance(content_item, dict) and content_item.get("type") == "text":
+                        prompt_text += " " + content_item.get("text", "")
+
+        prompt_text = prompt_text.strip().lower()
+
+        # Always include system tools
+        system_tools = [tool for tool in self.toolkit if getattr(tool, "isSystemTool", False)]
+        non_system_tools = [tool for tool in self.toolkit if not getattr(tool, "isSystemTool", False)]
+
+        if not prompt_text or not non_system_tools:
+            return system_tools + non_system_tools
+
+        # Prepare documents for BM25: tool name + description
+        tool_documents = []
+        for tool in non_system_tools:
+            tool_name = getattr(tool, "tool_name", tool.__name__).lower()
+            tool_description = getattr(tool, "description", "").lower()
+            doc_text = f"{tool_name} {tool_description}"
+            tool_documents.append(doc_text)
+
+        # Tokenize documents for BM25
+        tokenized_docs = [doc.split() for doc in tool_documents]
+        bm25 = BM25Okapi(tokenized_docs)
+
+        # Tokenize query
+        query_tokens = prompt_text.split()
+        scores = bm25.get_scores(query_tokens)
+
+        # Regex matching: check if tool name or description matches prompt keywords
+        regex_matches = []
+        prompt_words = set(re.findall(r"\b\w+\b", prompt_text))
+        for tool in non_system_tools:
+            tool_name = getattr(tool, "tool_name", tool.__name__).lower()
+            tool_description = getattr(tool, "description", "").lower()
+            tool_words = set(re.findall(r"\b\w+\b", f"{tool_name} {tool_description}"))
+
+            # Check for keyword matches
+            matches = prompt_words.intersection(tool_words)
+            regex_matches.append(len(matches) > 0)
+
+        # Combine BM25 scores and regex matches
+        # Tools are included if they have high BM25 score OR regex match
+        filtered_non_system_tools = []
+        for i, tool in enumerate(non_system_tools):
+            bm25_score = scores[i]
+            has_regex_match = regex_matches[i]
+
+            # Include tool if BM25 score is above threshold or has regex match
+            if bm25_score > 0.1 or has_regex_match:
+                filtered_non_system_tools.append(tool)
+
+        return system_tools + filtered_non_system_tools
+
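The BM25 branch above tokenizes with `str.split()`, so punctuation stays attached to words, while the regex branch strips it. The effect can be seen with a small pure-Python scorer. This is a sketch written from the standard Okapi BM25 formula with a non-negative IDF variant; `bm25_scores` is a hypothetical helper, and it does not reproduce how `rank_bm25` itself treats negative IDF values.

```python
import math
from collections import Counter


def bm25_scores(
    query: list[str], docs: list[list[str]], k1: float = 1.5, b: float = 0.75
) -> list[float]:
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "web_search_tool search the web".split(),
    "create_report_tool write a report about python".split(),
]
clean = bm25_scores("search python".split(), docs)
noisy = bm25_scores("search python?".split(), docs)  # "python?" keeps its "?"
print(clean, noisy)
```

With the trailing question mark the second document scores zero, because the token `python?` never equals `python`. In the commit such a tool can still be kept through the regex branch, which extracts `python` without the punctuation.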
     async def _prepare_tools(self) -> list[ChatCompletionFunctionToolParam]:
         """Prepare available tools for the current agent state and progress.
 
@@ -168,9 +242,11 @@ async def _prepare_tools(self) -> list[ChatCompletionFunctionToolParam]:
         Returns a list of ChatCompletionFunctionToolParam based
         available tools.
         """
-        tools = set(self.toolkit)
         if self._context.iteration >= self.config.execution.max_iterations:
             raise RuntimeError("Max iterations reached")
+
+        # Apply tools filtering if enabled
+        tools = await self._filter_tools_by_prompt()
         return [pydantic_function_tool(tool, name=tool.tool_name) for tool in tools]
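The removed line built a `set` from the toolkit, which dropped duplicate entries as a side effect. The new code path returns a list, so a tool that appears twice in `self.toolkit` would be passed through twice. Whether duplicates can occur in practice is not shown in this commit; if they can, an order-preserving de-duplication could be sketched as follows (`dedupe_preserving_order`, `ToolA`, and `ToolB` are hypothetical names, not part of the commit):

```python
from collections.abc import Hashable, Iterable
from typing import TypeVar

T = TypeVar("T", bound=Hashable)


def dedupe_preserving_order(items: Iterable[T]) -> list[T]:
    # dict keys keep insertion order, so the first occurrence of each item wins.
    return list(dict.fromkeys(items))


class ToolA:
    """Hypothetical tool class used only for this illustration."""


class ToolB:
    """Hypothetical tool class used only for this illustration."""


toolkit = [ToolA, ToolB, ToolA]
unique = dedupe_preserving_order(toolkit)
print([cls.__name__ for cls in unique])
```

Unlike `set(...)`, this keeps the tools in the order they were configured, which makes the list sent to the model deterministic.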
 
     async def _reasoning_phase(self) -> ReasoningTool:

tests/test_base_agent.py

Lines changed: 148 additions & 0 deletions
@@ -595,3 +595,151 @@ def test_save_agent_log_creates_file_when_logs_dir_is_set(self, tmp_path):
         log_files = list(os.listdir(logs_dir))
         assert len(log_files) == 1
         assert log_files[0].endswith("-log.json")
+
+
+class TestBaseAgentToolsFiltering:
+    """Tests for tools filtering functionality in BaseAgent."""
+
+    @pytest.mark.asyncio
+    async def test_filter_tools_disabled_by_default(self):
+        """Test that tools filtering is disabled by default."""
+        from sgr_agent_core.tools import (
+            ClarificationTool,
+            ReasoningTool,
+            WebSearchTool,
+        )
+
+        agent = create_test_agent(
+            BaseAgent,
+            task_messages=[{"role": "user", "content": "Search for information about Python"}],
+            toolkit=[ReasoningTool, WebSearchTool, ClarificationTool],
+        )
+
+        filtered_tools = await agent._filter_tools_by_prompt()
+        assert len(filtered_tools) == 3
+        assert ReasoningTool in filtered_tools
+        assert WebSearchTool in filtered_tools
+        assert ClarificationTool in filtered_tools
+
+    @pytest.mark.asyncio
+    async def test_filter_tools_always_includes_system_tools(self):
+        """Test that system tools (isSystemTool=True) are always included."""
+        from sgr_agent_core.agent_definition import ExecutionConfig
+        from sgr_agent_core.tools import (
+            ClarificationTool,
+            ReasoningTool,
+            WebSearchTool,
+        )
+
+        execution_config = ExecutionConfig(tools_filtering=True)
+        agent = create_test_agent(
+            BaseAgent,
+            task_messages=[{"role": "user", "content": "Random unrelated text"}],
+            execution_config=execution_config,
+            toolkit=[ReasoningTool, WebSearchTool, ClarificationTool],
+        )
+
+        filtered_tools = await agent._filter_tools_by_prompt()
+        # ReasoningTool and ClarificationTool have isSystemTool=True, so they should always be included
+        assert ReasoningTool in filtered_tools
+        assert ClarificationTool in filtered_tools
+
+    @pytest.mark.asyncio
+    async def test_filter_tools_filters_by_relevance(self):
+        """Test that tools are filtered based on prompt relevance using BM25
+        and regex."""
+        from sgr_agent_core.agent_definition import ExecutionConfig
+        from sgr_agent_core.tools import (
+            ClarificationTool,
+            ReasoningTool,
+            WebSearchTool,
+        )
+
+        execution_config = ExecutionConfig(tools_filtering=True)
+        agent = create_test_agent(
+            BaseAgent,
+            task_messages=[{"role": "user", "content": "Search the web for information about machine learning"}],
+            execution_config=execution_config,
+            toolkit=[ReasoningTool, WebSearchTool, ClarificationTool],
+        )
+
+        filtered_tools = await agent._filter_tools_by_prompt()
+        # System tools should always be included
+        assert ReasoningTool in filtered_tools
+        assert ClarificationTool in filtered_tools
+        # WebSearchTool should be included because prompt mentions "search"
+        assert WebSearchTool in filtered_tools
+
+    @pytest.mark.asyncio
+    async def test_filter_tools_with_multiple_non_system_tools(self):
+        """Test filtering when there are multiple non-system tools."""
+        from sgr_agent_core.agent_definition import ExecutionConfig
+        from sgr_agent_core.tools import (
+            ClarificationTool,
+            ExtractPageContentTool,
+            ReasoningTool,
+            WebSearchTool,
+        )
+
+        execution_config = ExecutionConfig(tools_filtering=True)
+        agent = create_test_agent(
+            BaseAgent,
+            task_messages=[{"role": "user", "content": "Extract content from web pages"}],
+            execution_config=execution_config,
+            toolkit=[ReasoningTool, WebSearchTool, ExtractPageContentTool, ClarificationTool],
+        )
+
+        filtered_tools = await agent._filter_tools_by_prompt()
+        # System tools should always be included
+        assert ReasoningTool in filtered_tools
+        assert ClarificationTool in filtered_tools
+        # ExtractPageContentTool should be included because prompt mentions "extract"
+        assert ExtractPageContentTool in filtered_tools
+
+    @pytest.mark.asyncio
+    async def test_filter_tools_empty_prompt(self):
+        """Test filtering with empty prompt."""
+        from sgr_agent_core.agent_definition import ExecutionConfig
+        from sgr_agent_core.tools import (
+            ClarificationTool,
+            ReasoningTool,
+            WebSearchTool,
+        )
+
+        execution_config = ExecutionConfig(tools_filtering=True)
+        agent = create_test_agent(
+            BaseAgent,
+            task_messages=[{"role": "user", "content": ""}],
+            execution_config=execution_config,
+            toolkit=[ReasoningTool, WebSearchTool, ClarificationTool],
+        )
+
+        filtered_tools = await agent._filter_tools_by_prompt()
+        # System tools must be included when the prompt is empty
+        assert ReasoningTool in filtered_tools
+        assert ClarificationTool in filtered_tools
+        # With an empty prompt the current implementation returns the whole toolkit,
+        # so WebSearchTool is present as well; only the system tools are asserted here
+
+    @pytest.mark.asyncio
+    async def test_prepare_tools_uses_filtering_when_enabled(self):
+        """Test that _prepare_tools uses filtering when tools_filtering is
+        enabled."""
+        from sgr_agent_core.agent_definition import ExecutionConfig
+        from sgr_agent_core.tools import (
+            ClarificationTool,
+            ReasoningTool,
+            WebSearchTool,
+        )
+
+        execution_config = ExecutionConfig(tools_filtering=True)
+        agent = create_test_agent(
+            BaseAgent,
+            task_messages=[{"role": "user", "content": "Search for Python tutorials"}],
+            execution_config=execution_config,
+            toolkit=[ReasoningTool, WebSearchTool, ClarificationTool],
+        )
+
+        tools = await agent._prepare_tools()
+        # Should return filtered tools converted to ChatCompletionFunctionToolParam
+        assert len(tools) >= 2  # At least system tools should be present
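Every test above passes `content` as a plain string, so the branch of `_filter_tools_by_prompt` that reads a list of content parts is not exercised. A standalone sketch of that extraction step, assuming messages have the shape the commit reads (`flatten_prompt` is a hypothetical name, and the example URL is a placeholder):

```python
def flatten_prompt(messages: list[dict]) -> str:
    """Join the text of all messages into one lowercase string."""
    parts: list[str] = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):
            parts.append(content)
        elif isinstance(content, list):
            # Multi-part content: keep only the items marked as text.
            parts.extend(
                item.get("text", "")
                for item in content
                if isinstance(item, dict) and item.get("type") == "text"
            )
    return " ".join(parts).strip().lower()


messages = [
    {"role": "user", "content": "Search the web"},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "for Python tutorials"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    },
]
prompt_text = flatten_prompt(messages)
print(prompt_text)
```

A test built on the same message shape would cover the multi-part branch, including the case where non-text parts such as images are skipped.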
