93 changes: 65 additions & 28 deletions src/praisonai-agents/praisonaiagents/agent/agent.py
@@ -169,11 +169,70 @@ def _is_file_path(value: str) -> bool:
# Applied even when context management is disabled to prevent runaway tool outputs
DEFAULT_TOOL_OUTPUT_LIMIT = 16000

# Global variables for API server (protected by _server_lock for thread safety)
_server_lock = threading.Lock()
_server_started = {} # Dict of port -> started boolean
_registered_agents = {} # Dict of port -> Dict of path -> agent_id
_shared_apps = {} # Dict of port -> FastAPI app
class ServerRegistry:
"""Registry for API server state per-port."""

def __init__(self):
self._lock = threading.Lock()
self._server_started = {} # Dict of port -> started boolean
self._registered_agents = {} # Dict of port -> Dict of path -> agent_id
self._shared_apps = {} # Dict of port -> FastAPI app

@staticmethod
def get_default_instance():
"""Get default global registry for backward compatibility."""
if not hasattr(ServerRegistry, '_default_instance'):
import threading
# Double-checked locking pattern for thread safety
with threading.Lock():
if not hasattr(ServerRegistry, '_default_instance'):
ServerRegistry._default_instance = ServerRegistry()
return ServerRegistry._default_instance
Comment on lines +182 to +190

medium

The lazy initialization of the default ServerRegistry instance is not thread-safe. In a multi-agent environment where multiple agents might be initialized concurrently, this could lead to a race condition where multiple registry instances are created, potentially causing registration data to be lost or inconsistent. Since this PR specifically targets 'multi-agent safety', it is recommended to use a thread-safe singleton pattern or initialize the default instance at the module level.
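
A minimal sketch of the thread-safe pattern this comment suggests, with the lock created at module import time so every thread contends on the same lock object (class and method names mirror the PR's ServerRegistry for illustration; this is not the shipped code):

```python
import threading

class ServerRegistry:
    """Sketch of a race-free default-instance accessor. The lock is created
    at class definition time, so every thread contends on the same lock
    object (unlike a Lock() constructed inside the accessor, which each
    caller would create and acquire privately)."""

    _default_lock = threading.Lock()  # one lock, created before any threads run
    _default_instance = None

    @classmethod
    def get_default_instance(cls):
        if cls._default_instance is None:          # fast path, no lock held
            with cls._default_lock:
                if cls._default_instance is None:  # re-check under the lock
                    cls._default_instance = cls()
        return cls._default_instance
```

The fast path avoids lock contention on every call after the first; the re-check under the lock closes the window where two threads both pass the unlocked check.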


def is_server_started(self, port: int) -> bool:
with self._lock:
return self._server_started.get(port, False)

def set_server_started(self, port: int, started: bool) -> None:
with self._lock:
self._server_started[port] = started

def get_shared_app(self, port: int):
with self._lock:
return self._shared_apps.get(port)

def set_shared_app(self, port: int, app) -> None:
with self._lock:
self._shared_apps[port] = app

def register_agent(self, port: int, path: str, agent_id: str) -> None:
with self._lock:
if port not in self._registered_agents:
self._registered_agents[port] = {}
self._registered_agents[port][path] = agent_id

def get_registered_agents(self, port: int) -> dict:
with self._lock:
return self._registered_agents.get(port, {}).copy()

def cleanup_agent_registrations(self, agent_id: str) -> None:
"""Remove all registrations for an agent ID and clean empty port state."""
with self._lock:
Comment on lines +172 to +220

Copilot AI Apr 11, 2026


This change removes the module-level server globals, but there are still references to _server_lock, _registered_agents, and _server_started later in agent.py (e.g., _cleanup_server_registrations) and in execution_mixin.py. As-is, those paths will raise NameError. Either update those code paths to use ServerRegistry (via _get_default_server_registry() or an injected instance), or provide backward-compatible module-level aliases that delegate to the default registry instance.

ports_to_clean = []
for port, path_dict in self._registered_agents.items():
paths_to_remove = [path for path, registered_id in path_dict.items() if registered_id == agent_id]
for path in paths_to_remove:
del path_dict[path]
if not path_dict:
ports_to_clean.append(port)

for port in ports_to_clean:
self._registered_agents.pop(port, None)
self._server_started.pop(port, None)
Comment on lines +218 to +231

P1 ServerRegistry disconnected from execution_mixin.py's server state

The PR's cleanup path (called from __del__) now calls _get_default_server_registry().cleanup_agent_registrations(self._agent_id). However, the actual HTTP server management code lives in execution_mixin.py::_launch_http_server, which uses module-level globals declared via:

global _server_started, _registered_agents, _shared_apps, _server_lock  # line 1033

Those names refer to execution_mixin.py's module namespace, not to agent.py's namespace (Python functions retain __globals__ pointing to the module where they are defined). None of those four globals are defined in execution_mixin.py, so _launch_http_server raises a NameError when called.

The ServerRegistry introduced here holds completely separate state — so even if the NameError were fixed, __del__-time cleanup would still operate on a registry that was never populated by _launch_http_server. The PR's claim of fixing "multi-agent unsafe singletons" is incomplete since the actual server-registration path is neither fixed nor connected to the new ServerRegistry.
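
One way to reconnect the two modules, per this comment, is a single registry instance defined once and imported by both agent.py and execution_mixin.py, so server state lives in one namespace instead of two. A hedged sketch (names are illustrative, not the actual PraisonAI API):

```python
import threading

class ServerRegistry:
    """Illustrative registry holding per-port server state behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._registered_agents = {}  # port -> {path: agent_id}

    def register_agent(self, port, path, agent_id):
        with self._lock:
            self._registered_agents.setdefault(port, {})[path] = agent_id

    def cleanup_agent_registrations(self, agent_id):
        with self._lock:
            for port in list(self._registered_agents):
                paths = self._registered_agents[port]
                for path in [p for p, a in paths.items() if a == agent_id]:
                    del paths[path]
                if not paths:  # drop ports with no remaining registrations
                    del self._registered_agents[port]

# Single shared instance, created at import time. Both agent.py and
# execution_mixin.py would import this one name instead of each relying on
# module-level globals in their own namespaces.
default_registry = ServerRegistry()
```

With this shape, `_launch_http_server` and `__del__`-time cleanup operate on the same state, avoiding both the NameError and the disconnected-registry problem described above.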


# Backward compatibility - use default instance
def _get_default_server_registry() -> ServerRegistry:
return ServerRegistry.get_default_instance()

# Don't import FastAPI dependencies here - use lazy loading instead

@@ -4571,28 +4630,7 @@ def _cleanup_server_registrations(self) -> None:
return # No ID generated, nothing registered

try:
agent_id = self._agent_id
with _server_lock:
# Remove from _registered_agents
ports_to_clean = []
for port, path_dict in _registered_agents.items():
paths_to_remove = []
for path, registered_id in path_dict.items():
if registered_id == agent_id:
paths_to_remove.append(path)

for path in paths_to_remove:
del path_dict[path]

# If no paths left for this port, mark port for cleanup
if not path_dict:
ports_to_clean.append(port)

# Clean up empty port entries
for port in ports_to_clean:
_registered_agents.pop(port, None)
_server_started.pop(port, None)
# Note: We don't clean up _shared_apps here as other agents might be using them
_get_default_server_registry().cleanup_agent_registrations(self._agent_id)

except Exception as e:
import sys
@@ -4629,4 +4667,3 @@ def is_closed(self) -> bool:

def __str__(self):
return f"Agent(name='{self.name}', role='{self.role}', goal='{self.goal}')"

73 changes: 24 additions & 49 deletions src/praisonai-agents/praisonaiagents/agent/chat_mixin.py
@@ -536,9 +536,9 @@ def _chat_completion(self, messages, temperature=1.0, tools=None, stream=True, r
formatted_tools = self._format_tools_for_completion(tools)

try:
# NEW: Unified protocol dispatch path (Issue #1304)
# Check if unified dispatch is enabled (opt-in for backward compatibility)
if getattr(self, '_use_unified_llm_dispatch', False):
# NEW: Unified protocol dispatch path (Issue #1304, #1362)

P1 Unified dispatch silently enabled for all custom-LLM users

Changing the _use_unified_llm_dispatch default from False to True is a silent, potentially breaking behavior change for any caller that relies on the llm_instance.get_response(...) path.

Before this PR: agents with a custom llm_instance would always fall through to self.llm_instance.get_response(...) unless _use_unified_llm_dispatch was explicitly set to True. After this PR: those same agents now route through self._execute_unified_chat_completion(...) by default, which may not call get_response at all.

Anyone who set a custom llm_instance expecting its get_response method to be called will see different (and potentially broken) behavior without any config change or deprecation notice. The PR description calls this "backward compatible" but this specific change is not backward compatible for the custom-LLM path.

If the intent is to unify all paths, consider at minimum:

  • Adding a deprecation warning when falling back from the unified path to get_response
  • Or keeping False as the default and documenting how users can opt in
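
The first suggestion could look roughly like this sketch (method names are hypothetical, not the actual chat_mixin.py code): keep False as the default, but warn callers still on the legacy path.

```python
import warnings

class LegacyDispatchMixin:
    """Hypothetical sketch of the reviewer's first option: keep False as the
    default for _use_unified_llm_dispatch, but emit a DeprecationWarning when
    falling back to the legacy llm_instance.get_response path."""

    def _chat_dispatch(self, messages, **kwargs):
        if getattr(self, '_use_unified_llm_dispatch', False):
            return self._execute_unified_chat_completion(messages, **kwargs)
        warnings.warn(
            "The llm_instance.get_response path is deprecated; set "
            "_use_unified_llm_dispatch=True to opt in to unified dispatch.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.llm_instance.get_response(prompt=messages, **kwargs)
```

This preserves existing behavior for custom-LLM users while giving them a release cycle to migrate before the default flips.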

# Enable unified dispatch by default for DRY and feature parity
if getattr(self, '_use_unified_llm_dispatch', True):
# Use composition instead of runtime class mutation for safety
Comment on lines +539 to 542

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check test expectations for unified dispatch default
rg -n '_use_unified_llm_dispatch' --type py -C 3

Repository: MervinPraison/PraisonAI

Length of output: 3326


🏁 Script executed:

#!/bin/bash
# Check if test file has been updated or if there are skip markers
cat -n src/praisonai-agents/test_unified_llm.py | head -50

Repository: MervinPraison/PraisonAI

Length of output: 2420


🏁 Script executed:

#!/bin/bash
# Look for CHANGELOG, migration docs, or PR notes
find . -type f \( -name "CHANGELOG*" -o -name "MIGRATION*" -o -name "*.md" \) | head -20 | xargs rg -l "unified.*dispatch" 2>/dev/null || echo "No migration docs found"

Repository: MervinPraison/PraisonAI

Length of output: 90


🏁 Script executed:

#!/bin/bash
# Check if there are any comments in chat_mixin.py explaining this change
sed -n '535,550p' src/praisonai-agents/praisonaiagents/agent/chat_mixin.py
sed -n '1835,1850p' src/praisonai-agents/praisonaiagents/agent/chat_mixin.py

Repository: MervinPraison/PraisonAI

Length of output: 1895


Unified dispatch default change breaks backward compatibility test—requires migration or revert.

Both sync (line 541) and async (line 1839) now default _use_unified_llm_dispatch to True, achieving sync/async parity. However, this introduces a breaking change: the backward compatibility test at src/praisonai-agents/test_unified_llm.py:36 explicitly expects the default to be False:

assert getattr(agent, '_use_unified_llm_dispatch', False) == False

This test will fail with the current code. Additionally, no migration documentation or CHANGELOG entry exists for this behavioral shift. Either:

  1. Revert the default to False and explicitly enable unified dispatch only when opted in, or
  2. Update the test and add migration documentation explaining the default behavior change and providing guidance for users who relied on the old behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/praisonai-agents/praisonaiagents/agent/chat_mixin.py` around lines 539 -
542, The change sets the attribute _use_unified_llm_dispatch to True by default
(in chat_mixin.py sync path and async path), which breaks an existing
backward-compatibility test; either revert the default to False in both the sync
and async dispatch paths (ensure getattr(self, '_use_unified_llm_dispatch',
False) remains False unless explicitly enabled) or update the test to expect
True and add a migration note/CHANGELOG entry explaining the new default and how
to opt out (reference the attribute _use_unified_llm_dispatch and the unified
dispatch code paths in chat_mixin.py to make the change consistently in
both sync and async branches).

final_response = self._execute_unified_chat_completion(
messages=messages,
@@ -579,50 +579,25 @@ def _chat_completion(self, messages, temperature=1.0, tools=None, stream=True, r
reasoning_steps=reasoning_steps
)
else:
# Non-streaming with custom LLM - don't show streaming-like behavior
if False: # Don't use display_generating when stream=False to avoid streaming-like behavior
# This block is disabled to maintain consistency with the OpenAI path fix
with _get_live()(
_get_display_functions()['display_generating']("", start_time),
console=self.console,
refresh_per_second=4,
) as live:
final_response = self.llm_instance.get_response(
prompt=messages[1:],
system_prompt=messages[0]['content'] if messages and messages[0]['role'] == 'system' else None,
temperature=temperature,
tools=formatted_tools if formatted_tools else None,
verbose=self.verbose,
markdown=self.markdown,
stream=stream,
console=self.console,
execute_tool_fn=self.execute_tool,
agent_name=self.name,
agent_role=self.role,
agent_tools=[getattr(t, '__name__', str(t)) for t in self.tools] if self.tools else None,
task_name=task_name,
task_description=task_description,
task_id=task_id,
reasoning_steps=reasoning_steps
)
else:
final_response = self.llm_instance.get_response(
prompt=messages[1:],
system_prompt=messages[0]['content'] if messages and messages[0]['role'] == 'system' else None,
temperature=temperature,
tools=formatted_tools if formatted_tools else None,
verbose=self.verbose,
markdown=self.markdown,
stream=stream,
console=self.console,
execute_tool_fn=self.execute_tool,
agent_name=self.name,
agent_role=self.role,
agent_tools=[getattr(t, '__name__', str(t)) for t in self.tools] if self.tools else None,
task_name=task_name,
task_description=task_description,
task_id=task_id,
reasoning_steps=reasoning_steps
# Non-streaming with custom LLM - direct execution
has_system = bool(messages and messages[0].get('role') == 'system')
final_response = self.llm_instance.get_response(
prompt=messages[1:] if has_system else messages,
system_prompt=messages[0]['content'] if has_system else None,
temperature=temperature,
tools=formatted_tools if formatted_tools else None,
verbose=self.verbose,
markdown=self.markdown,
stream=stream,
console=self.console,
execute_tool_fn=self.execute_tool,
agent_name=self.name,
agent_role=self.role,
agent_tools=[getattr(t, '__name__', str(t)) for t in self.tools] if self.tools else None,
task_name=task_name,
task_description=task_description,
task_id=task_id,
reasoning_steps=reasoning_steps
)
else:
# Use the standard OpenAI client approach with tool support
@@ -1860,8 +1835,8 @@ async def _achat_impl(self, prompt, temperature, tools, output_json, output_pyda
formatted_tools = self._format_tools_for_completion(tools)

# NEW: Unified protocol dispatch path (Issue #1304) - Async version
# Check if unified dispatch is enabled (opt-in for backward compatibility)
if getattr(self, '_use_unified_llm_dispatch', False):
# Enable unified dispatch by default for DRY and feature parity (sync/async consistent)
if getattr(self, '_use_unified_llm_dispatch', True):
# Use composition instead of runtime class mutation for safety
response = await self._execute_unified_achat_completion(
messages=messages,
23 changes: 16 additions & 7 deletions src/praisonai-agents/praisonaiagents/knowledge/index.py
@@ -101,13 +101,22 @@ def stats(self) -> IndexStats:
class IndexRegistry:
"""Registry for index types."""

_instance: Optional["IndexRegistry"] = None
def __init__(self):
"""Initialize a new registry instance."""
self._indices: Dict[str, Callable[..., IndexProtocol]] = {}

def __new__(cls) -> "IndexRegistry":
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance._indices: Dict[str, Callable[..., IndexProtocol]] = {}
return cls._instance
@classmethod
def default(cls) -> "IndexRegistry":
"""Get a default global registry instance for convenience."""
if not hasattr(cls, '_default_instance'):
import threading
# Use lock for thread safety
if not hasattr(cls, '_init_lock'):
cls._init_lock = threading.Lock()
with cls._init_lock:
if not hasattr(cls, '_default_instance'):
cls._default_instance = cls()
Comment on lines +111 to +118

P1 Race condition in lock initialization

The _init_lock is created inside the outer hasattr guard but before it is acquired, so two threads can both enter the outer block, both find _init_lock absent, and each assign a different Lock object to cls._init_lock. If Thread A evaluates with cls._init_lock: after writing Lock_A but before Thread B overwrites it with Lock_B, the two threads end up holding different locks and both proceed to create _default_instance simultaneously.

This is the same TOCTOU pattern seen in ServerRegistry.get_default_instance() and get_default_memory_registry(), but here the lock is persistent (class-level) rather than thrown away each call. A safe pattern initialises the lock at class definition time:

class IndexRegistry:
    _init_lock: threading.Lock = threading.Lock()  # module-load time, no race

    @classmethod
    def default(cls) -> "IndexRegistry":
        if not hasattr(cls, '_default_instance'):
            with cls._init_lock:
                if not hasattr(cls, '_default_instance'):
                    cls._default_instance = cls()
        return cls._default_instance

return cls._default_instance
Comment on lines +109 to +119

medium

The default class method in IndexRegistry is not thread-safe. Concurrent access during the first call could result in multiple instances of the registry being created. Consider using a lock or module-level initialization for the default instance to ensure consistency in multi-agent scenarios.

Comment on lines +108 to +119

Copilot AI Apr 11, 2026


IndexRegistry is no longer a singleton (removed __new__), but get_index_registry() elsewhere in this module still returns IndexRegistry() which creates a fresh, empty registry each call. That means the default registrations done at import time won’t be visible to later callers. To preserve the previous behavior, get_index_registry() should return IndexRegistry.default() (or reintroduce singleton semantics).

Comment on lines +108 to +119

⚠️ Potential issue | 🟠 Major

Same race condition—define the lock at class level.

The lazy _init_lock creation suffers from the same race as the memory registry: two threads can create separate locks. For a classmethod, define the lock as a class attribute at class definition time.

🔒 Thread-safe classmethod singleton
 class IndexRegistry:
     """Registry for index types."""
+    _init_lock: "threading.Lock"  # defined below after import
     
     def __init__(self):
         """Initialize a new registry instance."""
         self._indices: Dict[str, Callable[..., IndexProtocol]] = {}
     
     @classmethod
     def default(cls) -> "IndexRegistry":
         """Get a default global registry instance for convenience."""
-        if not hasattr(cls, '_default_instance'):
-            import threading
-            # Use lock for thread safety 
-            if not hasattr(cls, '_init_lock'):
-                cls._init_lock = threading.Lock()
-            with cls._init_lock:
-                if not hasattr(cls, '_default_instance'):
-                    cls._default_instance = cls()
-        return cls._default_instance
+        if not hasattr(cls, '_default_instance'):
+            with cls._init_lock:
+                if not hasattr(cls, '_default_instance'):
+                    cls._default_instance = cls()
+        return cls._default_instance

Then after the class definition, initialize the lock:

import threading
IndexRegistry._init_lock = threading.Lock()

Alternatively, use a module-level @lru_cache helper that default() delegates to—consistent with the memory registry pattern if you adopt that fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/praisonai-agents/praisonaiagents/knowledge/index.py` around lines 108 -
119, The lazy creation of _init_lock in IndexRegistry.default() can race;
instead define a single class-level lock and use it in default(): add a class
attribute _init_lock initialized to a threading.Lock() at class definition time
(or immediately after the class is defined) so IndexRegistry.default() only
acquires the pre-existing _init_lock before checking/creating _default_instance;
update references in default() to rely on the class attribute and remove the
lazy hasattr-based lock creation.
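
The module-level `@lru_cache` alternative mentioned above could look like this sketch (the `IndexRegistry` here is a bare stand-in so the snippet runs standalone). One caveat worth noting: lru_cache protects the cache itself, but two threads racing on the very first miss can each run the factory, so the class-level-lock pattern is the stricter guarantee.

```python
from functools import lru_cache

class IndexRegistry:
    """Bare-bones stand-in so the sketch runs standalone."""
    def __init__(self):
        self._indices = {}

@lru_cache(maxsize=None)
def _default_index_registry():
    # Repeat callers always converge on the cached instance. Caveat: two
    # threads racing on the first miss can each execute this factory and
    # briefly hold different objects before the cache settles.
    return IndexRegistry()

def get_index_registry():
    """Delegate to the cached default, preserving singleton-like behavior."""
    return _default_index_registry()
```

This keeps `get_index_registry()` returning the same populated registry across calls, which is what the import-time default registrations depend on.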


def register(self, name: str, factory: Callable[..., IndexProtocol]) -> None:
"""Register an index factory."""
@@ -133,7 +142,7 @@ def clear(self) -> None:

def get_index_registry() -> IndexRegistry:
"""Get the global index registry instance."""
return IndexRegistry()
return IndexRegistry.default()

class KeywordIndex:
"""
62 changes: 58 additions & 4 deletions src/praisonai-agents/praisonaiagents/llm/adapters/__init__.py
@@ -57,6 +57,25 @@ def handle_empty_response_with_tools(self, state: Dict[str, Any]) -> bool:

def get_default_settings(self) -> Dict[str, Any]:
return {} # No provider-specific defaults

def parse_tool_calls(self, raw_response: Dict[str, Any]) -> Optional[List[Dict[str, Any]]]:
"""Default tool call parsing - use OpenAI-style format."""
if "choices" in raw_response and len(raw_response["choices"]) > 0:
message = raw_response["choices"][0].get("message", {})
return message.get("tool_calls")
return None

def should_skip_streaming_with_tools(self) -> bool:
return False # Most providers support streaming with tools

def recover_tool_calls_from_text(self, response_text: str, tools: List[Dict[str, Any]]) -> Optional[List[Dict[str, Any]]]:
return None # No text recovery by default

def inject_cache_control(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
return messages # No cache control by default

def extract_reasoning_tokens(self, response: Dict[str, Any]) -> int:
return 0 # No reasoning tokens by default


class OllamaAdapter(DefaultAdapter):
@@ -99,6 +118,37 @@ def handle_empty_response_with_tools(self, state: Dict[str, Any]) -> bool:
return True # Signal that special handling is needed
return False

def recover_tool_calls_from_text(self, response_text: str, tools: List[Dict[str, Any]]) -> Optional[List[Dict[str, Any]]]:
"""Ollama-specific tool call recovery from response text."""
if not response_text or not tools:
return None

try:
import json
response_json = json.loads(response_text.strip())

# Normalize to list so both single and multi-tool payloads are supported
if isinstance(response_json, dict):
response_json = [response_json]

if isinstance(response_json, list):
tool_calls: List[Dict[str, Any]] = []
for idx, tool_json in enumerate(response_json):
if isinstance(tool_json, dict) and "name" in tool_json:
tool_calls.append({
"id": f"call_{tool_json['name']}_{idx}_{hash(response_text) % 10000}",
"type": "function",
"function": {
"name": tool_json["name"],
"arguments": json.dumps(tool_json.get("arguments", {}))
}
})
return tool_calls if tool_calls else None
except (json.JSONDecodeError, TypeError, KeyError):
pass

return None
Comment on lines +121 to +150

medium

Using hash(response_text) % 10000 to generate a tool call ID is problematic. Python's hash() is randomized across processes by default, which can lead to inconsistent IDs if state is persisted and resumed. Additionally, the small modulo space (10,000) increases the risk of ID collisions. Using a UUID or a unique counter would be more robust for generating these identifiers.

    def recover_tool_calls_from_text(self, response_text: str, tools: List[Dict[str, Any]]) -> Optional[List[Dict[str, Any]]]:
        """Ollama-specific tool call recovery from response text."""
        if not response_text or not tools:
            return None
        
        try:
            import json
            import uuid
            response_json = json.loads(response_text.strip())
            if isinstance(response_json, dict) and "name" in response_json:
                # Convert Ollama format to standard tool_calls format
                return [{
                    "id": f"call_{response_json['name']}_{uuid.uuid4().hex[:8]}",
                    "type": "function",
                    "function": {
                        "name": response_json["name"],
                        "arguments": json.dumps(response_json.get("arguments", {}))
                    }
                }]
        except (json.JSONDecodeError, TypeError, KeyError):
            pass
        
        return None


def post_tool_iteration(self, state: Dict[str, Any]) -> None:
# Replaces: Ollama-specific post-tool summary branches
if (not state.get('response_text', '').strip() and
Expand Down Expand Up @@ -142,6 +192,13 @@ class GeminiAdapter(DefaultAdapter):
- Supports structured output
"""

def should_skip_streaming_with_tools(self) -> bool:
"""Gemini should skip streaming when tools are present."""
return True

def supports_structured_output(self) -> bool:
return True

Comment on lines +195 to +201

Copilot AI Apr 11, 2026


GeminiAdapter defines supports_structured_output() twice (once near the top of the class and again later). This duplication is easy to miss and can cause confusion about which implementation is intended. Remove the duplicate method and keep a single definition.

def format_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
# Replaces: gemini_internal_tools handling in llm.py
# Internal tool names match GEMINI_INTERNAL_TOOLS: {'googleSearch', 'urlContext', 'codeExecution'}
@@ -160,9 +217,6 @@ def format_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
def supports_streaming_with_tools(self) -> bool:
# Gemini has issues with streaming + tools
return False

def supports_structured_output(self) -> bool:
return True


# Provider adapter registry - public for extension
@@ -236,4 +290,4 @@ def has_provider_adapter(name: str) -> bool:
'add_provider_adapter',
'list_provider_adapters',
'has_provider_adapter',
]
]
20 changes: 20 additions & 0 deletions src/praisonai-agents/praisonaiagents/llm/protocols.py
@@ -326,6 +326,26 @@ def handle_empty_response_with_tools(self, state: Dict[str, Any]) -> bool:
def get_default_settings(self) -> Dict[str, Any]:
"""Get provider-specific default settings."""
...

def parse_tool_calls(self, raw_response: Dict[str, Any]) -> Optional[List[Dict[str, Any]]]:
"""Parse tool calls from provider-specific response format."""
...

def should_skip_streaming_with_tools(self) -> bool:
"""Check if provider should skip streaming when tools are present."""
...

def recover_tool_calls_from_text(self, response_text: str, tools: List[Dict[str, Any]]) -> Optional[List[Dict[str, Any]]]:
"""Attempt to recover tool calls from response text for providers that don't format them properly."""
...

def inject_cache_control(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Inject provider-specific cache control headers."""
...

def extract_reasoning_tokens(self, response: Dict[str, Any]) -> int:
"""Extract reasoning token count from provider-specific response."""
...
Comment on lines +330 to +348

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify call sites for new adapter protocol hooks in core LLM paths.
rg -nP --type=py '\._provider_adapter\.(parse_tool_calls|should_skip_streaming_with_tools|recover_tool_calls_from_text|inject_cache_control|extract_reasoning_tokens)\s*\(' src/praisonai-agents/praisonaiagents/llm

# Broader scan for direct hook invocations (non _provider_adapter variable names)
rg -nP --type=py '\b(parse_tool_calls|should_skip_streaming_with_tools|recover_tool_calls_from_text|inject_cache_control|extract_reasoning_tokens)\s*\(' src/praisonai-agents/praisonaiagents/llm

Repository: MervinPraison/PraisonAI

Length of output: 1884


🏁 Script executed:

# Check the llm.py file structure and find tool/stream/cache related code
wc -l src/praisonai-agents/praisonaiagents/llm/llm.py

Repository: MervinPraison/PraisonAI

Length of output: 119


🏁 Script executed:

# Search for tool call parsing logic
rg -n "parse_tool|tool_call" src/praisonai-agents/praisonaiagents/llm/llm.py

Repository: MervinPraison/PraisonAI

Length of output: 16571


🏁 Script executed:

# Search for streaming decision logic
rg -n "stream|skip" src/praisonai-agents/praisonaiagents/llm/llm.py | head -20

Repository: MervinPraison/PraisonAI

Length of output: 1619


🏁 Script executed:

# Search for cache control logic
rg -n "cache|inject" src/praisonai-agents/praisonaiagents/llm/llm.py

Repository: MervinPraison/PraisonAI

Length of output: 1234


🏁 Script executed:

# Search for reasoning token extraction
rg -n "reasoning|token" src/praisonai-agents/praisonaiagents/llm/llm.py

Repository: MervinPraison/PraisonAI

Length of output: 15344


🏁 Script executed:

# Check what methods are actually called on _provider_adapter in llm.py
rg -n "_provider_adapter\." src/praisonai-agents/praisonaiagents/llm/llm.py

Repository: MervinPraison/PraisonAI

Length of output: 783


🏁 Script executed:

# Check the adapter implementations for these new methods
cat -n src/praisonai-agents/praisonaiagents/llm/adapters/__init__.py | sed -n '60,85p'

Repository: MervinPraison/PraisonAI

Length of output: 1345


These new adapter hooks are not wired into runtime dispatch—they create dead API surface.

The five protocol methods (parse_tool_calls, should_skip_streaming_with_tools, recover_tool_calls_from_text, inject_cache_control, extract_reasoning_tokens) are defined in the protocol and have base implementations in DefaultAdapter, but searching the LLM dispatch paths finds zero invocations of any of them. Instead, duplicate logic is implemented inline:

  • Tool call parsing uses _try_parse_tool_call_json() (line 1102) rather than the adapter hook
  • Cache control is hardcoded for Anthropic (lines 1440–1447) rather than delegated to inject_cache_control()
  • Reasoning token extraction reads the response directly (lines 4297–4303) rather than calling extract_reasoning_tokens()
  • Text-based recovery has no adapter call path

This violates the DRY principle and prevents providers from actually hooking these extension points. Wire these methods into their corresponding call sites in the chat/stream paths, or remove them if they're not needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/praisonai-agents/praisonaiagents/llm/protocols.py` around lines 330 -
348, The protocol hooks (parse_tool_calls, should_skip_streaming_with_tools,
recover_tool_calls_from_text, inject_cache_control, extract_reasoning_tokens)
are unused—replace the inline/duplicated logic with calls into the adapter
methods: where _try_parse_tool_call_json() is used (e.g., tool parsing at or
around _try_parse_tool_call_json), call adapter.parse_tool_calls(raw_response)
and if that returns None fall back to
adapter.recover_tool_calls_from_text(response_text, tools); replace the
Anthropic-specific cache control code with
adapter.inject_cache_control(messages) so providers can override cache headers;
use adapter.should_skip_streaming_with_tools() to decide streaming when tools
are present; and replace direct reasoning-token reads with
adapter.extract_reasoning_tokens(response) so providers control token
accounting; keep DefaultAdapter implementations as fallbacks.
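
A sketch of the suggested wiring: structured parsing first, then text recovery as a fallback. The stub adapter below is a minimal stand-in so the snippet runs standalone; the real adapters live in llm/adapters/__init__.py, and the dispatch function name here is illustrative.

```python
import json
from typing import Any, Dict, List, Optional

class StubAdapter:
    """Minimal stand-in for DefaultAdapter so the sketch runs standalone."""

    def parse_tool_calls(self, raw_response: Dict[str, Any]) -> Optional[List[Dict[str, Any]]]:
        # OpenAI-style structured tool calls, mirroring DefaultAdapter above.
        if raw_response.get("choices"):
            return raw_response["choices"][0].get("message", {}).get("tool_calls")
        return None

    def recover_tool_calls_from_text(self, response_text: str,
                                     tools: List[Dict[str, Any]]) -> Optional[List[Dict[str, Any]]]:
        # Text-based recovery for providers that emit tool calls as raw JSON.
        try:
            payload = json.loads(response_text)
        except json.JSONDecodeError:
            return None
        if isinstance(payload, dict) and "name" in payload:
            return [{"type": "function",
                     "function": {"name": payload["name"],
                                  "arguments": json.dumps(payload.get("arguments", {}))}}]
        return None

def resolve_tool_calls(adapter, raw_response, response_text, tools):
    """Dispatch-side hook usage: structured parse first, then text recovery."""
    calls = adapter.parse_tool_calls(raw_response)
    if calls is None and response_text:
        calls = adapter.recover_tool_calls_from_text(response_text, tools)
    return calls
```

Routing the dispatch path through a helper like this lets provider adapters actually override the hooks, instead of the inline duplicates the review identifies.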



@runtime_checkable