Skip to content

Three high-impact gaps in src/praisonai/praisonai: sandbox path traversal, sync/async generator divergence, rate-limiter serialization + leakΒ #1869

@MervinPraison

Description

@MervinPraison

Summary

In-depth analysis of src/praisonai/praisonai/ (the wrapper layer) surfaced three high-impact, validated gaps. Each has been confirmed by reading the actual source β€” file paths and line numbers are exact, and the snippets below are copied from the tree as it currently stands.

Scope is intentionally narrow: not docs, not tests, not file sizes. These are correctness / safety / production-reliability issues sitting on hot paths (sandbox file API, multi-agent generation, bot send loop).


1. Sandbox file APIs (SubprocessSandbox, DockerSandbox) are vulnerable to path traversal

Files:

  • src/praisonai/praisonai/sandbox/subprocess.py β€” write_file (L269-288), read_file (L290-310), list_files (L312-336)
  • src/praisonai/praisonai/sandbox/docker.py β€” same three methods (L340-407)

The bug

Both backends compute the host-side full path with only path.lstrip("/") and join β€” no canonicalization, no containment check. lstrip("/") strips leading slashes; it does not neutralize ../ sequences.

# src/praisonai/praisonai/sandbox/subprocess.py:290-310
async def read_file(self, path: str) -> Optional[Union[str, bytes]]:
    if not self._temp_dir:
        return None
    full_path = os.path.join(self._temp_dir, path.lstrip("/"))
    if not os.path.exists(full_path):
        return None
    try:
        with open(full_path, "r") as f:
            return f.read()
    except UnicodeDecodeError:
        with open(full_path, "rb") as f:
            return f.read()
    except Exception:
        return None
# src/praisonai/praisonai/sandbox/subprocess.py:269-288
async def write_file(self, path: str, content: Union[str, bytes]) -> bool:
    if not self._temp_dir:
        return False
    full_path = os.path.join(self._temp_dir, path.lstrip("/"))
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    ...

Reproduction (the math, not a payload):

os.path.realpath(os.path.join("/tmp/praisonai_sandbox_xxx",
                              "../../../etc/passwd".lstrip("/")))
# -> "/etc/passwd"

DockerSandbox is doubly bad: the documented intent is "code execution isolation", but the file APIs operate on the host filesystem (tempfile.mkdtemp), not on the container's FS β€” so the traversal escapes onto the host, not into the container.

The sister backend sandbox/sandlock.py already has the correct guard, _safe_sandbox_path (L167-184):

# src/praisonai/praisonai/sandbox/sandlock.py:167-184
def _safe_sandbox_path(self, path: str) -> Optional[str]:
    if not self._temp_dir:
        return None
    candidate = os.path.realpath(
        os.path.join(self._temp_dir, path.lstrip("/"))
    )
    sandbox_root = os.path.realpath(self._temp_dir)
    if not candidate.startswith(sandbox_root + os.sep):
        logger.warning("Path traversal attempt blocked: %s", path)
        return None
    return candidate

subprocess.py and docker.py just don't call any equivalent β€” confirmed by grep over both files.

Why it matters

The whole point of SubprocessSandbox / DockerSandbox is to safely run LLM-generated code and tool calls. The LLM chooses the path argument when it calls read_file / write_file / list_files. A prompt-injected agent can read /etc/passwd, ~/.ssh/id_rsa, ~/.aws/credentials, the project's .env, or overwrite arbitrary files the process can write to β€” completely outside the "sandbox" boundary.

Fix

Promote _safe_sandbox_path to a shared helper in sandbox/_compat.py (or _shell.py) and use it in every backend.

# src/praisonai/praisonai/sandbox/_compat.py
import os, logging
logger = logging.getLogger(__name__)

def safe_sandbox_path(temp_dir: str | None, path: str) -> str | None:
    if not temp_dir:
        return None
    candidate = os.path.realpath(os.path.join(temp_dir, path.lstrip("/")))
    sandbox_root = os.path.realpath(temp_dir)
    if not (candidate == sandbox_root or candidate.startswith(sandbox_root + os.sep)):
        logger.warning("Path traversal attempt blocked: %s", path)
        return None
    return candidate
# src/praisonai/praisonai/sandbox/subprocess.py
from ._compat import safe_sandbox_path

async def read_file(self, path: str) -> Optional[Union[str, bytes]]:
    full_path = safe_sandbox_path(self._temp_dir, path)
    if full_path is None or not os.path.exists(full_path):
        return None
    ...

async def write_file(self, path: str, content: Union[str, bytes]) -> bool:
    full_path = safe_sandbox_path(self._temp_dir, path)
    if full_path is None:
        return False
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    ...

async def list_files(self, path: str = "/") -> list[str]:
    full_path = safe_sandbox_path(self._temp_dir, path)
    if full_path is None or not os.path.exists(full_path):
        return []
    ...

Same change in docker.py. Add a regression test that calls read_file("../../../etc/passwd") and asserts None.


2. AgentsGenerator sync and async paths run different setup logic β€” silent divergence

Files:

  • src/praisonai/praisonai/agents_generator.py β€” generate_crew_and_kickoff (sync, L472-626)
  • src/praisonai/praisonai/agents_generator.py β€” _arun_framework (async, L660-795), dispatched from agenerate_crew_and_kickoff

The bug

Both methods independently reimplement ~150 lines of YAML normalization, tool resolution, framework adapter selection, and observability bootstrap. They are not just duplicated β€” they have silent behavioural divergence:

Step Sync (generate_crew_and_kickoff) Async (_arun_framework)
AutoGen v0.2 / v0.4 version selection missing L744-763
AgentOps init from AGENTOPS_API_KEY missing L765-772
_validate_cli_backend_compatibility missing L783-784
initial_adapter.resolve() (variant resolution) L599-600 missing
init_observability(adapter.name) L607-608 missing
adapter.setup(framework_tag=...) L611 missing
Tool instantiation strategy resolve(name, instantiate=True) (L561) resolve(name) + manual call (L702-713)

Grep confirms:

$ grep -n 'init_observability\|_validate_cli_backend_compatibility\|adapter.setup(\|agentops' \
       src/praisonai/praisonai/agents_generator.py
41:    "AGENTOPS_AVAILABLE": "agentops",
346:    def _validate_cli_backend_compatibility(self, config, framework):
607:        from .observability.hooks import init_observability
608:        init_observability(adapter.name)
611:        adapter.setup(framework_tag=adapter.name)
765:        # Initialize AgentOps if configured
769:                import agentops
770:                agentops.init(agentops_api_key, default_tags=[framework])
784:        self._validate_cli_backend_compatibility(config, framework)

Each of these symbols appears exactly once β€” i.e., each setup step exists in exactly one of the two paths.

Concrete tool-instantiation difference:

# Sync, agents_generator.py:561
resolved_tool = self.tool_resolver.resolve(tool_name, instantiate=True)
if resolved_tool is not None:
    tools_dict[tool_name] = resolved_tool
# Async, agents_generator.py:702-713
resolved_tool = self.tool_resolver.resolve(tool_name)
if resolved_tool is None:
    self.logger.warning(f"Tool '{tool_name}' not found")
    continue
tools_dict[tool_name] = (
    resolved_tool() if inspect.isclass(resolved_tool) else resolved_tool
)

The sync path's resolve(..., instantiate=True) swallows constructor exceptions inside ToolResolver; the async path lets them surface to its try/except. These are not equivalent.

Why it matters

  1. Observability blind spot in async mode. Long-running production servers use the async path; they get no init_observability() and no adapter.setup() hook. The shape that's most exposed to production traffic is the one that's under-instrumented.
  2. AutoGen version pinning is async-only. Users running praisonai run (sync) on a system where only AutoGen 0.4 is installed get no version switch.
  3. --cli-backend validation silently no-ops in sync. Misconfiguration produces a cryptic downstream error instead of the explicit _validate_cli_backend_compatibility message.
  4. Future drift compounds. Every new feature has to be added in two places; whoever forgets one creates another mode-specific bug.

Fix

Extract a single _prepare() method called by both entry points. The only thing that should differ is adapter.run(...) vs await adapter.arun(...):

# src/praisonai/praisonai/agents_generator.py
def _prepare(self) -> tuple[dict, FrameworkAdapter, dict]:
    """Shared by sync and async entry points."""
    config = self._load_config()
    if self.cli_config:
        self._merge_cli_config(config, self.cli_config)
    config = self._canonicalize_config(config)
    self._validate_agents_config(config)

    tools_dict = self._collect_tools(config)              # includes tools.py load
    framework = self._select_framework(config)            # incl. autogen v2/v4
    self._init_agentops(framework)
    self._validate_cli_backend_compatibility(config, framework)

    adapter = self._get_framework_adapter(framework).resolve()
    init_observability(adapter.name)
    adapter.setup(framework_tag=adapter.name)
    self.framework, self.framework_adapter = adapter.name, adapter
    return config, adapter, tools_dict

def generate_crew_and_kickoff(self):
    config, adapter, tools_dict = self._prepare()
    return adapter.run(
        config, self.config_list, config.get('input', ''),
        tools_dict=tools_dict,
    )

async def _arun_framework(self, _=None):
    config, adapter, tools_dict = self._prepare()
    return await adapter.arun(
        config, self.config_list, config.get('input', ''),
        tools_dict=tools_dict,
    )

Tool instantiation should also collapse to a single strategy inside _collect_tools, with ToolResolver.resolve(..., instantiate=True) (or its async-aware counterpart) being the only call site.


3. RateLimiter.acquire serializes all bot callers and leaks _channel_last_send forever

File: src/praisonai/praisonai/bots/_rate_limit.py:85-120

The bug

# src/praisonai/praisonai/bots/_rate_limit.py:85-120
async def acquire(self, channel_id: Optional[str] = None) -> None:
    async with self._lock:                                  # global lock
        now = time.monotonic()
        elapsed = now - self._last_refill
        self._tokens = min(
            self._config.burst_size,
            self._tokens + elapsed * self._config.messages_per_second
        )
        self._last_refill = now

        if self._tokens < 1.0:
            wait_time = (1.0 - self._tokens) / self._config.messages_per_second
            await asyncio.sleep(wait_time)                  # sleep holding the lock
            self._tokens = 1.0

        self._tokens -= 1.0

        if channel_id:
            last_send = self._channel_last_send.get(channel_id, 0.0)
            channel_elapsed = now - last_send               # `now` is stale
            if channel_elapsed < self._config.per_channel_delay:
                wait_time = self._config.per_channel_delay - channel_elapsed
                await asyncio.sleep(wait_time)              # sleep holding the lock
            self._channel_last_send[channel_id] = time.monotonic()   # unbounded dict

Three concrete problems:

  1. Lock held across await asyncio.sleep (lines 106 and 119). When one caller waits, every other caller β€” including ones targeting different channels with plenty of tokens β€” blocks on self._lock. Token-bucket semantics promise concurrency proportional to messages_per_second; the implementation delivers strict serialization regardless of channel. Telegram's PLATFORM_LIMITS of 25 msg/sec is effectively pinned to 1 in-flight send.
  2. now captured pre-sleep is reused after the global-token sleep (line 115). Real wall time has advanced by wait_time seconds, but now is still the pre-sleep value. channel_elapsed is underestimated β†’ redundant double-wait during bursts.
  3. _channel_last_send is never evicted. Discord / Slack / Telegram bots routinely see thousands of unique channel IDs over a process lifetime (DMs, ephemeral threads, deleted channels). Grep confirms only .get(...), [...] = ..., and .clear() in reset() β€” no scoped eviction. Steady memory leak + a slowdown of every subsequent acquire() as the hash table grows.

Why it matters

This sits on the bot send path β€” every outbound user message goes through acquire(). Under load:

  • Effective throughput collapses to the slowest channel's rate.
  • Memory grows linearly with unique channel count for the life of the process.
  • Pre-sleep now causes redundant waits during burst traffic.

Fix

Decouple state mutation (under lock) from sleeping (lock-free), and bound the channel dict with an LRU:

# src/praisonai/praisonai/bots/_rate_limit.py
from collections import OrderedDict

class RateLimiter:
    _CHANNEL_CAPACITY = 4096  # tune via config

    def __init__(self, config=None):
        self._config = config or RateLimitConfig()
        self._tokens = float(self._config.burst_size)
        self._last_refill = time.monotonic()
        self._channel_last_send: "OrderedDict[str, float]" = OrderedDict()
        self._lock = asyncio.Lock()

    async def acquire(self, channel_id: Optional[str] = None) -> None:
        # Phase 1: under lock, compute waits + reserve token + update last_send.
        async with self._lock:
            now = time.monotonic()
            self._tokens = min(
                self._config.burst_size,
                self._tokens + (now - self._last_refill) * self._config.messages_per_second,
            )
            self._last_refill = now

            global_wait = 0.0
            if self._tokens < 1.0:
                global_wait = (1.0 - self._tokens) / self._config.messages_per_second
                self._tokens = 1.0  # reserved; consumed below
            self._tokens -= 1.0

            channel_wait = 0.0
            if channel_id:
                last = self._channel_last_send.pop(channel_id, 0.0)
                projected_now = now + global_wait
                elapsed = projected_now - last
                if elapsed < self._config.per_channel_delay:
                    channel_wait = self._config.per_channel_delay - elapsed
                # LRU touch + bounded insertion
                self._channel_last_send[channel_id] = projected_now + channel_wait
                while len(self._channel_last_send) > self._CHANNEL_CAPACITY:
                    self._channel_last_send.popitem(last=False)

        # Phase 2: sleep OUTSIDE the lock so other channels proceed concurrently.
        total_wait = global_wait + channel_wait
        if total_wait > 0:
            await asyncio.sleep(total_wait)

This preserves the token-bucket math, removes serialization, fixes the stale-now double-wait, and caps memory.


Honorable mentions (not in top 3, worth flagging during the fix work)

  • tool_resolver.py:273 β€” fast-path cache read bypasses the instantiate flag. If the same tool is resolved with mixed instantiate=True/False calls, the cached value can come back as either a class or an instance depending on first-call order.
  • bots/_session.py:113-125 β€” _get_lock / _get_agent_lock lazily insert asyncio.Lock() into a dict; reap_stale / reset evict locks via .pop(...) without checking whether they're currently held, which can break in-flight chats on a shared agent.

Validation summary

Each finding was confirmed by directly reading the offending file in this tree:

Finding Verified by
Sandbox traversal Read subprocess.py L269-336, docker.py L340-407, sandlock.py L167-184. Confirmed realpath/startswith guard exists only in sandlock.py.
Generator sync/async divergence Read agents_generator.py L472-626 and L660-795 in full. Grep for init_observability, _validate_cli_backend_compatibility, adapter.setup(, agentops shows each symbol appearing in exactly one of the two methods.
Rate limiter serialization + leak Read _rate_limit.py end-to-end. Confirmed both await asyncio.sleep(...) calls are inside async with self._lock:, now is captured once at L92 and reused at L115, and _channel_last_send has no eviction.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingclaudeAuto-trigger Claude analysisdocumentationImprovements or additions to documentationperformance

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions