docs: update guardrails, thread-safety, and resource-lifecycle pages for PR #1514 architectural fixes #228

@MervinPraison

Context

This issue tracks documentation updates required by the architectural fixes landed in MervinPraison/PraisonAI#1514 (merged 2026-04-22, fixes issue #1507).

PR #1514 changes runtime behavior in three areas that are user-visible and already documented. The existing docs pages now describe behavior that no longer matches the SDK. Any future agent reading the docs will write code against behaviors that changed, so these pages must be corrected.

Placement rule reminder (per AGENTS.md §1.8): do not edit docs/concepts/. All updates below go to docs/features/ and docs/best-practices/. docs/concepts/guardrails.mdx is mentioned only as a cross-reference — do not modify it without explicit human approval.


Source of truth — PR #1514

PR: MervinPraison/PraisonAI#1514
Head SHA: 971d217c44d8643b77b4aa13fc3da94f4a3da8e6
Files changed (under praisonai-package/src/praisonai-agents/ in the PR; these map to repo-root praisonaiagents/ in PraisonAIDocs):

  • praisonaiagents/agent/agent.py (+18 / −2)
  • praisonaiagents/agents/agents.py (+34 / −2)
  • praisonaiagents/memory/memory.py (+9 / −0)
  • praisonaiagents/process/process.py (+31 / −17)
  • praisonaiagents/task/task.py (+77 / −60)
  • test_architectural_fixes.py (tests)

Before writing any documentation content, an implementing agent must read the SDK files above from the synced praisonaiagents/ tree at repo root (per AGENTS.md §1.1 and §1.3 — SDK-first verification).


Gap 1 — Guardrail retry now actually runs

What changed in the SDK

  • praisonaiagents/agents/agents.py:1017-1038 — guardrail validation was moved out of the callback and into the main async execution loop (arun_task).
  • praisonaiagents/task/task.py — the guardrail branch was removed from execute_callback(). execute_callback now only runs memory/user callbacks. The docstring explicitly says: "Guardrail validation has been moved to the execution path in agents.py to ensure proper retry behavior."
  • execute_callback_sync() no longer uses fire-and-forget loop.create_task(...). It always goes through run_coroutine_from_any_context(...) so exceptions from the callback propagate instead of being silently swallowed.
  • On guardrail failure the executor now does the real retry: increments task.retry_count, sets task.status = "in progress", logs the retry, and continues the loop. On final failure it raises "Task failed guardrail validation after {max_retries} retries".
  • On guardrail success with a modified result, task_output.raw (or the whole TaskOutput) is replaced and task.result is updated before the task is marked completed.
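
The retry behavior described above can be sketched without the SDK. This is a minimal illustration, not the code in agents.py: Output, run_with_guardrail, and the produce callable are hypothetical stand-ins, and the exact way the SDK counts retries is an assumption. Only the guardrail return shape (a `(bool, value)` tuple) and the error message come from this issue.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass
class Output:
    """Hypothetical stand-in for TaskOutput; only `raw` is modelled."""
    raw: str


GuardrailResult = Tuple[bool, Union[Output, str]]


def run_with_guardrail(
    produce: Callable[[int], Output],
    guardrail: Callable[[Output], GuardrailResult],
    max_retries: int = 3,
) -> Tuple[Output, int]:
    """Return (accepted output, retries used) or raise once retries run out."""
    retry_count = 0
    while True:
        output = produce(retry_count)
        ok, result = guardrail(output)
        if ok:
            # A passing guardrail may hand back a modified value; that value,
            # not the original, is what the task is completed with.
            if isinstance(result, Output):
                output = result
            elif isinstance(result, str):
                output = Output(raw=result)
            return output, retry_count
        if retry_count >= max_retries:
            raise RuntimeError(
                f"Task failed guardrail validation after {max_retries} retries"
            )
        retry_count += 1  # the task would go back to "in progress" here


def must_mention_price(output: Output) -> GuardrailResult:
    ok = "$" in output.raw
    return (ok, output if ok else "Rewrite and include a price in USD.")


# Three canned drafts simulate an agent that only gets it right on the third try.
drafts = ["A sturdy mug.", "A sturdy mug for coffee.", "A sturdy mug, $12."]
accepted, retries_used = run_with_guardrail(
    lambda attempt: Output(drafts[attempt]), must_mention_price
)
```

In this sketch the first two drafts fail validation and the third passes, so the loop returns only after the guardrail accepts a result; memory or user callbacks would run after that point.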

Why this matters for docs

docs/features/guardrails.mdx (and docs/best-practices/agent-retry-strategies.mdx where it intersects with guardrails) already tell users that max_retries / retry_delay / retry_with_feedback drive retries on failed validations. Before PR #1514 that was at best partially true, and only in the sync path; in the async path the retry was bypassed. Users who wrote production async workflows before PR #1514 may have been silently losing validation failures. The docs should:

  1. State plainly that guardrail retries apply to both sync and async execution paths.
  2. Make clear the retry happens before the task is marked completed (ordering matters for memory/user callbacks — those only fire on a guardrail-passing result).
  3. Clarify that when a guardrail returns a modified TaskOutput/str, the downstream task.result and any memory callback receive the modified value, not the original.
  4. Remove (or correct) any older wording implying that a failed guardrail in async mode would be logged but not retried.
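
For background on how failures could go unnoticed, the sketch below contrasts a fire-and-forget asyncio task with an awaited coroutine. It uses only the standard library and does not call the SDK's run_coroutine_from_any_context; failing_callback and both wrapper functions are hypothetical names chosen for illustration.

```python
import asyncio


async def failing_callback() -> None:
    raise ValueError("callback failed")


async def fire_and_forget() -> str:
    # The coroutine is scheduled as a task and never awaited: the caller
    # carries on, and the exception stays parked on the task object.
    task = asyncio.create_task(failing_callback())
    await asyncio.sleep(0.01)  # give the task time to run and fail
    outcome = "caller saw no error"
    task.exception()  # read it only to silence asyncio's "never retrieved" log
    return outcome


async def awaited() -> str:
    # Awaiting the coroutine directly lets the exception reach the caller.
    try:
        await failing_callback()
    except ValueError as exc:
        return f"caller saw: {exc}"
    return "caller saw no error"


fire_and_forget_result = asyncio.run(fire_and_forget())
awaited_result = asyncio.run(awaited())
```

The first call completes as if nothing went wrong, while the second surfaces the ValueError to the caller.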

Files to update

File | Action
docs/features/guardrails.mdx | Update. Audit the "Retry behaviour" / "How retries work" sections (around lines 257–266, 465–474, 682, 893). Add a short "Execution order" subsection: guardrail → retry-or-pass → memory/user callbacks → completed.
docs/best-practices/agent-retry-strategies.mdx | Update. Add a short subsection or callout clarifying that Task(guardrail=..., max_retries=...) drives first-class retries inside the executor, distinct from the generic ExponentialBackoffRetry patterns the page already teaches.

Minimum content to add (agent-centric, beginner-friendly, per AGENTS.md §1.1 rule 9)

from praisonaiagents import Agent, Task, PraisonAIAgents

def must_mention_price(output):
    ok = "$" in output.raw
    return (ok, output if ok else "Rewrite and include a price in USD.")

agent = Agent(name="Writer", instructions="Write a one-line product blurb.")

task = Task(
    description="Write a blurb for a coffee mug.",
    agent=agent,
    guardrail=must_mention_price,
    max_retries=3,
)

PraisonAIAgents(agents=[agent], tasks=[task]).start()

Do not introduce a new page — extend the existing one.


Gap 2 — Thread-safe state: _set_workflow_finished, _execution_context, and locked memory init

What changed in the SDK

  • praisonaiagents/process/process.py
    • New async-locked setter _set_workflow_finished(value) backed by self._get_state_lock().
    • _check_all_tasks_completed() is now async and uses the setter; a new sync sibling _check_all_tasks_completed_sync() is used from the sync workflow path.
    • Both the async (aworkflow) and sync (workflow) paths no longer mutate task.description. The per-execution context is now stored in a dedicated current_task._execution_context field. The previous destructive task.description.split('Input data from previous tasks:')[0] reset and the task.description = task._original_description + context concatenation have been removed.
  • praisonaiagents/task/task.py
    • initialize_memory() now uses a threading.Lock with double-checked locking.
    • New async def initialize_memory_async(self) uses an asyncio.Lock and offloads Memory(...) construction with asyncio.to_thread(...) so it doesn't block the event loop.
    • Config access is defensive: self.config.get('memory_config', {}).get('storage', {}).get('path') instead of the old self.config['memory_config']['storage']['path'].
    • execute_callback() now calls await self.initialize_memory_async() (not the sync one) when running in async context.

Why this matters for docs

docs/features/thread-safety.mdx and docs/best-practices/memory-cleanup.mdx already describe concurrent use, but neither mentions:

  • That concurrent tasks sharing a memory_config are now safe to initialize from many threads/coroutines (previously a benign-looking race).
  • That user code should not read task.description expecting per-execution context to be appended to it — that mutation is gone. The per-run context lives on _execution_context. This is the documented boundary: user-facing task.description is now stable across runs.
  • That the sync and async workflow paths use different completion-check methods (informational, but relevant for anyone subclassing Process).
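
The second point, a description that stays stable across runs, can be pictured with a small sketch. SketchTask and its run_context field are hypothetical; run_context merely stands in for the internal per-run field, and how the SDK actually assembles its prompt is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SketchTask:
    """Hypothetical task; `run_context` stands in for the internal per-run field."""
    description: str
    run_context: str = ""

    def prompt(self) -> str:
        # The prompt is assembled on demand; `description` is never rewritten.
        if not self.run_context:
            return self.description
        return (
            f"{self.description}\n\n"
            f"Input data from previous tasks:\n{self.run_context}"
        )


task = SketchTask(description="Summarize doc 1.")
for previous_result in ["first result", "second result"]:
    task.run_context = previous_result  # replaced on each run, not appended
    latest_prompt = task.prompt()
```

After two runs the description is unchanged and the prompt carries only the latest context, so nothing accumulates between executions.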

Files to update

File | Action
docs/features/thread-safety.mdx | Update. Add a subsection "What changed in PR #1514" mirroring the existing "What changed in PR #1488" callout at line 46. Cover: locked memory init (sync + async variants), async-locked workflow_finished, and non-mutating per-run task context.
docs/best-practices/memory-cleanup.mdx | Update. In section "2. Agent Memory Management" (line 90), add a short note that Memory construction is now thread-/async-safe and that concurrent Tasks sharing a memory_config will coordinate through the lock rather than each creating a duplicate store.

Minimum content to add

A thread-safety snippet showing the supported case (multiple tasks, one shared memory_config, concurrent init):

from praisonaiagents import Agent, Task, PraisonAIAgents

memory_config = {"storage": {"path": "./shared.db"}, "provider": "file"}

agents  = [Agent(name=f"A{i}", instructions="Summarize one line.") for i in range(4)]
tasks   = [Task(description=f"Summarize doc {i}.", agent=agents[i], config={"memory_config": memory_config}) for i in range(4)]

PraisonAIAgents(agents=agents, tasks=tasks).start()

Do not document the private names (_set_workflow_finished, _execution_context, _memory_init_lock) as user-facing API — they are internal. Just describe the user-observable guarantees.


Gap 3 — Agent/Memory lifecycle: MongoDB cleanup + lightweight __del__

What changed in the SDK

  • praisonaiagents/memory/memory.py: close_connections() now also closes the MongoDB client if one exists:
    if hasattr(self, 'mongo_client') and self.mongo_client:
        self.mongo_client.close()
        self.mongo_client = None
  • praisonaiagents/agent/agent.py
    • New instance flag self._closed = False initialized in __init__.
    • __del__ changed from a no-op ("Destructor safely does nothing to avoid GC pollution in test loops") to a lightweight finalizer that, if _closed is still False, calls self._memory_instance.close_connections() inside a try/except and then sets _closed = True. Exceptions during GC are swallowed (finalizers must not raise).
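
The combination of an idempotent close and a finalizer that acts only as a fallback can be sketched as follows. FakeClient, SketchMemory, and SketchAgent are hypothetical classes written for this illustration; the attribute names mirror the ones quoted above, but the bodies are assumptions and not the SDK's code.

```python
class FakeClient:
    """Hypothetical stand-in for a database client with a close() method."""

    def __init__(self) -> None:
        self.close_calls = 0

    def close(self) -> None:
        self.close_calls += 1


class SketchMemory:
    def __init__(self, client: FakeClient) -> None:
        self.mongo_client = client

    def close_connections(self) -> None:
        if hasattr(self, "mongo_client") and self.mongo_client:
            self.mongo_client.close()
            self.mongo_client = None  # makes a second call a no-op


class SketchAgent:
    def __init__(self, memory: SketchMemory) -> None:
        self._memory_instance = memory
        self._closed = False

    def close(self) -> None:
        if not self._closed:
            self._memory_instance.close_connections()
            self._closed = True

    def __del__(self) -> None:
        # Safety net only: a finalizer must never raise.
        try:
            self.close()
        except Exception:
            pass


client = FakeClient()
memory = SketchMemory(client)
agent = SketchAgent(memory)
agent.close()
agent.close()  # second call does nothing
memory.close_connections()  # also safe: the client reference is already None
del agent  # the finalizer finds _closed already True
```

However cleanup is reached, the underlying client is closed exactly once in this sketch.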

Why this matters for docs

docs/features/resource-lifecycle.mdx and docs/best-practices/memory-cleanup.mdx already teach async with / with / explicit .close(). What they do not yet cover:

  1. MongoDB users specifically now get their client closed when Memory.close_connections() runs. Previously this was a real leak in long-running apps using Mongo-backed memory.
  2. GC-time cleanup is now a safety net, not the recommended path. Docs must still tell users to use context managers or call .close() explicitly — the __del__ fallback is intentionally minimal and silent, and is not a substitute for explicit cleanup.
  3. Calling .close_connections() twice is safe (idempotent via _closed / mongo_client = None).

Files to update

File | Action
docs/features/resource-lifecycle.mdx | Update. Extend "How It Works" (line 69) and "Best Practices" (line 182) with: (a) MongoDB included in close_connections, (b) __del__ as a safety net only, (c) idempotency note. Keep the existing async with / explicit .close() guidance front-and-center.
docs/best-practices/memory-cleanup.mdx | Update. In "3. Resource Pool Management" (line 156), add MongoDB to the list of connections cleaned up. In "Automatic Garbage Collection" (line 323), clarify that the new Agent.__del__ runs a best-effort close_connections() but may be skipped by the interpreter and must not be relied on.

Minimum content to add

Explicit-cleanup example (preferred):

from praisonaiagents import Agent

with Agent(name="Analyst", instructions="Analyze quarterly numbers.") as agent:
    agent.start("Summarize Q1 revenue.")
# MongoDB / SQLite / registered connections closed here.

Async form (mention that aclose() and async with both route through the same cleanup):

async with Agent(name="Analyst", instructions="...") as agent:
    await agent.astart("...")
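
How with, async with, and aclose() can all funnel into one cleanup routine is sketched below. SketchResource is a hypothetical class and not the SDK's Agent; it only illustrates the routing idea using the standard context-manager protocols.

```python
import asyncio


class SketchResource:
    """Hypothetical resource; not the SDK's Agent."""

    def __init__(self) -> None:
        self.cleanups = 0
        self._closed = False

    def close(self) -> None:
        if not self._closed:
            self.cleanups += 1
            self._closed = True

    async def aclose(self) -> None:
        self.close()  # the async path reuses the same cleanup

    def __enter__(self) -> "SketchResource":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()

    async def __aenter__(self) -> "SketchResource":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.aclose()


with SketchResource() as sync_res:
    pass  # work would happen here


async def main() -> SketchResource:
    async with SketchResource() as res:
        pass  # awaited work would happen here
    await res.aclose()  # already closed, so nothing more happens
    return res


async_res = asyncio.run(main())
```

Both resources record exactly one cleanup, including the one that received an extra aclose() call.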

Out of scope / do not touch

  • docs/concepts/guardrails.mdx — concepts folder is human-approved only (AGENTS.md §1.8).
  • docs/js/**, docs/rust/**, and anything under docs/sdk/reference/** — auto-generated, do not edit by hand (AGENTS.md §1.7).
  • docs.json — only update if a new page is added. This issue requests updates to existing pages, so no docs.json changes should be needed. If a new sub-page is considered, it must be placed under docs/features/ and added to the "Features" group only (not "Concepts").

Acceptance checklist for the implementing agent

Before opening the PR, confirm:

  • Read the 5 SDK files listed under "Source of truth" from repo-root praisonaiagents/ (not src/). Verify every behavior claim in the updated docs against that source. No guessing.
  • docs/features/guardrails.mdx updated: retries work in sync + async; execution order (guardrail → retry/pass → memory/user callbacks → completed); modified TaskOutput propagates.
  • docs/best-practices/agent-retry-strategies.mdx updated: short callout tying Task(guardrail=..., max_retries=...) to the built-in executor-level retry.
  • docs/features/thread-safety.mdx updated: new "What changed in PR #1514" subsection in the style of the existing PR #1488 one.
  • docs/best-practices/memory-cleanup.mdx updated: concurrent memory init is safe; GC cleanup is a safety net, not the path.
  • docs/features/resource-lifecycle.mdx updated: MongoDB included in close_connections; __del__ described as best-effort; idempotency noted.
  • No changes to docs/concepts/, docs/js/, docs/rust/, or docs/sdk/reference/.
  • All code examples are copy-paste runnable, use the friendly from praisonaiagents import ... imports (AGENTS.md §6.1), and are agent-centric (placed at the top of any new section).
  • Every new/updated section has a Mermaid diagram using the standard color scheme where the content teaches a flow or decision (AGENTS.md §3).
  • No private names (_set_workflow_finished, _execution_context, _memory_init_lock, _closed) surfaced as user API.
  • Frontmatter (title, sidebarTitle, description, icon) preserved; existing <Steps>, <AccordionGroup>, <CardGroup> structure respected.

cc @MervinPraison

Labels: claude (Trigger Claude Code analysis), documentation (Improvements or additions to documentation), update
