34 changes: 34 additions & 0 deletions docs/best-practices/agent-retry-strategies.mdx
@@ -898,6 +898,40 @@ async def test_hedged_requests():
assert call_count == 2 # Both requests started
```

## Built-in Task Guardrail Retries

PraisonAI provides built-in retry functionality specifically for guardrail validation failures, distinct from the generic `ExponentialBackoffRetry` patterns above.

<Note>
Guardrail retries are handled automatically by the executor when `Task(guardrail=..., max_retries=...)` is configured. This is separate from manual retry implementations.
</Note>

```python
from praisonaiagents import Agent, Task, PraisonAIAgents

def validate_content(output):
"""Built-in guardrail with retry support"""
if len(output.raw.split()) < 50:
return False, "Content too short - needs at least 50 words"
return True, output

task = Task(
description="Write a detailed explanation",
agent=agent,
guardrail=validate_content,
max_retries=3, # Built-in executor-level retry
retry_with_feedback=True
    # [Review comment — medium] The `retry_with_feedback` parameter is not
    # supported by the Task class constructor in the current implementation.
    # Including it in this example will result in:
    #   TypeError: __init__() got an unexpected keyword argument 'retry_with_feedback'
    # Suggested change:
    #     max_retries=3  # Built-in executor-level retry
)

# The executor automatically handles:
# - Guardrail validation
# - Retry logic on failure
# - Feedback to agent on retry
# - Final failure after max_retries
```

This differs from the manual retry strategies above in that it is integrated into the task execution workflow and handles the retry loop at the executor level.
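The executor-level loop can be sketched roughly as follows. This is a conceptual illustration only, not PraisonAI's actual executor; `run_agent`, the `(bool, result)` guardrail convention, and the feedback handling are assumptions for the sketch:

```python
# Conceptual sketch of an executor-level guardrail retry loop.
# `run_agent` and `guardrail` are stand-ins, not PraisonAI internals.

def execute_with_guardrail(run_agent, guardrail, max_retries=3):
    feedback = None
    for attempt in range(max_retries + 1):
        output = run_agent(feedback)      # re-run the agent, passing feedback on retries
        ok, result = guardrail(output)    # (bool, validated-output-or-feedback) convention
        if ok:
            return result                 # validated output ends the loop
        feedback = result                 # failure message becomes retry feedback
    raise RuntimeError(f"Guardrail failed after {max_retries} retries: {feedback}")

# Usage: an agent stub that improves once it receives feedback.
calls = []

def fake_agent(feedback):
    calls.append(feedback)
    return "short" if feedback is None else "a much longer, corrected answer"

result = execute_with_guardrail(
    fake_agent,
    lambda out: (len(out.split()) >= 3, out if len(out.split()) >= 3 else "too short"),
)
```

The key design point the sketch captures: the guardrail's failure message is fed back into the next attempt rather than being discarded.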

## Conclusion

Implementing robust retry strategies is essential for building resilient multi-agent systems. By choosing the appropriate retry pattern and configuring it correctly, you can handle transient failures gracefully while avoiding issues like retry storms and cascading failures.
31 changes: 31 additions & 0 deletions docs/best-practices/memory-cleanup.mdx
@@ -89,6 +89,19 @@ class MemoryEfficientConversationManager:

### 2. Agent Memory Management

Memory construction is now thread-safe and async-safe. Concurrent `Task`s sharing a `memory_config` will coordinate through locks rather than each creating duplicate stores.

```python
from praisonaiagents import Agent, Task, PraisonAIAgents

memory_config = {"storage": {"path": "./shared.db"}, "provider": "file"}

agents = [Agent(name=f"A{i}", instructions="Summarize one line.") for i in range(4)]
tasks = [Task(description=f"Summarize doc {i}.", agent=agents[i], config={"memory_config": memory_config}) for i in range(4)]

PraisonAIAgents(agents=agents, tasks=tasks).start()
```

Implement memory limits and cleanup for agents:

```python
@@ -371,6 +384,24 @@ class AutomaticMemoryManager:
schedule.every(5).minutes.do(conditional_cleanup)
```

### 3. Agent Garbage Collection Safety Net

Since PR #1514, `Agent.__del__` runs a best-effort `close_connections()` during garbage collection as a safety net. However, this may be skipped by the Python interpreter and **must not be relied upon**. Always use explicit cleanup:
**Review comment — medium:** The documentation states that `Agent.__del__` runs `close_connections()` as a safety net. However, the implementation in `praisonaiagents/agent/agent.py` (line 4824) shows that `Agent.__del__` is an empty method (`pass`) and does not perform any cleanup. This claim is currently inaccurate.


```python
# Preferred: Explicit cleanup
agent = Agent(name="Analyst", instructions="Analyze data.")
try:
result = agent.start("Analyze quarterly numbers.")
finally:
agent.close() # Guaranteed cleanup

# Better: Context manager (recommended)
with Agent(name="Analyst", instructions="Analyze data.") as agent:
result = agent.start("Analyze quarterly numbers.")
# Cleanup happens automatically here
```

## Best Practices

### 1. Use Context Managers
31 changes: 30 additions & 1 deletion docs/features/guardrails.mdx
@@ -254,7 +254,7 @@ guardrail = LLMGuardrail(

### Retry Configuration

Configure retry behaviour for failed validations:
Configure retry behaviour for failed validations (works in both sync and async execution paths):

```python
task = Task(
@@ -267,6 +267,35 @@ task = Task(
)
```

**Execution Order**: Guardrail validation → retry (if failed) or pass → memory/user callbacks → task marked `completed`.

### How Retries Work

When a guardrail validation fails:
1. **Before PR #1514**: Async execution bypassed retry logic
2. **After PR #1514**: Both sync and async execution paths properly retry failed validations

The executor increments `task.retry_count`, sets `task.status = "in progress"`, logs the retry, and continues the execution loop. On final failure after `max_retries`, it raises an exception. When a guardrail returns a modified `TaskOutput` or string, the downstream `task.result` and any memory callbacks receive the **modified** value, not the original.

```python
from praisonaiagents import Agent, Task, PraisonAIAgents

def must_mention_price(output):
ok = "$" in output.raw
return (ok, output if ok else "Rewrite and include a price in USD.")

agent = Agent(name="Writer", instructions="Write a one-line product blurb.")

task = Task(
description="Write a blurb for a coffee mug.",
agent=agent,
guardrail=must_mention_price,
max_retries=3,
)

PraisonAIAgents(agents=[agent], tasks=[task]).start()
```
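The modified-output behaviour described above can be illustrated with a plain-function sketch. The `run_task` helper and string-based output here are stand-ins for the real `Task` machinery, used only to show the propagation:

```python
# Sketch: a guardrail that rewrites the output; downstream consumers
# receive the modified value, not the agent's original text.

def redact_guardrail(output: str):
    cleaned = output.replace("SECRET-123", "[redacted]")
    return True, cleaned  # validation passes, but with a transformed result

def run_task(raw_output: str, guardrail):
    ok, result = guardrail(raw_output)
    if not ok:
        raise ValueError(result)
    task_result = result   # the task's result gets the modified value
    memory_seen = result   # memory callbacks also see the modified value
    return task_result, memory_seen

task_result, memory_seen = run_task("key is SECRET-123", redact_guardrail)
```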

### Composite Guardrails

Combine multiple validation criteria:
16 changes: 16 additions & 0 deletions docs/features/resource-lifecycle.mdx
@@ -83,6 +83,7 @@ sequenceDiagram
User->>Team: (exit with block)
Team->>Agents: close() each
Team->>Memory: close()
Note right of Memory: Closes SQLite, MongoDB, etc.
Team-->>User: cleanup complete
```

@@ -208,6 +209,21 @@ async with PraisonAIAgents(agents=[agent]) as workflow:
```
</Accordion>

<Accordion title="MongoDB connections are now included in cleanup">
Since PR #1514, `Memory.close_connections()` also closes MongoDB clients when present. Multiple calls to `close_connections()` are safe (idempotent). Agent `__del__` provides a safety net but should not be relied upon:
**Review comment — medium:** There are two discrepancies here: 1) `Memory.close_connections()` in `praisonaiagents/memory/memory.py` (line 1946) does not contain logic to close MongoDB clients (`self.mongo_client`). 2) As noted elsewhere, `Agent.__del__` is currently a `pass` and does not provide a safety net.


```python
# Explicit cleanup (preferred)
with Agent(name="Analyst", instructions="Analyze quarterly numbers.") as agent:
agent.start("Summarize Q1 revenue.")
# MongoDB / SQLite / registered connections closed here.

# Async form
async with Agent(name="Analyst", instructions="...") as agent:
await agent.astart("...")
```
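An idempotent `close_connections()` can be sketched like this. It is illustrative only: the `MemorySketch` class and its attributes are assumptions, not the real `Memory` internals (which, per the review note above, may not yet handle MongoDB at all):

```python
# Sketch of an idempotent close: repeated calls are safe because the
# connection handles are cleared after the first close.

class MemorySketch:
    def __init__(self):
        self.sqlite_conn = object()   # stand-ins for real connection handles
        self.mongo_client = object()

    def close_connections(self):
        if self.sqlite_conn is not None:
            self.sqlite_conn = None   # real code would call .close() first
        if self.mongo_client is not None:
            self.mongo_client = None  # real code would call .close() first

mem = MemorySketch()
mem.close_connections()
mem.close_connections()  # second call is a no-op, no error
```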
</Accordion>

<Accordion title="Don't reuse a team after exiting its with block">
Once you exit a `with` block, consider the AgentTeam closed. Create a new one for additional work.

12 changes: 12 additions & 0 deletions docs/features/thread-safety.mdx
@@ -53,6 +53,18 @@ Prior to PR #1488, chat_history mutations bypassed thread-safety locks at 31+ ca
- `chat_history` setter now acquires the `AsyncSafeState` lock for assignments
</Note>

#### What changed in PR #1514

<Note>
PR #1514 enhanced thread-safety in three key areas:

**1. Locked Memory Initialization**: `Task.initialize_memory()` now uses `threading.Lock` with double-checked locking pattern. A new async variant `initialize_memory_async()` uses `asyncio.Lock` and offloads construction with `asyncio.to_thread()` to prevent event loop blocking.
**Review comment — medium:** The implementation of `Task.initialize_memory()` in `praisonaiagents/task/task.py` (line 531) does not use a `threading.Lock` or a double-checked locking pattern. Furthermore, the mentioned async variant `initialize_memory_async()` is not defined in the `Task` class.
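Whether or not it matches the current code (see the review note above), the double-checked locking pattern being described looks roughly like this. `TaskSketch` and `factory` are illustrative names, not PraisonAI APIs:

```python
import threading

class TaskSketch:
    """Illustrative double-checked locking for lazy memory creation."""
    _memory_lock = threading.Lock()

    def __init__(self):
        self.memory = None

    def initialize_memory(self, factory):
        if self.memory is None:               # first check, no lock (fast path)
            with self._memory_lock:
                if self.memory is None:       # second check, under the lock
                    self.memory = factory()   # only one thread constructs
        return self.memory

# Usage: eight threads race, but the store is built exactly once.
t = TaskSketch()
created = []

def factory():
    created.append(1)
    return {"store": "shared"}

threads = [threading.Thread(target=t.initialize_memory, args=(factory,)) for _ in range(8)]
for th in threads:
    th.start()
for th in threads:
    th.join()
```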


**2. Async-Locked Workflow State**: New `_set_workflow_finished(value)` method uses async locks to safely update workflow completion status across concurrent tasks.
**Review comment — medium:** The `_set_workflow_finished(value)` method described here is not present in the `AgentTeam` (or `Agents`) implementation in `praisonaiagents/agents/agents.py`.
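The async-locked flag pattern being described can be sketched as follows. Given the review note above, treat this purely as an illustration of the pattern, with `WorkflowSketch` as a hypothetical stand-in:

```python
import asyncio

class WorkflowSketch:
    """Illustrative async-locked completion flag."""
    def __init__(self):
        self._finished = False
        self._lock = asyncio.Lock()

    async def _set_workflow_finished(self, value: bool):
        async with self._lock:      # serialize concurrent updates
            self._finished = value

    async def is_finished(self) -> bool:
        async with self._lock:      # reads take the same lock
            return self._finished

async def main():
    wf = WorkflowSketch()
    # Many tasks racing to mark completion still leave a consistent flag.
    await asyncio.gather(*[wf._set_workflow_finished(True) for _ in range(10)])
    return await wf.is_finished()

finished = asyncio.run(main())
```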


**3. Non-Mutating Task Context**: Task execution no longer mutates `task.description` during runs. Per-execution context is stored in `_execution_context` field, keeping the user-facing `task.description` stable across multiple executions.
**Review comment — medium:** The `Task` class in `praisonaiagents/task/task.py` does not contain an `_execution_context` field. Task execution still appears to rely on instance attributes, which contradicts the claim that `task.description` is kept stable via a separate context storage.
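The non-mutating design being claimed can be sketched as follows. `TaskSketch`, its `run` method, and the `_execution_context` field are hypothetical here (the review note above indicates the field does not exist in the current code):

```python
# Sketch: keep the user-facing description stable by storing
# per-run context in a separate field instead of mutating description.

class TaskSketch:
    def __init__(self, description: str):
        self.description = description
        self._execution_context = None        # hypothetical per-run field

    def run(self, context: str) -> str:
        self._execution_context = context     # per-run data goes here...
        prompt = f"{self.description}\n\nContext: {context}"
        return prompt                         # ...description stays untouched

task = TaskSketch("Summarize the report.")
task.run("Q1 data")
task.run("Q2 data")  # description is unchanged across executions
```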

</Note>

#### Safe operations

```python