A comprehensive AI Agent Platform with tracing, RAG, memory, skills, and workflow orchestration.
- 🔍 Full Observability: SQLite-based tracing with nested span tracking
- 🤖 ReAct Agent Loop: Think → Act → Observe pattern with tool calling
- 📚 RAG System: ChromaDB vector store with semantic search
- 🧠 Memory System: Short-term, long-term, and episodic memory
- ⚡ Skills Framework: Composable agent capabilities
- 📊 Workflow Engine: DAG-based task orchestration
- 🎯 Smart Routing: Cost-aware multi-LLM routing
- Local: Ollama (zero cost)
- Cloud: Anthropic Claude, OpenAI
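Cost-aware routing can be sketched roughly as picking the cheapest model that clears a quality bar within a budget. The model names, prices, and the `choose_model` helper below are illustrative assumptions, not the platform's actual router API:

```python
# Sketch of a cost-aware routing strategy. Prices and quality tiers are
# made-up illustrative numbers, not real provider pricing.
from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # USD; 0.0 for local Ollama models
    quality: int               # rough capability tier, higher is better

MODELS = [
    ModelOption("ollama/llama3", 0.0000, 1),
    ModelOption("claude-haiku", 0.0008, 2),
    ModelOption("claude-sonnet", 0.0090, 3),
]

def choose_model(models, min_quality, budget_per_1k):
    """Pick the cheapest model meeting the quality bar within budget."""
    candidates = [m for m in models
                  if m.quality >= min_quality
                  and m.cost_per_1k_tokens <= budget_per_1k]
    return min(candidates, key=lambda m: m.cost_per_1k_tokens) if candidates else None
```

With this strategy, routine requests fall through to the free local model, and only requests that demand a higher quality tier pay for a cloud model.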
Supports: PDF, DOCX, TXT, Markdown, CSV
```bash
# Clone the repository
git clone https://github.com/avaluev/agent-factory.git
cd agent-factory

# Set up Python environment
pyenv local 3.11.13  # or any Python 3.11+
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e .
```

Create a `.env` file:
```bash
# Anthropic (optional, for cloud models)
ANTHROPIC_API_KEY=your_key_here

# OpenAI (optional, for embedding fallback)
OPENAI_API_KEY=your_key_here

# Ollama (local models)
OLLAMA_BASE_URL=http://localhost:11434
```

```bash
# Start interactive agent session
agent run

# Ingest documents into knowledge base
agent ingest /path/to/documents

# List available skills
agent skills

# Check system status
agent status
```

```
agent-platform/
├── core/       # Agent core (ReAct loop, tool registry, model adapters)
├── tracing/    # Observability (span tracking, trace storage)
├── rag/        # RAG system (embeddings, vector store, ingestion)
├── memory/     # Memory systems (short-term, long-term, episodic)
├── skills/     # Skills framework (loader, executor, builtin skills)
├── workflows/  # Workflow engine (DAG execution, checkpointing)
├── router/     # Multi-LLM routing (cost-aware strategies)
├── mcp/        # MCP integration
└── config/     # Configuration files
```
User Input → System Prompt + Context → LLM → Tool Calls → Execute Tools → Loop
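The loop above can be sketched in a few lines. The `llm` and `tools` arguments below are stand-ins for the platform's real model adapters and tool registry, which this sketch does not depend on:

```python
# Minimal ReAct loop sketch: Think (LLM proposes), Act (run tool),
# Observe (feed result back), until the model returns a final answer.
def react_loop(llm, tools, user_input, max_iterations=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_iterations):
        reply = llm(messages)                   # Think: text answer or a tool call
        if reply.get("tool") is None:
            return reply["content"]             # final answer, stop looping
        tool_fn = tools[reply["tool"]]
        observation = tool_fn(**reply["args"])  # Act: execute the requested tool
        messages.append({"role": "tool", "content": str(observation)})  # Observe
    return "max iterations reached"
```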
Every operation creates spans:
`agent_run` → `agent_iteration` → `llm_call` + `tool_call`

- Stored in SQLite with full I/O, token counts, and cost
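Nested spans in SQLite can be walked with a recursive query. The `spans` table below is a toy stand-in built in-memory for illustration; the real schema in `tracing/` may differ:

```python
# Illustrative only: a toy parent-linked spans table and a recursive CTE
# that rebuilds the agent_run → agent_iteration → llm_call span tree.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spans (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO spans VALUES (?, ?, ?)", [
    (1, None, "agent_run"),
    (2, 1, "agent_iteration"),
    (3, 2, "llm_call"),
    (4, 2, "tool_call"),
])
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM spans WHERE parent_id IS NULL
        UNION ALL
        SELECT s.id, s.name, t.depth + 1
        FROM spans s JOIN tree t ON s.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY id
""").fetchall()
for name, depth in rows:
    print("  " * depth + name)  # indent each span by its depth in the tree
```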
Documents → Load → Chunk → Embed (Ollama) → Store (ChromaDB) → Query → Context
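The pipeline above can be sketched end to end with toy components: a character-frequency `embed` stands in for Ollama embeddings, and a plain list stands in for ChromaDB. None of this is the platform's real API:

```python
# Toy chunk → embed → store → query sketch; cosine similarity over
# character-frequency vectors substitutes for real embeddings.
import math

def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

docs = [
    "ChromaDB stores embedding vectors",
    "Ollama runs local language models",
    "Agents retrieve relevant context",
]
store = [(d, embed(d)) for d in docs]  # ingest: embed each chunk and keep it

def query(q, top_k=1):
    qv = embed(q)
    return sorted(store, key=lambda item: cosine(qv, item[1]), reverse=True)[:top_k]
```

The real pipeline swaps in document loaders, a chunker, Ollama embeddings, and ChromaDB's persistent index, but the retrieval shape is the same.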
- Short-term: Sliding window conversation buffer
- Long-term: SQLite + vector semantic search
- Episodic: Task execution history with success/failure tracking
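The short-term sliding window amounts to a bounded conversation buffer. The class name and `max_turns` parameter below are illustrative, not the platform's actual memory API:

```python
# Sketch of a sliding-window short-term memory: a bounded deque where
# the oldest turns fall off automatically once the window is full.
from collections import deque

class ShortTermMemory:
    def __init__(self, max_turns=4):
        self.buffer = deque(maxlen=max_turns)

    def add(self, role, content):
        self.buffer.append({"role": role, "content": content})

    def context(self):
        return list(self.buffer)

mem = ShortTermMemory(max_turns=2)
for i in range(3):
    mem.add("user", f"message {i}")
# Only the two most recent turns survive the window
```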
```bash
pytest tests/

# Type checking
mypy core/ rag/ memory/

# Linting
ruff check .
```

All operations are fully traced:

```python
from tracing import Tracer

tracer = Tracer.instance()

# Query recent traces
traces = tracer.store.get_recent_traces(limit=10)

# Get LLM cost summary
summary = tracer.store.get_llm_summary()
print(f"Total cost: ${summary['total_cost_usd']:.4f}")
```

Skills are composable capabilities:
```python
from skills import Skill, SkillMetadata, SkillResult, SkillStatus

class MySkill(Skill):
    def _default_metadata(self):
        return SkillMetadata(
            name="my_skill",
            version="1.0.0",
            description="Does something useful",
        )

    async def execute(self, inputs):
        # Your logic here
        return SkillResult(
            status=SkillStatus.SUCCESS,
            output={"result": "done"},
        )
```

Define DAG workflows:
```python
from workflows import WorkflowDefinition, WorkflowNode

workflow = WorkflowDefinition(
    name="data_pipeline",
    nodes=[
        WorkflowNode(id="ingest", task="ingest_data"),
        WorkflowNode(id="process", task="process_data", depends_on=["ingest"]),
        WorkflowNode(id="analyze", task="analyze_results", depends_on=["process"]),
    ],
)
```

Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details
Built with:
- Anthropic Claude - AI reasoning
- Ollama - Local model inference
- ChromaDB - Vector database
- FastAPI - API framework
- Typer - CLI framework
Agent Factory - Build intelligent, observable, and composable AI agents 🚀