This project is a high-performance experimentation environment for AI Agents. It was designed to serve as a testing laboratory where developers can validate decision logic, tool chains, and state persistence. The system is flexible enough to operate with a fully mocked RAG (Retrieval-Augmented Generation) infrastructure for rapid testing or can be integrated with production RAG APIs, serving as a solid foundation that can be evolved into real-world use cases.
- Multi-LLM Provider: Native integration with Google Gemini and OpenAI GPT, allowing you to switch between models via environment variables.
- Engine: Node.js + TypeScript
- Database: SQLite for memory persistence and full data traceability.
- Validation: Zod for rigorous validation of tools and configurations.
- Security Engine: Multi-layered defense system with Regex Sanitization and LLM-based Guardrails.
- Security Guardrails (Defense in Depth): - Physical Layer: Regex-based sanitization to prevent delimiter collision (Triple Quotes & XML tags).
- Semantic Layer: Specialized
SecurityServicethat uses a dedicated LLM call to detect "Prompt Injection", "System Leakage", and "Instruction Override". - Structural Layer: Automatic wrapping of user input in
<user_query>tags and triple quotes for structural isolation.
- Semantic Layer: Specialized
- Dual RAG Strategy: Supports a mocked knowledge base for fast validation or real RAG APIs for high-fidelity testing.
- Data Provenance: The system tracks and saves the source of information, allowing you to audit which documents or tools were consulted for each answer.
- Idempotency Architecture: Strict control via
stepId, ensuring consistency during repeated automated test runs. - Memory Compression: Built-in
[SUMMARIZER]module that automatically triggers "Compressing history..." every 6 messages, optimizing context and reducing token costs. - Flexible Reporting: Performance and cost metrics can be viewed in real-time on the screen or exported for later analysis.
- Production-Ready Evolution: Although designed as a playground, the architecture follows software standards (SOLID) that allow for a direct transition to a production environment.
- Install dependencies:
npm installCreate a .env file in the root directory:
# Options: 'gemini' or 'openai'
LLM_PROVIDER=gemini
# Model Versions
GEMINI_MODEL=gemini-1.5-flash
OPENAI_MODEL=gpt-4o
AI_COST_PER_1K_TOKENS=0.002
# --- API KEYS ---
GEMINI_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
# --- APP CONFIG ---
NODE_ENV=development
ENABLE_COST_LOGS=true
ENABLE_IDEMPOTENCY_CHECK=trueThe project includes a set of scripts for development, cleaning, and reporting:
npm run dev: Runs the project in development mode using tsx.
npm run build: Compiles the TypeScript code to JavaScript.
npm run start: Runs the compiled version from the dist folder.
npm run report: Generates a visual dashboard with session metrics.
npm run export: Exports detailed execution logs for external analysis.
npm run clean:db: Deletes the SQLite database and temporary WAL files.
npm run reset: Cleans the database and starts a fresh development run.The TelemetryService monitors every interaction to ensure total control over the agent:
- Action: Logs whether there was a direct response or tool usage (with latency logging).
- Tokens & Costs: Real-time calculation of token consumption and estimated cost per session.
- Reports: Terminal visualization with the option to export detailed execution logs.