A flexible, extensible AI agent backend built with NestJS—designed for running local, open-source LLMs (Llama, Gemma, Qwen, DeepSeek, etc.) via Docker Model Runner. Real-time streaming, Redis messaging, web search, and Postgres memory out of the box. No cloud APIs required!
- Clone the repository

  ```bash
  git clone <your-repo-url>
  cd <your-repo-folder>
  ```

- Copy and edit environment variables

  ```bash
  cp .env.example .env
  # Edit .env and fill in your model and service config
  ```

- Start the required services (Redis, PostgreSQL, local LLM) with Docker Compose

  ```bash
  docker compose up -d
  ```
- PostgreSQL: `localhost:5433`
- Redis: `localhost:6379`
- Local LLM runner: `localhost:12434` (Model Runner guide)
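For reference, here is a minimal sketch of what the compose file might define for the two data services. The image tags and credentials are assumptions, and the `ai_runner` service (configured per the Model Runner guide) is omitted; the repository's own `docker-compose.yml` is authoritative:

```yaml
# Illustrative sketch only - defer to the repository's docker-compose.yml.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: postgres   # assumed credential, not the project's
    ports:
      - "5433:5432"                 # exposed on host port 5433, as listed above
  redis:
    image: redis:7
    ports:
      - "6379:6379"
```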
- Install dependencies

  ```bash
  pnpm install
  ```

- Start the development server

  ```bash
  pnpm run start:dev
  ```
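NestJS apps listen on port 3000 by default, so unless the project overrides it in `main.ts` or via an environment variable, the API will be reachable at `http://localhost:3000`.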
See `.env.example` for all options. Key variables:

- `MODEL_BASE_URL` - e.g. `http://localhost:12434/engines/llama.cpp/v1`
- `MODEL_NAME` - e.g. `ai/gemma3:latest`, `llama-3`, `qwen`, `deepseek`
- `TAVILY_API_KEY` - for web search (Get your key)
- `REDIS_HOST`, `REDIS_PORT`, etc. - for Redis messaging
- `POSTGRES_*` - for memory
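Putting these together, a filled-in `.env` might look like the sketch below. The exact `REDIS_*` and `POSTGRES_*` variable names here are assumptions, so defer to `.env.example`:

```env
# Sketch only - .env.example is the authoritative list of variables
MODEL_BASE_URL=http://localhost:12434/engines/llama.cpp/v1
MODEL_NAME=ai/gemma3:latest
TAVILY_API_KEY=tvly-xxxxxxxxxxxx
REDIS_HOST=localhost
REDIS_PORT=6379
POSTGRES_HOST=localhost
POSTGRES_PORT=5433   # matches the host port mapped by Docker Compose
```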
- 🤖 Local, open-source LLMs (Llama, Gemma, Qwen, DeepSeek, etc.)
- 🌊 Real-time streaming responses
- 💾 Conversation history with Postgres memory
- 🌐 Web search integration (Tavily)
- 🧵 Custom ThreadService for conversations
- 📡 Redis pub/sub for real-time messaging
- 🎯 Clean, maintainable architecture
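To illustrate the pub/sub flow behind the streaming feature, here is a minimal sketch using `ioredis`. The function names are hypothetical; only the `agent-stream:${threadId}` channel naming is taken from the streaming code shown later in this README:

```typescript
// Hypothetical sketch of the pub/sub flow; the real messaging service may differ.
import Redis from 'ioredis';

// Redis requires separate connections for publishing and subscribing.
const publisher = new Redis({ host: 'localhost', port: 6379 });
const subscriber = new Redis({ host: 'localhost', port: 6379 });

// The agent publishes tokens on a per-thread channel as they are generated.
async function publishToken(threadId: string, token: string): Promise<void> {
  await publisher.publish(`agent-stream:${threadId}`, token);
}

// An SSE handler subscribes to the same channel and forwards each message.
async function subscribeToThread(threadId: string, onToken: (t: string) => void) {
  await subscriber.subscribe(`agent-stream:${threadId}`);
  subscriber.on('message', (_channel, message) => onToken(message));
}
```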
- This project is designed for local LLMs only, using Docker Model Runner.
- Supported models: Llama, Gemma, Qwen, DeepSeek, and other open-source models.
- Set `MODEL_BASE_URL` and `MODEL_NAME` in your `.env`.
- Start the `ai_runner` service with Docker Compose.
- For other providers, see Agent Initializr.
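Once the runner is up, you can sanity-check it by querying its OpenAI-compatible API directly. This sketch assumes the default `MODEL_BASE_URL` and `MODEL_NAME` values shown above:

```bash
curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/gemma3:latest",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```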
- Set `TAVILY_API_KEY` in `.env`
- Example usage in code:

  ```typescript
  AgentFactory.createAgent(
    ModelProvider.LOCAL,
    [new TavilySearch({ maxResults: 5, topic: 'general' })],
    postgresCheckpointer,
  );
  ```
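Since the tools are passed as an array to `AgentFactory.createAgent`, adding capabilities beyond web search should be a matter of appending further LangChain-compatible tools to that array, though the factory's exact contract is defined in `src/agent/`.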
```text
src/
├── agent/       # AI agent implementation
├── api/         # HTTP endpoints and DTOs
└── messaging/   # Redis messaging service
```
- `POST /api/agent/chat` - Send a message to the agent
- `GET /api/agent/stream` - Stream agent responses (SSE)
- `GET /api/agent/history/:threadId` - Get conversation history
- `GET /api/agent/threads` - List all threads
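As an illustration of calling these endpoints, the requests below assume the NestJS default port 3000 and a body with `threadId` and `message` fields; the actual request shapes are defined by the DTOs in `src/api/`:

```bash
# Hypothetical requests - field names and port are assumptions, check src/api/.
curl -X POST http://localhost:3000/api/agent/chat \
  -H "Content-Type: application/json" \
  -d '{"threadId": "demo-thread", "message": "What is NestJS?"}'

# Stream responses over SSE (-N disables curl's buffering)
curl -N "http://localhost:3000/api/agent/stream?threadId=demo-thread"

# Fetch the stored conversation history for a thread
curl http://localhost:3000/api/agent/history/demo-thread
```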
For a ready-to-use frontend, use agentailor-chat-ui, which is fully compatible with this backend.
This project uses Postgres for memory. You must initialize the checkpointer before chatting:
```typescript
// In agentService
async stream(message: SseMessageDto): Promise<Observable<SseMessage>> {
  const channel = `agent-stream:${message.threadId}`;
  // Run only once
  this.agent.initCheckpointer();
  // ...rest of code
}
```
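For context, a checkpointer initialization of this kind could be implemented with LangGraph's Postgres checkpointer. The sketch below assumes `@langchain/langgraph-checkpoint-postgres` and an illustrative connection string; the project's actual `initCheckpointer` may differ:

```typescript
// Illustrative sketch - adapt to the project's actual implementation.
import { PostgresSaver } from '@langchain/langgraph-checkpoint-postgres';

async function initCheckpointer(): Promise<PostgresSaver> {
  const checkpointer = PostgresSaver.fromConnString(
    'postgresql://postgres:postgres@localhost:5433/postgres', // assumed credentials
  );
  // setup() creates the checkpoint tables and must run once before first use,
  // which is why the checkpointer has to be initialized before chatting.
  await checkpointer.setup();
  return checkpointer;
}
```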
- This project is opinionated for local, open-source LLMs only.

For more details and project resources, visit Agent Initializr.