A Retrieval Augmented Generation (RAG) system that transforms your Outline knowledge base into an intelligent, queryable AI assistant. This project ingests documentation from Outline, processes it using advanced chunking techniques, and provides a conversational interface powered by large language models.
⚠️ Disclaimer: This is not a production-ready project; read the Project Reflection & Recommendations section before using it in a real-world scenario.
This system creates a bridge between your Outline documentation and AI, enabling you to:
- Query your knowledge base using natural language questions
- Get contextual answers based on your actual documentation content
- Access your knowledge through multiple interfaces (chat, MCP server)
- Maintain up-to-date embeddings of your documentation
- 📥 Ingestion: Fetches and processes documents from the Outline API
- 🧠 Intelligent Chunking: Uses agentic chunking to create semantically meaningful document segments
- 🔍 Vector Search: Leverages PostgreSQL with pgvector for fast similarity search
- 🤖 AI-Powered Responses: Uses LLMs for generation and embeddings
- 🔌 MCP Integration: Exposes functionality through the Model Context Protocol for external clients
- 💬 Interactive Chat: Exposes a CLI-based chat interface for querying your RAG locally
```
┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│   Outline API   │─────▶│    Ingestion     │─────▶│  Vector Store   │
│   (Documents)   │      │ (Agentic Chunker)│      │  (PostgreSQL +  │
│                 │      │                  │      │    pgvector)    │
└─────────────────┘      └──────────────────┘      └─────────────────┘
                                                            │
                                                            ▼
┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│ Chat Interface  │◀─────│   RAG Workflow   │◀─────│    Retriever    │
│  or MCP Client  │      │   (LangGraph +   │      │                 │
│                 │      │    LangChain)    │      │                 │
└─────────────────┘      └──────────────────┘      └─────────────────┘
```
- Node.js 18+ and npm
- PostgreSQL database with pgvector extension
- Outline instance with API access
- API keys for Groq and Hugging Face
```bash
git clone https://github.com/mateodevia/agentic-rag-outline.git
cd agentic-rag-outline
npm install
```
Create a `.env` file with the following variables:
```env
# Database Configuration
PG_CONNECTION_STRING=postgresql://user:password@localhost:5432/your_db

# Outline API Configuration
OUTLINE_URL=https://your-outline-instance.com
OUTLINE_API_KEY=your_outline_api_key

# AI Model Configuration
GROQ_API_KEY=your_groq_api_key
HUGGINGFACEHUB_API_KEY=your_huggingface_api_key

# RAG Configuration (New!)
LANGUAGE=english
COMPANY_CONTEXT=Your Company Name and relevant context for better responses

# Optional: LangSmith Tracing
LANGSMITH_TRACING=false
LANGSMITH_API_KEY=your_langsmith_api_key
```
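Before running anything, it can help to fail fast on missing configuration. Below is a minimal sketch of such a startup check, assuming the `dotenv` package is installed and the variable names match the template above; this helper is illustrative, not part of the repo:

```typescript
import "dotenv/config"; // populate process.env from the .env file

// Required variables, matching the .env template above
const required = [
  "PG_CONNECTION_STRING",
  "OUTLINE_URL",
  "OUTLINE_API_KEY",
  "GROQ_API_KEY",
  "HUGGINGFACEHUB_API_KEY",
] as const;

// Throw early with a clear message instead of failing mid-pipeline
for (const name of required) {
  if (!process.env[name]) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
}
```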
Ensure your PostgreSQL database has the pgvector extension installed:
```sql
CREATE EXTENSION IF NOT EXISTS vector;
```
Run the ingestion pipeline:
```bash
# Development mode
npm run dev:ingest

# Production mode
npm run start:ingest
```
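The first thing ingestion has to do is pull documents out of Outline. Here is a minimal sketch of that step, assuming only Outline's documented `POST /api/documents.list` endpoint with bearer-token auth (pagination and error handling omitted; this is not the project's actual `document-service.ts`):

```typescript
// Fetch one page of documents from the Outline API.
// Outline's REST API is POST-based; see https://www.getoutline.com/developers
async function listDocuments(): Promise<any[]> {
  const res = await fetch(`${process.env.OUTLINE_URL}/api/documents.list`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OUTLINE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ limit: 25 }),
  });
  if (!res.ok) throw new Error(`Outline API error: ${res.status}`);
  const { data } = await res.json();
  return data; // each entry includes id, title, and text (markdown)
}
```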
Start the interactive chat interface:
```bash
# Development mode
npm run dev:chat

# Production mode
npm run start:chat
```
```
┌────────────────────────────────────────┐
│         AI Assistant Terminal          │
└────────────────────────────────────────┘
Welcome to the interactive chat interface!
Type your questions or use the following commands:
  /context - Toggle context visibility
  /exit    - Quit the application
  /help    - Show this help message
──────────────────────────────────────────

You: How do I set up API authentication in Outline?

Assistant: To set up API authentication for your Outline instance, you need to configure the following environment variables in your .env file:

1. OUTLINE_URL - Your Outline instance URL (e.g., https://your-outline-instance.com)
2. OUTLINE_API_KEY - Your Outline API key

You can obtain your API key from your Outline settings under the API section. Make sure the API key has the necessary permissions to read documents and collections.
──────────────────────────────────────────

You: /context
Context visibility is now ON

You: What are the main features?

Assistant: Based on your documentation, the main features include:
- Intelligent document ingestion from Outline
- Advanced agentic chunking for better context
- Vector search using PostgreSQL + pgvector
- AI-powered responses using LLM models
- Multiple interfaces (chat and MCP server)

Context: [Retrieved documents would appear here in magenta when context is enabled]
──────────────────────────────────────────

You:
```
Run the MCP server:
```bash
# Development mode
npm run dev:mcp

# Production mode
npm run start:mcp
```
To get more information on how to set up your MCP server, check the MCP-README.md file.
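As a rough illustration of what an MCP server for this use case looks like, here is a minimal sketch using the official `@modelcontextprotocol/sdk` TypeScript SDK; the tool name and the `answerQuestion` helper are hypothetical, so refer to MCP-README.md and `src/interfaces/mcp-server.ts` for the real setup:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical helper that would run retrieval + generation and return an answer
async function answerQuestion(question: string): Promise<string> {
  return `TODO: run the RAG pipeline for: ${question}`;
}

const server = new McpServer({ name: "outline-rag", version: "1.0.0" });

// Expose the knowledge base as a single MCP tool
server.tool(
  "query_knowledge_base",
  { question: z.string().describe("A natural-language question") },
  async ({ question }) => ({
    content: [{ type: "text", text: await answerQuestion(question) }],
  })
);

// Serve over stdio so MCP clients (e.g. AI assistants) can connect
await server.connect(new StdioServerTransport());
```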
- Document Ingestion (`src/rag/ingestion.ts`)
  - Fetches documents from your Outline instance using the API
  - Enriches documents with semantic context (parent documents, collections)
- Intelligent Chunking (`src/rag/agentic-chunker.ts`)
  - Uses an agentic approach to create semantically coherent chunks
  - Maintains context and relationships between document sections
  - Optimizes chunk size for embedding and retrieval performance
- Embedding & Storage
  - Generates embeddings and stores them in PostgreSQL with pgvector
- RAG Workflow (`src/rag/rag-workflow.ts`)
  - Retrieval: Performs similarity search to find relevant documents
  - Generation: Uses retrieved context to generate answers with an LLM
  - Built with LangGraph for robust pipeline management (see the sketch after this list)
- Chat Interface: Direct conversation with your knowledge base
- MCP Server: Standardized protocol for integration with AI assistants
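A stripped-down sketch of such a retrieve-then-generate graph, modeled on the standard LangGraph.js RAG pattern rather than this repo's exact `rag-workflow.ts` (the `vectorStore` import and the Groq model name are assumptions):

```typescript
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";
import { ChatGroq } from "@langchain/groq";
import type { Document } from "@langchain/core/documents";
import { vectorStore } from "./retriever"; // assumption: a ready pgvector store instance

const llm = new ChatGroq({ model: "llama-3.3-70b-versatile" }); // model name is an assumption

// Shared state flowing through the graph
const State = Annotation.Root({
  question: Annotation<string>,
  context: Annotation<Document[]>,
  answer: Annotation<string>,
});

// Retrieval node: similarity search over the vector store
const retrieve = async (state: typeof State.State) => ({
  context: await vectorStore.similaritySearch(state.question, 4),
});

// Generation node: answer strictly from the retrieved context
const generate = async (state: typeof State.State) => {
  const docs = state.context.map((d) => d.pageContent).join("\n\n");
  const res = await llm.invoke(
    `Answer using only this context:\n\n${docs}\n\nQuestion: ${state.question}`
  );
  return { answer: res.content as string };
};

export const ragGraph = new StateGraph(State)
  .addNode("retrieve", retrieve)
  .addNode("generate", generate)
  .addEdge(START, "retrieve")
  .addEdge("retrieve", "generate")
  .addEdge("generate", END)
  .compile();
```

Invoking the compiled graph with `await ragGraph.invoke({ question: "..." })` returns the final state, including the generated `answer` and the retrieved `context`.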
- TypeScript/Node.js: Core runtime and type safety
- LangChain & LangGraph: RAG workflow orchestration and LLM management
- PostgreSQL + pgvector: Vector database for embeddings
- Model Context Protocol: Standardized AI tool integration
- Multiple LLMs: Easily interchangeable LLMs for each process (see the sketch below)
  - Groq: Used for chunking and RAG querying
  - Hugging Face: Used for vector embeddings
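That split might look roughly like the following, with one Groq chat model shared by the chunker and the RAG query step, and a Hugging Face Inference model for embeddings. This is a minimal sketch; both model names are assumptions, not necessarily what `llm-config.ts` actually configures:

```typescript
import { ChatGroq } from "@langchain/groq";
import { HuggingFaceInferenceEmbeddings } from "@langchain/community/embeddings/hf";

// Chat model used for agentic chunking and for answering RAG queries
export const chatModel = new ChatGroq({
  apiKey: process.env.GROQ_API_KEY,
  model: "llama-3.3-70b-versatile", // assumption: any Groq-hosted chat model works
  temperature: 0,
});

// Embedding model used when writing to and querying the pgvector store
export const embeddings = new HuggingFaceInferenceEmbeddings({
  apiKey: process.env.HUGGINGFACEHUB_API_KEY,
  model: "sentence-transformers/all-MiniLM-L6-v2", // assumption
});
```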
```
src/
├── database/                     # Database connection and utilities
│   ├── database.ts               # PostgreSQL + pgvector setup
│   └── singleton-connection.ts   # Connection management
├── interfaces/                   # User interfaces
│   ├── chat.ts                   # Interactive CLI chat interface
│   └── mcp-server.ts             # Model Context Protocol server
├── outline-api/                  # Outline API integration
│   ├── collection-service.ts     # Collection operations
│   ├── document-service.ts       # Document operations & simplification
│   └── types.ts                  # API response types
└── rag/                          # RAG pipeline components
    ├── ingestion.ts              # Document ingestion workflow
    ├── rag-workflow.ts           # Enhanced RAG pipeline with full document context
    ├── rag-prompt.ts             # Custom RAG prompt templates (New!)
    ├── agentic-chunker.ts        # Intelligent document chunking
    ├── agentic-chunker-prompt.ts # Chunking prompt templates
    ├── document-retriever.ts     # Document retrieval utilities (New!)
    ├── retriever.ts              # Vector search configuration
    ├── llm-config.ts             # LLM model configurations
    └── types.ts                  # RAG-specific type definitions
```
This project was developed as an educational exploration of advanced RAG (Retrieval Augmented Generation) architectures, focusing on learning and experimentation rather than production deployment.
Through this implementation, we discovered that simpler approaches often yield better results with significantly less complexity. For production use cases requiring Outline integration, we recommend evaluating simpler alternatives such as this MCP Outline implementation, which offers:
- Reduced complexity: Easier to understand, implement, and maintain
- Faster setup: Minimal configuration requirements
- Lower barrier to entry: Less technical overhead for teams, and lower deployment costs
Before implementing this solution, we encourage you to evaluate simpler alternatives that may better suit your specific use case and technical requirements. This project serves as a valuable learning resource for understanding advanced RAG patterns, but may be over-engineered for many practical applications.
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request