A complete Retrieval-Augmented Generation (RAG) system built with TypeScript for React Native component documentation. It provides intelligent search and question answering over component docs using local models.
- Interactive Chat Interface: Modern web UI with real-time streaming responses
- MCP Server: Model Context Protocol server for AI assistant integration
- Server-Sent Events (SSE): Live streaming of AI responses for better UX
- Local RAG Pipeline: Complete RAG implementation using LangChain.js
- Vector Search: LanceDB for efficient similarity search
- Local Embeddings: @xenova/transformers for text embeddings
- Local LLM: Llama3 via Ollama for answer generation
- REST API: Express.js with comprehensive endpoints
- Context Display: Visual representation of retrieved documents
- Message History: Persistent chat history with localStorage
- Responsive Design: Works seamlessly on desktop and mobile
- Swagger Documentation: Interactive API documentation
- TypeScript: Full type safety and modern development experience
- Modular Architecture: Clean, extensible codebase
- Node.js >= 20.0.0
- Ollama installed and running locally
- Llama3 model pulled in Ollama
```bash
# Install Ollama (macOS)
brew install ollama

# Start the Ollama service
ollama serve

# Pull the Llama3 model
ollama pull llama3
```

- Clone and install dependencies:

```bash
cd rn-base-component-rag
npm install
```

- Configure environment:

```bash
cp .env.example .env
# Edit the .env file with your preferences
```

- Start the server:

```bash
npm start
```

The server will automatically:
- Initialize the embedding model
- Load and index all documentation from `./docs`
- Start the API server on port 3000 (configurable)
- Serve the interactive chat interface at http://localhost:3000
Configure the system via environment variables in .env:
```bash
# Server Configuration
PORT=3000
NODE_ENV=development

# Model Configuration
MODEL=llama3
OLLAMA_BASE_URL=http://localhost:11434

# Embedding Model Configuration
EMBEDDING_MODEL=Xenova/bge-base-en-v1.5

# Vector Database Configuration
LANCEDB_PATH=./data/lancedb
# Must match the embedding model's output size (768 for bge-base-en-v1.5)
VECTOR_DIMENSION=768

# RAG Configuration
TOP_K_RESULTS=5
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
```

Access the interactive chat interface at http://localhost:3000.
- Real-time Streaming: Responses stream in real-time using Server-Sent Events
- Context Sidebar: View retrieved documents that inform each response
- Message History: Persistent chat history saved locally
- Component Tags: Quick access to all available React Native components
- Example Questions: Pre-built queries to get started quickly
- Responsive Design: Works on desktop, tablet, and mobile devices
- Configurable Settings: Adjust streaming mode and context document count
- Open http://localhost:3000 in your browser
- Type your question about React Native components
- Watch the AI respond in real-time with streaming text
- View the context documents used to generate the response
- Continue the conversation with follow-up questions
The system includes a Model Context Protocol (MCP) server that allows AI assistants like Claude or Cursor to directly access your React Native documentation.
Development mode (with TypeScript compilation):

```bash
npm run mcp
```

Production mode (using compiled JavaScript):

```bash
npm run build    # Build the project first
npm run mcp:prod
```

Add this to your MCP client configuration (e.g., Cursor settings):
For development:

```json
{
  "mcpServers": {
    "rn-base-component": {
      "command": "npm",
      "args": ["run", "mcp"],
      "cwd": "/path/to/rn-base-component-rag"
    }
  }
}
```

For production (recommended):

```json
{
  "mcpServers": {
    "rn-base-component": {
      "command": "npm",
      "args": ["run", "mcp:prod"],
      "cwd": "/path/to/rn-base-component-rag"
    }
  }
}
```
- `retrieve_context`: Search documentation with natural language queries.

  ```typescript
  await callTool('retrieve_context', { question: 'How to customize Button styling and handle press events?', limit: 5 });
  ```

- `search_by_metadata`: Filter documentation by component name or metadata.

  ```typescript
  await callTool('search_by_metadata', { filters: { component: 'Button' }, limit: 10 });
  ```

- `get_stats`: Get system statistics and configuration.

  ```typescript
  await callTool('get_stats', {});
  ```

- `rn-component://<ComponentName>`: Access complete documentation for a specific component.
- `rn-components://overview`: Get an overview of all available components.

```typescript
// Access complete Button documentation
const buttonDocs = await readResource('rn-component://Button');

// Get a system overview
const overview = await readResource('rn-components://overview');
```

- Direct AI Access: AI assistants can query your documentation without manual copy-pasting
- Context-Aware Responses: AI gets relevant, up-to-date information about your components
- Standardized Interface: Uses MCP protocol for consistent integration across different AI tools
- Real-time Updates: Always accesses the latest indexed documentation
`POST /api/chat/stream` - Streaming chat with Server-Sent Events

```bash
curl -X POST http://localhost:3000/api/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "How to implement form validation?"}'
```

`POST /api/chat/message` - Regular chat response

```bash
curl -X POST http://localhost:3000/api/chat/message \
  -H "Content-Type: application/json" \
  -d '{"query": "What is Button component?"}'
```

`POST /api/retrieve`

```bash
curl -X POST http://localhost:3000/api/retrieve \
  -H "Content-Type: application/json" \
  -d '{"query": "How to use Button component?"}'
```

`GET /api/retrieve/component/{componentName}`

```bash
curl http://localhost:3000/api/retrieve/component/Button
```

`POST /api/generate`

```bash
curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I customize Button styling?"}'
```

`POST /api/generate/component/{componentName}`

```bash
curl -X POST http://localhost:3000/api/generate/component/Button \
  -H "Content-Type: application/json" \
  -d '{"query": "How to customize styling?"}'
```

- `GET /api/status` - System health and statistics
- `POST /api/status/reindex` - Force a reindex of all documents
- `GET /api/status/components` - List available components
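The streaming endpoint emits Server-Sent Events. Below is a minimal sketch of an SSE frame parser a client could use; the blank-line-delimited `event:`/`data:` wire format is standard SSE, but the specific event names this server emits are an assumption, not documented here.

```typescript
// A minimal Server-Sent Events (SSE) frame parser. Each event in an SSE
// stream is a block of "field: value" lines terminated by a blank line.
interface SseEvent {
  event: string; // event name; "message" when the stream omits one
  data: string;  // concatenated data lines
}

export function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = 'message';
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith('event:')) event = line.slice(6).trim();
      else if (line.startsWith('data:')) data.push(line.slice(5).trim());
    }
    if (data.length > 0) events.push({ event, data: data.join('\n') });
  }
  return events;
}
```

In a browser client you would feed chunks from `fetch(...).body` through a `TextDecoder`, buffer until a blank line arrives, and pass complete frames to a parser like this.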
Interactive Swagger documentation is available at:
http://localhost:3000/api-docs
```
src/
├── index.ts              # Main Express server
├── api/                  # API route handlers
│   ├── retrieve.ts       # Document retrieval endpoints
│   ├── generate.ts       # Answer generation endpoints
│   └── status.ts         # System status endpoints
├── rag/                  # RAG pipeline components
│   ├── pipeline.ts       # Main RAG orchestrator
│   ├── embedder.ts       # Text embedding using @xenova/transformers
│   ├── loader.ts         # Document loading and chunking
│   ├── vectorStore.ts    # LanceDB vector database
│   ├── retriever.ts      # Similarity search and ranking
│   └── generator.ts      # LLM answer generation via Ollama
└── config/               # Configuration and utilities
    ├── modelConfig.ts    # Model and system configuration
    ├── logger.ts         # Winston logging setup
    └── swagger.ts        # OpenAPI documentation
```
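`modelConfig.ts` presumably centralizes the environment variables described above. A hypothetical sketch of such a module; the names and defaults are illustrative, not the file's actual contents:

```typescript
// Illustrative configuration loader mirroring the .env variables above.
interface RagConfig {
  port: number;
  model: string;
  ollamaBaseUrl: string;
  embeddingModel: string;
  lancedbPath: string;
  topKResults: number;
  chunkSize: number;
  chunkOverlap: number;
}

export function loadConfig(
  env: Record<string, string | undefined> = process.env,
): RagConfig {
  return {
    port: Number(env.PORT ?? 3000),
    model: env.MODEL ?? 'llama3',
    ollamaBaseUrl: env.OLLAMA_BASE_URL ?? 'http://localhost:11434',
    embeddingModel: env.EMBEDDING_MODEL ?? 'Xenova/bge-base-en-v1.5',
    lancedbPath: env.LANCEDB_PATH ?? './data/lancedb',
    topKResults: Number(env.TOP_K_RESULTS ?? 5),
    chunkSize: Number(env.CHUNK_SIZE ?? 1000),
    chunkOverlap: Number(env.CHUNK_OVERLAP ?? 200),
  };
}
```

Centralizing defaults this way keeps the rest of the pipeline free of raw `process.env` reads.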
Development mode with auto-reload:

```bash
npm run dev
```

Build TypeScript:

```bash
npm run build
```

Run tests:

```bash
npm test
```

The system includes comprehensive logging for performance monitoring:
- Embedding Generation: Time to generate embeddings
- Vector Search: Similarity search performance
- LLM Generation: Answer generation timing
- End-to-End: Complete RAG pipeline timing
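Stage timings like these can be captured with a small wrapper. An illustrative sketch; the project's actual Winston-based logging may differ:

```typescript
// Illustrative timing helper for the pipeline stages listed above
// (embedding, vector search, LLM generation, end-to-end).
export async function timed<T>(
  label: string,
  fn: () => Promise<T>,
  log: (msg: string) => void = console.log,
): Promise<T> {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    // Logged even if fn throws, so failed stages are still timed.
    log(`${label} took ${Date.now() - start}ms`);
  }
}
```

Wrapping each stage call (e.g. `timed('embedding', () => embedder.embed(query))`) yields per-stage timings without cluttering the pipeline code.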
Update the `EMBEDDING_MODEL` environment variable:

```bash
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2
# or
EMBEDDING_MODEL=Xenova/all-mpnet-base-v2
```

Keep `VECTOR_DIMENSION` in sync with the chosen model's output size (e.g. 384 for all-MiniLM-L6-v2, 768 for all-mpnet-base-v2).

Change the Ollama model:

```bash
MODEL=llama3:70b
# or
MODEL=mistral
# or
MODEL=codellama
```

Optimize for your document structure:

```bash
CHUNK_SIZE=1500       # Larger chunks for more context
CHUNK_OVERLAP=300     # More overlap for better continuity
TOP_K_RESULTS=10      # More context documents
```
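The interaction of `CHUNK_SIZE` and `CHUNK_OVERLAP` can be illustrated with a simplified fixed-window splitter. This is only a sketch: the actual loader uses LangChain.js text splitters, which prefer paragraph and sentence boundaries over fixed offsets.

```typescript
// Simplified chunker illustrating CHUNK_SIZE and CHUNK_OVERLAP: each chunk
// starts (size - overlap) characters after the previous one, so adjacent
// chunks share `overlap` characters of context.
export function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('require size > 0 and 0 <= overlap < size');
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Larger `size` gives the LLM more context per retrieved chunk; larger `overlap` reduces the chance a sentence is split across a chunk boundary, at the cost of more stored vectors.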
- Ollama Connection Error
  - Ensure Ollama is running: `ollama serve`
  - Check the base URL in `.env`
- Model Not Found
  - Pull the model: `ollama pull llama3`
  - Verify available models: `ollama list`
- Memory Issues
  - Reduce `CHUNK_SIZE` and `TOP_K_RESULTS`
  - Use smaller embedding models
- Slow Performance
  - Use quantized models
  - Reduce vector dimensions
  - Optimize chunk sizes
Check the logs for detailed error information:

```bash
tail -f logs/combined.log
tail -f logs/error.log
```

- MCP Provider: Adapt for the Model Context Protocol
- Multiple Vector Stores: Support for different databases
- Hybrid Search: Combine semantic and keyword search
- Caching: Redis for response caching
- Authentication: API key management
- Rate Limiting: Request throttling
- Monitoring: Metrics and alerting
MIT License - see LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
Built with ❤️ for the React Native community