README.md: 10 additions & 10 deletions
@@ -1,19 +1,19 @@
-# Mimir
+##Mimir
 
-A comprehensive **contextual RAG** (Retrieval Augmented Generation) system with MCP (Model Context Protocol) integration for both **documentation and TypeScript codebases**. Mimir ingests documentation and TypeScript code from GitHub repositories into a Supabase vector store and provides powerful querying capabilities through both REST API and MCP protocol. Unlike basic RAG, contextual RAG provides rich context around each code entity, including full file content, imports, and surrounding code.
+A comprehensive **contextual RAG** (Retrieval Augmented Generation) system with MCP (Model Context Protocol) integration for both **documentation and codebases**. Mimir ingests documentation and source code (currently **TypeScript** and **Python**, with more languages planned) from GitHub repositories into a Supabase vector store and provides powerful querying capabilities through both REST API and MCP protocol. Unlike basic RAG, contextual RAG provides rich context around each code entity, including full file content, imports, and surrounding code.
 
 ## Projects
 
 This repository contains two main components:
 
 ### [mimir-rag](./mimir-rag)
 
-The core RAG server that handles ingestion and querying of both **documentation (MDX)** and **TypeScript codebases**.
+The core RAG server that handles ingestion and querying of both **documentation (MDX)** and **codebases** (TypeScript, Python, and easily extensible to more languages).
 
 **Features:**
-- Ingests documentation and TypeScript code from GitHub repositories into Supabase vector store
+- Ingests documentation and source code from GitHub repositories into Supabase vector store
 - Supports separate repositories for code and documentation
 - MCP endpoint for semantic document search (`/mcp/ask`)
@@ -98,8 +98,8 @@ See the [mimir-mcp README](./mimir-mcp/README.md) for detailed setup instruction
 ## Workflow
 
 1. **Ingestion Phase:**
-   - mimir-rag fetches documentation (MDX) and TypeScript code from configured GitHub repository(ies)
-   - TypeScript files are parsed to extract entities (functions, classes, interfaces, exported const functions)
+   - mimir-rag fetches documentation (MDX) and code from configured GitHub repository(ies)
+   - Code files are parsed to extract language-specific entities (TypeScript entities, Python functions/classes/methods, etc.)
    - **Contextual RAG**: Each entity is enriched with surrounding context - full file content, imports, parent classes, and related code
    - Documents are chunked into smaller segments with rich contextual information
    - Chunks are embedded using your chosen LLM provider
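Note on the parsing step in the hunk above: the diff does not show how mimir-rag extracts Python entities, so the following is only a minimal sketch of the idea, written with Python's standard `ast` module. The names `Entity`, `extract_entities`, and `SAMPLE` are hypothetical and are not taken from the Mimir codebase.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """Hypothetical record for one extracted code entity (not Mimir's schema)."""
    kind: str               # "function", "class", or "method"
    name: str               # qualified name, e.g. "Greeter.greet"
    parent: Optional[str]   # enclosing class for methods, otherwise None
    start_line: int
    end_line: int


def extract_entities(source: str) -> List[Entity]:
    """Collect top-level functions, classes, and the methods of those classes."""
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    entities: List[Entity] = []
    for node in ast.parse(source).body:
        if isinstance(node, func_types):
            entities.append(
                Entity("function", node.name, None, node.lineno, node.end_lineno)
            )
        elif isinstance(node, ast.ClassDef):
            entities.append(
                Entity("class", node.name, None, node.lineno, node.end_lineno)
            )
            # Methods keep a reference to their parent class as context.
            for child in node.body:
                if isinstance(child, func_types):
                    entities.append(
                        Entity(
                            "method",
                            f"{node.name}.{child.name}",
                            node.name,
                            child.lineno,
                            child.end_lineno,
                        )
                    )
    return entities


SAMPLE = '''import os

def load(path):
    return open(path).read()

class Greeter:
    def greet(self, name):
        return "hi " + name
'''

if __name__ == "__main__":
    for entity in extract_entities(SAMPLE):
        print(entity)
```

Recording the parent class and line range per entity is one plausible way to support the "parent classes" and source-link context that the README describes; nested functions and nested classes are deliberately ignored here to keep the sketch short.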
@@ -121,9 +121,9 @@ See the [mimir-mcp README](./mimir-mcp/README.md) for detailed setup instruction
 
 ## Use Cases
 
-- **AI-Powered Code Assistant**: Let your AI coding assistant query your TypeScript codebase in real-time - find functions, classes, and understand code structure
+- **AI-Powered Code Assistant**: Let your AI coding assistant query your codebase in real-time - find functions, classes, and understand code structure (supports TypeScript, Python, and more)
 - **AI-Powered Documentation Assistant**: Let your AI coding assistant query your docs in real-time
-- **Codebase Understanding**: Index your entire TypeScript project - functions, classes, interfaces, and exported const functions
+- **Codebase Understanding**: Index your entire codebase - functions, classes, interfaces, and other language-specific entities
 - **Internal Knowledge Base**: Index internal wikis, API docs, or technical documentation
 - **Customer Support**: Provide accurate, context-aware answers from your documentation
 - **Developer Onboarding**: Help new developers quickly find information in your codebase and documentation
@@ -135,7 +135,7 @@ See the [mimir-mcp README](./mimir-mcp/README.md) for detailed setup instruction
 - **Node.js**: 20 or later
 - **Supabase**: Vector store for embeddings and document storage
 - **LLM Provider**: API key for OpenAI, Anthropic, Google, or Mistral
-- **GitHub**: Repository with documentation (MDX) and/or TypeScript code to ingest (optional)
+- **GitHub**: Repository with documentation (MDX) and/or code (TypeScript, Python, etc.) to ingest (optional)
mimir-rag/README.md: 28 additions & 21 deletions
@@ -1,6 +1,6 @@
-# mimir-rag
+##mimir-rag
 
-Utility CLI + API that ingests **documentation (MDX) and TypeScript codebases** into Supabase using **contextual RAG** and exposes OpenAI-compatible chat completions, MCP endpoints, and ingestion endpoints. Perfect for making your entire codebase and documentation queryable by AI assistants with rich contextual understanding.
+Utility CLI + API that ingests **documentation (MDX) and codebases** into Supabase using **contextual RAG** and exposes OpenAI-compatible chat completions, MCP endpoints, and ingestion endpoints. It currently supports **TypeScript** and **Python** code, and is designed to be easily extensible to additional languages. Perfect for making your entire codebase and documentation queryable by AI assistants with rich contextual understanding.
-You can configure separate repositories for TypeScript code and MDX documentation:
+You can configure separate repositories for code and MDX documentation. Code repositories can contain TypeScript, Python, or any other supported language – the ingestion pipeline is language-agnostic at the repository level.
-When configured, TypeScript files will be ingested from the code repository and MDX files from the docs repository. Source URLs for TypeScript files will automatically use the code repository URL.
+When configured, code files will be ingested from the code repository and MDX files from the docs repository. Source URLs for code files will automatically use the code repository URL.
 
 ### Parser Configuration
 
@@ -191,27 +191,34 @@ Control what gets extracted from your codebase:
-This contextual RAG approach allows the AI to understand not just the entity itself, but also how it fits into the larger codebase - what it imports, what it's part of, and how it's used. This enables more accurate and contextually-aware answers with direct links to source code.
+This contextual RAG approach allows the AI to understand not just the entity itself, but also how it fits into the larger codebase—what it imports, what it's part of, and how it's used. This enables more accurate and contextually-aware answers with direct links to source code.
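Note on the paragraph changed above: the diff does not include the code that assembles this context, so the sketch below only illustrates one way an entity's source could be prefixed with file-level context (path and imports) before it is embedded. It uses Python's standard `ast` module; the function names `collect_imports` and `build_contextual_chunk`, the header layout, and the path `src/io_utils.py` are all assumptions made for illustration, not Mimir's actual format.

```python
import ast
from typing import List


def collect_imports(source: str) -> List[str]:
    """Return the source text of every top-level import statement."""
    return [
        ast.get_source_segment(source, node) or ""
        for node in ast.parse(source).body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]


def build_contextual_chunk(path: str, source: str, entity_name: str) -> str:
    """Prefix one top-level entity's code with its file path and imports."""
    entity_types = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.parse(source).body:
        if isinstance(node, entity_types) and node.name == entity_name:
            body = ast.get_source_segment(source, node) or ""
            # Hypothetical header layout: path, entity name, then imports.
            header = [f"File: {path}", f"Entity: {entity_name}", "Imports:"]
            header += [f"  {line}" for line in collect_imports(source)]
            return "\n".join(header + ["---", body])
    raise KeyError(f"{entity_name} not found in {path}")


EXAMPLE_SOURCE = (
    "import os\n"
    "from pathlib import Path\n"
    "\n"
    "def load(path):\n"
    "    return Path(path).read_text()\n"
)

if __name__ == "__main__":
    print(build_contextual_chunk("src/io_utils.py", EXAMPLE_SOURCE, "load"))
```

Embedding the combined text rather than the bare function body is what lets a query about, say, file reading match `load` through its `pathlib` import; a fuller version would presumably also add the parent class and surrounding code that the README mentions.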