gitRAG

RAG-based GitHub Repo Analysis Platform
Analyse any public GitHub repository with LLM-powered chat and advanced semantic search.


(Demo video)

⭐ Overview

Situation

As a participant in open-source competitions and project exhibitions (EPICS, university projects), I often struggled to deeply understand large codebases—especially when onboarding new repositories from group members or exploring unfamiliar open-source projects. Sifting through thousands of files, dependencies, and scattered documentation was tedious and overwhelming, making it hard to answer even basic questions like "Where is X implemented?" or "How does this module work?"

Task

I needed a platform that would let me:

  • Instantly chat with any GitHub repo to ask questions about code, architecture, or logic.
  • Quickly visualize and explore repo structure, file contents, and metadata.
  • Perform semantic code search (not just by filename/text).
  • Support multiple users and projects securely for my team and in competitions.

Action

I independently designed and built gitRAG—an end-to-end, multi-tenant platform that ingests any public GitHub repo, chunks and indexes its code using embeddings and vector search, and enables users to interactively chat, search, and analyse codebases using a modern LLM (via LangChain and OpenAI API).

  • Built secure, scalable backend using FastAPI, PostgreSQL (Aiven), PineconeDB, and LangChain.
  • Developed a modern React frontend with hierarchical file explorer, real-time AI chat, and repo analytics.
  • Integrated Google/GitHub OAuth2 for authentication and per-user encrypted API key management for privacy (see the key-encryption sketch after this list).
  • Engineered ingestion pipelines to chunk, embed, and index 50MB+ codebases with 10,000+ files.
  • Tested and deployed the platform on multiple real-world repos for open-source events and university project groups.
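The per-user key handling mentioned above can be illustrated with a minimal sketch using Fernet symmetric encryption from Python's `cryptography` package. The helper names and the `FERNET_KEY` environment variable are assumptions made for this example, not the project's actual code:

```python
import os
from cryptography.fernet import Fernet

# Server-side secret (urlsafe base64, 32 bytes) loaded from the environment.
fernet = Fernet(os.environ["FERNET_KEY"])

def encrypt_api_key(plaintext_key: str) -> bytes:
    """Encrypt a user's OpenAI API key before persisting it in PostgreSQL."""
    return fernet.encrypt(plaintext_key.encode())

def decrypt_api_key(ciphertext: bytes) -> str:
    """Decrypt only at request time; the plaintext key is never returned to the client."""
    return fernet.decrypt(ciphertext).decode()
```

Keeping the Fernet secret outside the database means a leaked table alone does not expose users' API keys.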

Result

  • Significantly reduced onboarding time for new repositories: context, explanations, and code Q&A now arrive in seconds.
  • Enabled my team and me to confidently tackle larger, more complex projects in hackathons and coursework.
  • gitRAG is now a robust, reusable tool for anyone needing rapid understanding of unfamiliar codebases.

🚀 Features

  • LLM-powered code chat: Ask questions about repo structure, functions, or files—get contextual, AI-driven answers.
  • Semantic code search: Find relevant code snippets by meaning, not just keywords (see the search sketch after this list).
  • Hierarchical file explorer: Browse and preview the full repo tree with metadata and analytics.
  • Multi-user & multi-repo support: Secure, per-user data isolation with Google/GitHub OAuth2.
  • Repo analytics: Visualize language breakdown, file types, contributors, and more.
  • Encrypted API key management: User API keys are encrypted and never exposed.
  • Blazing fast: Sub-second query responses (vector search and retrieval).
  • Modern UI: Built with React, TailwindCSS, and Three.js (for 3D hero effect).
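To make the semantic search and per-user isolation features concrete, here is a hedged sketch using `langchain-openai` and `langchain-pinecone`. The index name, namespace convention, and metadata fields are illustrative assumptions, not the actual gitRAG schema:

```python
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

# Requires OPENAI_API_KEY and PINECONE_API_KEY in the environment.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# One Pinecone namespace per (user, repo) keeps each tenant's vectors isolated.
store = PineconeVectorStore(
    index_name="gitrag-code-chunks",           # assumed index name
    embedding=embeddings,
    namespace="user-42/ATOMworkplace-gitRAG",  # assumed namespace scheme
)

# Query by meaning: the question does not have to match identifiers verbatim.
for doc in store.similarity_search("where is the OAuth2 callback handled?", k=5):
    print(doc.metadata.get("path"), doc.page_content[:80])
```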

🛠️ Tech Stack

  • Frontend: React.js, TailwindCSS, Vite, Three.js
  • Backend: FastAPI (Python), LangChain, PostgreSQL (Aiven), PineconeDB
  • AI/Vector Search: OpenAI API, PineconeDB, LangChain
  • Auth: Google OAuth2, GitHub OAuth2
  • Integrations: GitHub API (repo fetching, metadata), Node.js (utility scripts)

📷 Demo

(Screenshots)

⚡ How it Works (RAG Pipeline)

  1. Login with Google or GitHub OAuth2 (secure, per-user).
  2. Paste any public GitHub repo URL and your OpenAI API key (encrypted).
  3. Ingestion:
    • Fetches repo files via GitHub API
    • Chunks code using custom logic (by file type/size)
    • Generates vector embeddings (LangChain + OpenAI API)
    • Stores chunks and metadata in PineconeDB and PostgreSQL
  4. Analysis & Chat:
    • Use AI chat to ask any question about the repo (“What does X function do?” “Show me auth logic”)
    • Semantic search finds and retrieves the most relevant code chunks
    • LLM (via LangChain) generates contextual, accurate answers from the retrieved code (see the end-to-end sketch after this list)
  5. Explore:
    • Hierarchical explorer shows real file tree, lets you preview content and metadata
    • Repo analytics panel for high-level insights
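The sketch below walks through steps 3–4 end to end with LangChain, OpenAI, and Pinecone. The chunk sizes, index name, and prompt are illustrative assumptions; the real ingestion uses custom per-file-type chunking as described above:

```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=150)

def ingest(files: dict[str, str], namespace: str) -> PineconeVectorStore:
    """`files` maps repo paths to file contents fetched via the GitHub API."""
    docs = [
        chunk
        for path, text in files.items()
        for chunk in splitter.create_documents([text], metadatas=[{"path": path}])
    ]
    # Embed every chunk and upsert it into the per-user/per-repo namespace.
    return PineconeVectorStore.from_documents(
        docs, embeddings, index_name="gitrag-code-chunks", namespace=namespace
    )

def ask(store: PineconeVectorStore, question: str) -> str:
    """Retrieve the most relevant chunks, then let the LLM answer from them."""
    context = "\n\n".join(
        f"# {doc.metadata['path']}\n{doc.page_content}"
        for doc in store.similarity_search(question, k=5)
    )
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    prompt = (
        "Answer the question using only the code excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return llm.invoke(prompt).content
```

A call like `ask(ingest(files, "user-42/gitRAG"), "How is authentication handled?")` would then return an answer grounded in the retrieved chunks.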

🧩 Architecture

(Architecture diagram)
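For a sense of how the FastAPI backend might expose this pipeline, here is a hedged sketch; the route path, request/response models, and stubbed handler are assumptions, not the repository's actual API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    repo_id: str    # which ingested repository to query
    question: str   # natural-language question about the code

class ChatResponse(BaseModel):
    answer: str

@app.post("/api/chat", response_model=ChatResponse)
def chat(req: ChatRequest) -> ChatResponse:
    # A real handler would resolve the caller's Pinecone namespace from the
    # OAuth2 session, decrypt their stored OpenAI key, and run the retrieval
    # pipeline sketched under "How it Works"; this stub only echoes the input.
    return ChatResponse(answer=f"(stub) would answer {req.question!r} for {req.repo_id}")
```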

✨ Example Use Cases

  • Hackathons/open-source events: Instantly understand any team repo or competition project.
  • University coursework: Quickly onboard and analyse group project submissions.
  • Personal learning: Explore popular open-source projects by chatting and searching their code.
  • Team code reviews: Get instant explanations and context for PRs and legacy code.
