Documentation Agent

RAG based application to download documentation from a website, create a vector store, and answer user queries with the help of the vector store. It employs advanced RAG techniques like Contextual RAG and Graph RAG.
It is built on open source libraries and free APIs, to be easily accessible to anyone at no cost.
It is built using llamaindex as the core framework.
Vector database - Chromadb for local, Pinecone for deployment. Can be easily changed.
Graph Databse - Neo4j is used as the graph database.
LLM - It is configured to use free llms from gemini / cerebras / groq etc. Also Ollama can be used for local usage.
Embeddings - It can be used with google embeddings (free and recommended) or any open source embedding from huggingface using fastembed library (local)

Simple Example

Quickly understand about a simple RAG pipeline: notebooks/simple_rag.ipynb

Installation

Prerequisites

Python 3.11 or higher
Node.js and npm (for frontend)

Option 1: Using uv (Recommended)

uv is a fast Python package manager and project manager. It's much faster than pip and handles virtual environments automatically.

1. Install uv

# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternative: using pip
pip install uv

2. Clone the repository

git clone https://github.com/Sparshsing/documentation_agent.git
cd documentation_agent

3. Install dependencies

# This creates a virtual environment and installs all dependencies
uv venv --python 3.11  # or whichever python version above 3.11 you prefer
uv sync
# optional - create detailed requirements file
uv pip compile pyproject.toml -o requirements.txt

4. Activate the virtual environment

# On Linux/macOS
source .venv/bin/activate

# On Windows (PowerShell)
.venv\Scripts\Activate.ps1

# On Windows (Command Prompt)
.venv\Scripts\activate.bat

Option 2: Using pip (Traditional method)

1. Clone the repository

git clone https://github.com/Sparshsing/documentation_agent.git
cd documentation_agent

2. Create and activate a virtual environment

# Create virtual environment
python -m venv .venv

# Activate the environment
# On Linux/macOS
source .venv/bin/activate

# On Windows (PowerShell)
.venv\Scripts\Activate.ps1

# On Windows (Command Prompt)
.venv\Scripts\activate.bat

3. Install dependencies

pip install -r requirements.txt

Running the Application

Create .env file in project directory

Add API Keys, and other env variables. refer env_example.txt

Create an index

Download the data in data directory. Use download_documentation.py to download from website or github. eg. run this in a .py file (may not work in jupyter notebook)

from download_documentation import download_docs_from_website, save_all_content_to_file

url = 'https://ai.google.dev/api?lang=python'
download_path = 'data/google_genai/api'

asyncio.run(download_docs_from_website(url, download_path))
save_all_content_to_file(download_path, os.path.join(download_path, '__all_docs__.md'))

Edit core/create_vector_store.py to input the index name, data path, choose metadata extractor, change settings etc

python core/create_vector_store.py

The index data is saved in processed_data/chromadb, other index files saved in processed_data/index_name directory.
Now you can Query the index by running backend and frontend.

Backend

# From the project root directory
python backend/run_server.py

Frontend

# From the project root directory
cd frontend
npm install  # First time only
npm run dev

Name		Name	Last commit message	Last commit date
Latest commit History 61 Commits
.github/workflows		.github/workflows
backend		backend
core		core
frontend		frontend
notebooks		notebooks
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
README_DEPLOY.md		README_DEPLOY.md
docker-compose.yml		docker-compose.yml
env_example.txt		env_example.txt
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Documentation Agent

Simple Example

Installation

Prerequisites

Option 1: Using uv (Recommended)

1. Install uv

2. Clone the repository

3. Install dependencies

4. Activate the virtual environment

Option 2: Using pip (Traditional method)

1. Clone the repository

2. Create and activate a virtual environment

3. Install dependencies

Running the Application

Create .env file in project directory

Create an index

Backend

Frontend

About

Uh oh!

Releases

Packages

Languages

Sparshsing/documentation_agent

Folders and files

Latest commit

History

Repository files navigation

Documentation Agent

Simple Example

Installation

Prerequisites

Option 1: Using uv (Recommended)

1. Install uv

2. Clone the repository

3. Install dependencies

4. Activate the virtual environment

Option 2: Using pip (Traditional method)

1. Clone the repository

2. Create and activate a virtual environment

3. Install dependencies

Running the Application

Create .env file in project directory

Create an index

Backend

Frontend

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages