mattcurf/rag_play

rag_play

A hands-on sandbox to explore Retrieval-Augmented Generation (RAG): ingest, index, inspect, visualize, and query your documents.

Educational only: these scripts favor clarity over resiliency, security, and edge cases. They are not intended for production deployments.

Why this repo exists

RAG systems can feel opaque. This repo breaks the pipeline into small, runnable scripts so you can see exactly what happens at each step:

  • Ingest documents → split → embed → store in ChromaDB
  • Inspect what got stored (documents, metadatas, vectors)
  • Visualize embeddings with UMAP to build intuition
  • Query the store and view retrieved context for LLM prompting

What this is / isn’t

Is:

  • A learn-by-doing toolkit to demystify RAG mechanics
  • Small, focused Python scripts with clear flags and outputs
  • A starting point for your own experiments

Isn’t:

  • Production ready (no auth, no robust error handling, minimal tests)
  • Optimized for performance or cost
  • A full RAG framework or service

Quick start

Create a Python environment (recommended: venv)

python3 -m venv .venv
source .venv/bin/activate  
python -m pip install --upgrade pip

Install dependencies, models, and some sample data files

pip install -r requirements.txt
hf download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf --local-dir .
hf download nomic-ai/nomic-embed-text-v1.5-GGUF nomic-embed-text-v1.5.f16.gguf --local-dir .
wget -O the-us-constitution.pdf https://www.lexisnexis.com/supp/lawschool/resources/the-us-constitution.pdf
wget -O comprehensive-rust.pdf https://google.github.io/comprehensive-rust/comprehensive-rust.pdf
wget -O honda_2020_fit_manual.pdf https://techinfo.honda.com/rjanisis/pubs/OM/AH/AT5A2020OM/enu/AT5A2020OM.PDF

Usage

A) Ingest documents → Chroma

ingest_chroma.py

Splits input docs, computes embeddings, and upserts into a persistent Chroma collection.

python ingest_chroma.py --rebuild --db ./database/chroma_db1 --embed-model nomic-embed-text-v1.5.f16.gguf
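To build intuition for the "split" step, here is a minimal sketch of fixed-size chunking with overlap. It is not taken from ingest_chroma.py: the function names, the character-based windows, the default sizes, and the record layout (id, document, metadata) are all assumptions chosen for illustration.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def make_records(source: str, text: str, **split_kwargs) -> list[dict]:
    """Pair each chunk with a stable id and metadata (hypothetical layout)."""
    return [
        {
            "id": f"{source}:{i}",
            "document": chunk,
            "metadata": {"source": source, "chunk": i},
        }
        for i, chunk in enumerate(split_text(text, **split_kwargs))
    ]
```

Overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing some text twice. Each record would then be embedded and upserted into the collection.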

B) Inspect what’s in Chroma

inspect_chroma.py

View collections, counts, and sample items. The example below passes --include-embeddings to print the stored vectors as well; omit that flag to avoid pulling giant vectors.

python inspect_chroma.py --db ./database/chroma_db1 --limit 9999 --include-embeddings
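When inspecting stored items, it helps to summarize each record rather than dump it whole. The sketch below is a hypothetical helper, not the code in inspect_chroma.py; it assumes records shaped as dicts with "id", "document", "metadata", and an optional "embedding" list.

```python
from collections import Counter


def preview_record(record: dict, max_chars: int = 60, max_dims: int = 4) -> dict:
    """Return a compact view: truncated text plus only the head of the vector."""
    doc = record.get("document", "")
    out = {
        "id": record["id"],
        "metadata": record.get("metadata", {}),
        "document": doc if len(doc) <= max_chars else doc[:max_chars] + "...",
    }
    vec = record.get("embedding")
    if vec is not None:
        # Report the dimensionality and a few leading values, not the full vector.
        out["embedding_dim"] = len(vec)
        out["embedding_head"] = [round(x, 4) for x in vec[:max_dims]]
    return out


def count_by_key(records: list[dict], key: str = "source") -> Counter:
    """Count records per metadata value, e.g. chunks per source file."""
    return Counter(r.get("metadata", {}).get(key, "(none)") for r in records)
```

Counting chunks per source is a quick sanity check that every input file actually made it into the store.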

C) Visualize embeddings with UMAP

visualize_umap.py

Projects high-dimensional vectors to 2D with UMAP and saves a PNG. You can color points by a metadata key (e.g., source file).

python visualize_umap.py --db ./database/chroma_db1 --out plot.png --label-key source
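Coloring by a metadata key boils down to mapping each distinct label value to a color index. The following is a minimal sketch of that mapping only (the UMAP projection itself is left out); the function name and the "(none)" placeholder for missing keys are assumptions, not the behavior of visualize_umap.py.

```python
def color_indices(metadatas: list, label_key: str = "source", missing: str = "(none)"):
    """Assign each point an integer color index based on metadata[label_key]."""
    palette: dict[str, int] = {}  # label value -> color index, in first-seen order
    indices: list[int] = []
    for meta in metadatas:
        label = (meta or {}).get(label_key, missing)
        if label not in palette:
            palette[label] = len(palette)
        indices.append(palette[label])
    return indices, palette
```

The index list lines up with the rows of the 2D projection, so it can be passed as the `c=` argument to matplotlib's `scatter`, with the palette dict supplying legend labels.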

D) Query the store (simple RAG)

inference_rag.py

Retrieves top-K chunks and then asks an LLM to draft an answer.

python inference_rag.py --db ./database/chroma_db1 --llm-model ./Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf --embed-model ./nomic-embed-text-v1.5.f16.gguf --threads 24 --ctx 2048 --max-tokens 1024 --show-retrieved --use-reranker --reranker-top-n 4 --gpu-layers 99
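Conceptually, the retrieve-then-prompt step can be sketched in plain Python as below. This is an illustration under assumptions, not the script's implementation: it ranks by cosine similarity over an in-memory list (the real store and its distance metric may differ), skips the reranker, and uses a made-up prompt template.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec: list[float], records: list[dict], k: int = 4) -> list[tuple]:
    """Return the k (score, record) pairs most similar to the query vector."""
    scored = [(cosine(query_vec, r["embedding"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, hits: list[tuple]) -> str:
    """Number the retrieved chunks and place them ahead of the question."""
    context = "\n\n".join(
        f"[{i + 1}] {rec['document']}" for i, (_, rec) in enumerate(hits)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string is what would be handed to the LLM; printing it is roughly what a flag like --show-retrieved lets you examine.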
