Retrieval-Augmented Generation (RAG) System

A Retrieval-Augmented Generation (RAG) chatbot built in Python using FAISS for vector similarity search and Ollama for embeddings and LLM inference.


Project Overview

Retrieval-Augmented Generation (RAG) combines classical information retrieval with large language models. Instead of relying solely on the LLM’s internal knowledge, the system retrieves relevant chunks from an external corpus and injects them as context for generation.
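
The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not the project's code: `embed` here is a hypothetical bag-of-words stand-in for a real embedding model, `retrieve` and `build_prompt` are illustrative names, and the three sample chunks are made up for the demo.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Inject the retrieved chunks into the prompt sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Made-up sample corpus for the demo.
chunks = [
    "Cats sleep for most of the day.",
    "A group of cats is called a clowder.",
    "Cats have five toes on their front paws.",
]
top = retrieve("what is a group of cats called", chunks, k=1)
prompt = build_prompt("What is a group of cats called?", top)
```

In a real system the word-count vectors would be replaced by dense embeddings from a model, and the sorted scan by a vector index, but the shape of the pipeline (embed, rank, inject, generate) stays the same.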

This project includes:

  • Dataset ingestion and text chunking
  • Embedding generation via Ollama
  • Vector indexing and similarity search using FAISS
  • Persistent FAISS index with manifest-based validation
  • Interactive terminal-based chatbot
  • Unit tests for each module
  • Continuous Integration (CI) workflow
  • Pre-commit hooks
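
This README does not spell out how the manifest-based validation works. One plausible reading, offered purely as an assumption: a small JSON manifest is stored next to the saved index, recording what the index was built from, and the index is reused only if the manifest still matches the current inputs. The function names and manifest fields below are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(dataset_path: Path, embedding_model: str) -> dict:
    """Describe the inputs an index was built from (hypothetical fields)."""
    digest = hashlib.sha256(dataset_path.read_bytes()).hexdigest()
    return {"dataset_sha256": digest, "embedding_model": embedding_model}


def index_is_valid(
    manifest_path: Path, dataset_path: Path, embedding_model: str
) -> bool:
    """True only if a stored manifest exists and matches the current inputs."""
    if not manifest_path.exists():
        return False
    stored = json.loads(manifest_path.read_text())
    return stored == build_manifest(dataset_path, embedding_model)


# Demo in a temporary directory with placeholder file and model names.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "facts.txt"
    manifest = Path(tmp) / "manifest.json"
    dataset.write_text("Cats purr.\n")

    missing = index_is_valid(manifest, dataset, "model-a")  # no manifest yet
    manifest.write_text(json.dumps(build_manifest(dataset, "model-a")))
    fresh = index_is_valid(manifest, dataset, "model-a")  # inputs unchanged
    dataset.write_text("Cats purr.\nCats meow.\n")
    stale = index_is_valid(manifest, dataset, "model-a")  # dataset changed
    other_model = index_is_valid(manifest, dataset, "model-b")  # model changed
```

A check like this avoids re-embedding the whole corpus on every start while still rebuilding the index when the dataset or the embedding model changes.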

Technology Stack

  • FAISS — vector similarity search
  • Ollama — embeddings and LLM inference
  • UV — dependency and environment management
  • Pytest — testing
  • Ruff — linting and formatting
  • Pre-commit — local enforcement of quality checks

Quickstart

1. Clone the repository

git clone https://github.com/peterschenk01/rag-system.git
cd rag-system

2. Install dependencies (UV)

uv sync

3. Set up Ollama

Install Ollama: https://ollama.com/download

Pull the models used by the system:

ollama pull hf.co/CompendiumLabs/bge-base-en-v1.5-gguf
ollama pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
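
Once the models are pulled, embeddings can be requested from the local Ollama server. The sketch below uses only the standard library against Ollama's `/api/embed` HTTP endpoint on its default port. It is an illustration under assumptions: the project itself may use a different client, and `build_embed_request` and `embed` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address
EMBED_MODEL = "hf.co/CompendiumLabs/bge-base-en-v1.5-gguf"  # pulled above


def build_embed_request(
    texts: list[str], model: str = EMBED_MODEL
) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embed endpoint."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts: list[str]) -> list[list[float]]:
    """Send the request; needs a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_embed_request(texts)) as resp:
        return json.load(resp)["embeddings"]


# Building the request does not contact the server.
request = build_embed_request(["Cats purr."])
```

Calling `embed(["Cats purr."])` returns one vector per input text, provided the Ollama server is running locally.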

4. Download the dataset

Download the example dataset (cat facts):

mkdir -p data
curl -L -o data/cat-facts.txt https://huggingface.co/ngxson/demo_simple_rag_py/resolve/main/cat-facts.txt
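
For a dataset like this, chunking can be as simple as treating each non-empty line as one chunk. The sketch below assumes the file holds one fact per line; `load_chunks` is a hypothetical helper, and the project's own chunking may differ.

```python
import tempfile
from pathlib import Path


def load_chunks(path: Path) -> list[str]:
    """Read a text file and return one chunk per non-empty line."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Demo with a temporary file; in the project the path would be
# data/cat-facts.txt.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("First fact.\n\n  Second fact.  \n", encoding="utf-8")
    chunks = load_chunks(sample)
```

Line-based chunks keep each fact self-contained, which suits short, independent sentences. Longer documents would usually need fixed-size or overlapping chunks instead.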

5. Run the chatbot

uv run rag-system

Development

Install development dependencies

uv sync --dev

Pre-commit hooks

uv run pre-commit install

Pre-commit runs formatting, linting, and other checks automatically before commits.


Linting & formatting (Ruff)

uv run ruff check .
uv run ruff format .

Running tests

uv run pytest

Continuous Integration

A CI workflow is included to ensure:

  • Tests pass
  • Ruff linting succeeds
  • Code quality matches local pre-commit checks

License

This project is licensed under the MIT License.
