A Retrieval-Augmented Generation (RAG) application for document-based question answering. Users upload documents, the system converts them into vector embeddings, and queries in natural language are answered by an LLM using the retrieved context.
- Architecture Overview
- Technology Stack
- Quick Start Guide
- Configuration
- Running the Application
- API Documentation
- Project Structure
- Development
- Troubleshooting
The application consists of three main components:
- FastAPI Backend - REST API server handling document ingestion, vector search, and chat streaming
- Streamlit Frontend - Web interface for document upload and interactive Q&A
- Inngest Workflows - Background job processing for async document ingestion
Data flow:
User Upload → Inngest Workflow → Document Parsing → Embedding Generation → Qdrant Storage
User Query → Embedding → Vector Search → Context Retrieval → LLM Response → Streaming Output
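The query path above depends on the vector search step. The following is a minimal sketch of that step only, assuming cosine similarity over in-memory vectors. In the application itself Qdrant performs the search; the function names and the toy 2-dimensional vectors are hypothetical and chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Return the texts of the k stored chunks most similar to the query.

    `stored` is a list of (text, vector) pairs, standing in for the
    embeddings that the ingestion flow would have written to Qdrant.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```

The retrieved texts would then be passed to the LLM as context for the response.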
| Category | Technology | Purpose |
|---|---|---|
| Runtime | Python 3.13+ | Application runtime |
| Backend | FastAPI, Uvicorn | REST API server |
| Frontend | Streamlit | Web interface |
| Embeddings | Sentence Transformers | Text vectorization (all-MiniLM-L6-v2) |
| LLM | Groq API | Chat completions (Llama, Mixtral, DeepSeek) |
| Vector DB | Qdrant | Semantic search storage |
| Workflows | Inngest | Background job processing |
| Package Manager | uv | Fast dependency management |
Choose your operating system and follow the step-by-step instructions.
Linux: Install Python 3.13
Ubuntu/Debian:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.13 python3.13-venv python3.13-dev
Fedora:
sudo dnf install python3.13
Arch Linux:
sudo pacman -S python
Verify installation:
python3.13 --version
Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
Restart your terminal or run:
source ~/.bashrc # or ~/.zshrc if using zsh
Verify:
uv --version
Install Docker:
Ubuntu/Debian:
# Remove old versions
sudo apt remove docker docker-engine docker.io containerd runc
# Install dependencies
sudo apt update
sudo apt install ca-certificates curl gnupg
# Add Docker repository
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Run Docker without sudo (optional, requires logout/login)
sudo usermod -aG docker $USER
Fedora:
sudo dnf install docker docker-compose-plugin
sudo systemctl start docker
sudo systemctl enable docker
Verify:
docker --version
Install Node.js (Ubuntu/Debian):
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install nodejs
Verify:
node --version
npm --version
Get a Groq API key:
- Go to console.groq.com
- Create an account or log in
- Navigate to "API Keys" in the left sidebar
- Click "Create API Key"
- Copy the generated key (starts with gsk_)
# Clone repository
git clone https://github.com/SebastianCielma/RAG.git
cd RAG
# Install dependencies
uv sync
# Create environment file
cat > .env << EOF
GROQ_API_KEY=gsk_your_api_key_here
EOF
Replace gsk_your_api_key_here with your actual Groq API key.
Open 4 terminal windows/tabs in the RAG directory:
Terminal 1 - Start Qdrant:
docker run -d --name qdrant \
-p 6333:6333 \
-p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage \
qdrant/qdrant
Terminal 2 - Start Inngest:
npx inngest-cli@latest dev
Terminal 3 - Start Backend:
uv run uvicorn rag.main:app --reload --host 0.0.0.0 --port 8000
Terminal 4 - Start Frontend:
uv run streamlit run frontend/app.py --server.port 8501
Open your browser at: http://localhost:8501
macOS: Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Follow the instructions to add Homebrew to your PATH.
Install Python 3.13:
brew install python@3.13
Verify:
python3.13 --version
Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
Restart your terminal or run:
source ~/.zshrc # or ~/.bashrc
Verify:
uv --version
Install Docker Desktop:
- Download Docker Desktop from docker.com/products/docker-desktop
- Open the downloaded .dmg file
- Drag Docker to the Applications folder
- Open Docker from Applications
- Wait for Docker to start (whale icon in menu bar becomes stable)
Verify:
docker --version
Install Node.js:
brew install node
Verify:
node --version
npm --version
Get a Groq API key:
- Go to console.groq.com
- Create an account or log in
- Navigate to "API Keys" in the left sidebar
- Click "Create API Key"
- Copy the generated key (starts with gsk_)
# Clone repository
git clone https://github.com/SebastianCielma/RAG.git
cd RAG
# Install dependencies
uv sync
# Create environment file
cat > .env << EOF
GROQ_API_KEY=gsk_your_api_key_here
EOF
Replace gsk_your_api_key_here with your actual Groq API key.
Open 4 terminal windows/tabs (Cmd+T) in the RAG directory:
Terminal 1 - Start Qdrant:
docker run -d --name qdrant \
-p 6333:6333 \
-p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage \
qdrant/qdrant
Terminal 2 - Start Inngest:
npx inngest-cli@latest dev
Terminal 3 - Start Backend:
uv run uvicorn rag.main:app --reload --host 0.0.0.0 --port 8000
Terminal 4 - Start Frontend:
uv run streamlit run frontend/app.py --server.port 8501
Open your browser at: http://localhost:8501
Windows: Install Python 3.13
- Go to python.org/downloads
- Download Python 3.13.x installer
- Run the installer
- Check "Add Python to PATH" at the bottom of the installer window
- Click "Install Now"
Verify in PowerShell or Command Prompt:
python --version
Install uv. Open PowerShell as Administrator and run:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Close and reopen PowerShell.
Verify:
uv --version
Install Docker Desktop:
- Download Docker Desktop from docker.com/products/docker-desktop
- Run the installer
- Enable WSL 2 backend when prompted (recommended)
- Restart your computer when prompted
- After restart, open Docker Desktop and wait for it to start
Verify in PowerShell:
docker --version
Note: If Docker asks to enable WSL 2:
- Open PowerShell as Administrator
- Run: wsl --install
- Restart your computer
- Open Docker Desktop again
Install Node.js:
- Go to nodejs.org
- Download LTS version installer
- Run the installer with default options
Verify in PowerShell:
node --version
npm --version
Install Git:
- Go to git-scm.com/download/win
- Download and run installer
- Use default options
Get a Groq API key:
- Go to console.groq.com
- Create an account or log in
- Navigate to "API Keys" in the left sidebar
- Click "Create API Key"
- Copy the generated key (starts with gsk_)
Open PowerShell and run:
# Clone repository
git clone https://github.com/SebastianCielma/RAG.git
cd RAG
# Install dependencies
uv sync
# Create environment file
@"
GROQ_API_KEY=gsk_your_api_key_here
"@ | Out-File -FilePath .env -Encoding utf8
Open the .env file and replace gsk_your_api_key_here with your actual Groq API key.
Open 4 PowerShell windows, navigate to RAG directory in each (cd path\to\RAG):
PowerShell 1 - Start Qdrant:
docker run -d --name qdrant -p 6333:6333 -p 6334:6334 -v ${PWD}/qdrant_storage:/qdrant/storage qdrant/qdrant
PowerShell 2 - Start Inngest:
npx inngest-cli@latest dev
PowerShell 3 - Start Backend:
uv run uvicorn rag.main:app --reload --host 0.0.0.0 --port 8000
PowerShell 4 - Start Frontend:
uv run streamlit run frontend/app.py --server.port 8501
Open your browser at: http://localhost:8501
All configuration is done through environment variables in the .env file:
| Variable | Required | Default | Description |
|---|---|---|---|
| GROQ_API_KEY | Yes | - | API key from console.groq.com |
| QDRANT_URL | No | http://localhost:6333 | Qdrant server URL |
| QDRANT_COLLECTION | No | docs | Collection name |
| EMBED_MODEL | No | all-MiniLM-L6-v2 | Sentence Transformers model |
| CHUNK_SIZE | No | 1000 | Text chunk size |
| CHUNK_OVERLAP | No | 200 | Overlap between chunks |
| LLM_TEMPERATURE | No | 0.2 | LLM response randomness |
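A minimal sketch of how these variables and their defaults could be read, and of how CHUNK_SIZE and CHUNK_OVERLAP interact. `load_settings` and `chunk_text` are hypothetical helpers, not the project's code, and the sketch assumes chunk sizes are measured in characters, which the table does not specify.

```python
import os

# Defaults copied from the configuration table; values are strings because
# that is how environment variables arrive.
DEFAULTS = {
    "QDRANT_URL": "http://localhost:6333",
    "QDRANT_COLLECTION": "docs",
    "EMBED_MODEL": "all-MiniLM-L6-v2",
    "CHUNK_SIZE": "1000",
    "CHUNK_OVERLAP": "200",
    "LLM_TEMPERATURE": "0.2",
}


def load_settings(env=None):
    """Merge environment values over the defaults and convert numeric fields."""
    env = os.environ if env is None else env
    if not env.get("GROQ_API_KEY"):
        raise ValueError("GROQ_API_KEY is required")
    merged = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    settings = {
        "GROQ_API_KEY": env["GROQ_API_KEY"],
        "QDRANT_URL": merged["QDRANT_URL"],
        "QDRANT_COLLECTION": merged["QDRANT_COLLECTION"],
        "EMBED_MODEL": merged["EMBED_MODEL"],
        "CHUNK_SIZE": int(merged["CHUNK_SIZE"]),
        "CHUNK_OVERLAP": int(merged["CHUNK_OVERLAP"]),
        "LLM_TEMPERATURE": float(merged["LLM_TEMPERATURE"]),
    }
    if not 0 <= settings["CHUNK_OVERLAP"] < settings["CHUNK_SIZE"]:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return settings


def chunk_text(text, size, overlap):
    """Split text into windows of `size` characters sharing `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks
```

With the defaults, each new chunk would start 800 characters after the previous one, so neighbouring chunks share 200 characters of context.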
After completing the Quick Start for your OS, you need to run 4 services:
| Service | Port | Command | Purpose |
|---|---|---|---|
| Qdrant | 6333 | docker run ... | Vector database |
| Inngest | 8288 | npx inngest-cli@latest dev | Workflow orchestration |
| Backend | 8000 | uv run uvicorn rag.main:app ... | REST API |
| Frontend | 8501 | uv run streamlit run frontend/app.py ... | Web interface |
Start order matters: Qdrant → Inngest → Backend → Frontend
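Because the start order matters, it can help to check which service is still missing before starting the next one. The sketch below assumes each service accepts TCP connections on the port listed in the table once it is up; `is_port_open` and `first_unreachable` are hypothetical helpers, not part of the project.

```python
import socket

# Service names and ports from the table above, listed in start order.
SERVICES = [
    ("Qdrant", 6333),
    ("Inngest", 8288),
    ("Backend", 8000),
    ("Frontend", 8501),
]


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_unreachable(services=SERVICES, host="127.0.0.1", probe=is_port_open):
    """Return the name of the first service in start order that is not
    reachable, or None if every service answers."""
    for name, port in services:
        if not probe(host, port):
            return name
    return None
```

Calling `first_unreachable()` returns, for example, `"Inngest"` when Qdrant is reachable but nothing is listening on port 8288. An open port only shows that something is listening there, not that the service is healthy.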
- Press Ctrl+C in each terminal to stop services
- Stop Qdrant container: docker stop qdrant
- Remove Qdrant container (optional): docker rm qdrant
# Linux/macOS
docker start qdrant # Terminal 1
npx inngest-cli@latest dev # Terminal 2
uv run uvicorn rag.main:app --reload --port 8000 # Terminal 3
uv run streamlit run frontend/app.py --server.port 8501 # Terminal 4
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /api/chat | Streaming chat with RAG |
Interactive API docs available at: http://localhost:8000/docs
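As a quick check that the backend is up, the /health endpoint can be called from a script. This is a minimal sketch using only the standard library. It assumes the endpoint answers with HTTP 200 when healthy and does not inspect the response body, whose format is not documented here. `check_health` is a hypothetical helper.

```python
import http.client
import urllib.error
import urllib.request


def check_health(base_url="http://localhost:8000", timeout=3.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx status, or a malformed reply.
        return False
```

The request schema for /api/chat is best taken from the interactive docs rather than guessed.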
| Model | Speed | Quality | Best For |
|---|---|---|---|
| Llama 3.3 70B | Medium | Highest | Complex analysis |
| Llama 3.1 8B | Fast | Good | Quick queries |
| Mixtral 8x7B | Medium | High | Balanced use |
| DeepSeek R1 70B | Slow | Highest | Reasoning tasks |
| Qwen QWQ 32B | Medium | High | Reasoning |
RAG/
├── src/rag/ # Backend application
│ ├── main.py # FastAPI entry point
│ ├── core/ # Config, exceptions
│ ├── models/ # Pydantic schemas
│ ├── services/ # Business logic
│ ├── db/ # Qdrant client
│ └── workflows/ # Inngest functions
├── frontend/ # Streamlit UI
│ └── app.py
├── tests/ # Test suite
├── pyproject.toml # Project config
└── .env # Environment variables
Install development dependencies:
uv sync --group dev
Run tests:
uv run pytest
Lint and format:
uv run ruff check . # Check for issues
uv run ruff check . --fix # Auto-fix issues
uv run ruff format . # Format code
Type check:
uv run mypy src/
If Docker is not running:
- Linux: sudo systemctl start docker
- macOS/Windows: Open Docker Desktop application
Qdrant is not running. Start it with:
docker start qdrant
# or if container doesn't exist:
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
Dependencies not installed. Run:
uv sync
If Inngest functions do not appear:
- Make sure Inngest CLI is running (npx inngest-cli@latest dev)
- Make sure Backend is running
- Check http://127.0.0.1:8288 - functions should appear under "Apps"
The embedding model (~100MB) downloads on first run. This is normal and cached for subsequent runs.
If the Groq API key is rejected:
- Check that the .env file has the correct key
- Key should start with gsk_
- No quotes around the key
- Verify at console.groq.com
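The local checks in this list can be automated. The sketch below applies them to the text of a .env file. `diagnose_groq_key` is a hypothetical helper, and it cannot tell whether the key is actually valid, which only console.groq.com can confirm.

```python
def diagnose_groq_key(env_text):
    """Return a list of problems found with GROQ_API_KEY in .env file text.

    An empty list means the local checks passed.
    """
    value = None
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if line.startswith("GROQ_API_KEY="):
            value = line.split("=", 1)[1].strip()
            break
    if value is None:
        return ["GROQ_API_KEY is missing from .env"]

    problems = []
    if value.startswith(('"', "'")) or value.endswith(('"', "'")):
        problems.append("remove the quotes around the key")
    bare = value.strip("\"'")
    if not bare.startswith("gsk_"):
        problems.append("key should start with gsk_")
    if bare == "gsk_your_api_key_here":
        problems.append("placeholder key has not been replaced")
    return problems
```

For example, `diagnose_groq_key(open(".env").read())` could be run from the RAG directory before starting the backend.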
Change port in the startup command:
- Backend: --port 8001 instead of --port 8000
- Frontend: --server.port 8502 instead of --server.port 8501
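To pick a replacement port without trial and error, one option is to test whether a port can be bound locally. This is a rough sketch with hypothetical helpers (`port_is_free`, `next_free_port`); it assumes that a failed bind on 127.0.0.1 means the port is taken, and another process could still claim the port between the check and the service start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def next_free_port(start, host="127.0.0.1", max_tries=20):
    """Return the first free port in [start, start + max_tries)."""
    for port in range(start, start + max_tries):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + max_tries - 1}")
```

`next_free_port(8000)` would give a value to pass to `--port`, and `next_free_port(8501)` one for `--server.port`.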
License: MIT