Retrieval-Augmented Generation backend for compliance teams. Upload PDF evidence, embed it into a Supabase-backed vector store, audit responses with OpenAI (standard and streaming), and track gaps, sessions, and reports in one FastAPI service. Security features include hardened uploads, rate limiting, idempotency, and detailed audit logging.
- Compliance-ready RAG pipeline: ingest policies, procedures, and evidence; question them in seconds.
- Security-first ingestion: ClamAV scanning, strict validation, idempotent uploads, and per-IP/user throttling.
- Managed Supabase vector store with metadata filters, compliance domains, and audit session scoping.
- Conversation-aware Q&A with streaming support, persisted history, and audit session linkage.
- Role-based access (JWT) plus comprehensive audit logs, compliance gaps, and executive summaries out of the box.
- FastAPI + Uvicorn for the ASGI web layer with rich OpenAPI docs.
- LangChain + OpenAI for embeddings, retrieval orchestration, and answer generation.
- Supabase (Postgres + pgvector) accessed via supabase-py/PostgREST for documents and metadata.
- SQLModel + Pydantic for strongly typed entities, schemas, and validation.
- SlowAPI, custom middleware, and structured logging for rate limiting, observability, and security hardening.
- Optional ClamAV (`clamd`) and PikePDF for file scanning and PDF introspection.
- Python 3.13+ (virtual environment recommended).
- Supabase project (or Postgres with pgvector enabled) containing the tables below.
- OpenAI API key with access to the chosen chat and embedding models.
- Optional: ClamAV daemon reachable via `clamd` for malware scanning during ingestion.
Create a `.env` file in the repository root. Defaults in `config/config.py` apply when a variable is omitted.
```env
# Supabase API
SUPABASE_URL=your_supabase_url
SUPABASE_KEY=your_supabase_service_role_key
SUPABASE_TABLE_DOCUMENTS=documents
SUPABASE_TABLE_CHAT_HISTORY=chat_history
SUPABASE_TABLE_PDF_INGESTION=pdf_ingestion
SUPABASE_TABLE_COMPLIANCE_DOMAINS=compliance_domains
SUPABASE_TABLE_COMPLIANCE_GAPS=compliance_gaps
SUPABASE_TABLE_AUDIT_SESSIONS=audit_sessions
SUPABASE_TABLE_USERS=users
SUPABASE_TABLE_AUDIT_REPORTS=audit_reports
SUPABASE_TABLE_AUDIT_REPORT_VERSIONS=audit_report_versions
SUPABASE_TABLE_AUDIT_REPORT_DISTRIBUTIONS=audit_report_distributions
SUPABASE_TABLE_AUDIT_SESSION_PDF_INGESTIONS=audit_session_pdf_ingestions
SUPABASE_TABLE_AUDIT_LOG=audit_log
SUPABASE_TABLE_ISO_CONTROLS=iso_controls

# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-3.5-turbo
EMBEDDING_MODEL=text-embedding-ada-002

# Authentication / Roles
JWT_SECRET_KEY=your_jwt_secret
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30
REFRESH_TOKEN_EXPIRE_DAYS=7
VALID_USER_ROLES=admin,compliance_officer,reader  # optional override
DEFAULT_USER_ROLE=reader

# RAG / Ingestion
TOP_K=5
PDF_DIR=pdfs/
REPORTS_DIR=reports/
PDF_QUARANTINE_DIR=/tmp/pdf_quarantine
RATE_LIMIT_ENABLED=true
RATE_LIMIT_STORAGE_URI=redis://localhost:6379/0  # optional if using shared limiter backend

# Optional hardening
# RATE_LIMIT_ENABLED=false  # disable SlowAPI limiter (not recommended)
# PDF_QUARANTINE_DIR=/secure/quarantine
# Add other feature flags as needed.
```

Enable pgvector (and pgcrypto for UUID generation), then create the core tables. Adjust names to match your environment and keep vector dimensions aligned with the embedding model (e.g., 1536 for `text-embedding-ada-002`).
```sql
-- Extensions
create extension if not exists vector;
create extension if not exists pgcrypto;

-- Vectorized document chunks
create table if not exists public.documents (
  id uuid primary key default gen_random_uuid(),
  content text not null,
  embedding vector(1536) not null,
  compliance_domain text,
  document_version text,
  document_tags text[] default array[]::text[],
  approval_status text,
  source_filename text,
  source_page_number integer,
  chunk_index integer,
  uploaded_by uuid,
  approved_by uuid,
  metadata jsonb not null default '{}'::jsonb,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);
create index if not exists documents_embedding_idx on public.documents using ivfflat (embedding vector_cosine_ops);
create index if not exists documents_domain_idx on public.documents (compliance_domain);

-- Chat history with conversation linkage
create table if not exists public.chat_history (
  id bigserial primary key,
  conversation_id uuid not null,
  question text not null,
  answer text not null,
  audit_session_id uuid,
  compliance_domain text,
  source_document_ids uuid[] default array[]::uuid[],
  match_threshold numeric(5,4),
  match_count integer,
  user_id uuid,
  total_tokens_used integer,
  response_time_ms integer,
  metadata jsonb not null default '{}'::jsonb,
  created_at timestamptz not null default now()
);
create index if not exists chat_history_conversation_idx on public.chat_history (conversation_id, created_at desc);

-- PDF ingestion metadata
create table if not exists public.pdf_ingestion (
  id uuid primary key default gen_random_uuid(),
  filename text not null,
  metadata jsonb not null default '{}'::jsonb,
  ingested_at timestamptz not null default now()
);

-- Compliance catalog tables (minimal columns shown; extend as needed)
create table if not exists public.compliance_domains (
  id uuid primary key default gen_random_uuid(),
  name text not null unique,
  description text,
  created_at timestamptz not null default now()
);

create table if not exists public.audit_sessions (
  id uuid primary key default gen_random_uuid(),
  user_id uuid not null,
  session_name text not null,
  compliance_domain text not null,
  is_active boolean not null default true,
  total_queries integer not null default 0,
  session_summary text,
  audit_report text,
  started_at timestamptz not null default now(),
  ended_at timestamptz,
  ip_address text,
  user_agent text,
  metadata jsonb not null default '{}'::jsonb,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

create table if not exists public.compliance_gaps (
  id uuid primary key default gen_random_uuid(),
  audit_session_id uuid not null references public.audit_sessions(id) on delete cascade,
  user_id uuid not null,
  gap_type text not null,
  gap_category text,
  gap_title text not null,
  gap_description text,
  original_question text,
  chat_history_id uuid,
  pdf_ingestion_id uuid,
  detection_method text,
  confidence_score numeric(5,4),
  risk_level text,
  business_impact text,
  status text not null default 'identified',
  assigned_to uuid,
  due_date timestamptz,
  recommendation_text text,
  recommended_actions text[] default array[]::text[],
  related_documents text[] default array[]::text[],
  detected_at timestamptz not null default now(),
  resolution_notes text,
  auto_generated boolean not null default true,
  ip_address text,
  user_agent text,
  session_context jsonb not null default '{}'::jsonb,
  metadata jsonb not null default '{}'::jsonb
);

create table if not exists public.audit_log (
  id uuid primary key default gen_random_uuid(),
  object_type text not null,
  object_id text not null,
  action text not null,
  user_id uuid not null,
  audit_session_id uuid,
  compliance_domain text,
  performed_at timestamptz not null default now(),
  ip_address text,
  user_agent text,
  details jsonb not null default '{}'::jsonb,
  risk_level text,
  tags text[] default array[]::text[]
);

create table if not exists public.iso_controls (
  id uuid primary key default gen_random_uuid(),
  control_id text not null,
  title text not null,
  description text,
  metadata jsonb not null default '{}'::jsonb,
  created_at timestamptz not null default now()
);
```

Dimension note: if you swap embeddings (e.g., move to `text-embedding-3-large` with 3072 dims), update `vector(<dims>)`, rebuild `documents_embedding_idx`, and re-embed stored vectors.
- Create a virtual environment: `python3 -m venv .venv && source .venv/bin/activate`.
- Install the project (and dependencies): `pip install -e .` for runtime, `pip install -e ".[dev]"` for tooling.
- Populate `.env` and ensure the Supabase/Postgres schema exists.
- Start the API with `make run` or `uvicorn app:app --reload`.
- Visit `http://localhost:8000/docs` for interactive OpenAPI documentation.
- `POST /v1/auth/signup` / `POST /v1/auth/login` — user registration, login, and token refresh.
- `POST /v1/ingestions/upload` — secure PDF upload with malware scanning, metadata, and embedding ingestion.
- `GET /v1/documents` — paginated document listing with rich filtering and tag helpers.
- `POST /v1/rag/query` — non-streaming compliance Q&A with audit session and domain filters.
- `POST /v1/rag/query-stream` — streaming Q&A; response headers include `x-conversation-id` for chat continuity.
- `GET /v1/history/{conversation_id}` — retrieve prior Q&A turns for a conversation.
- `GET /v1/compliance-gaps`, `POST /v1/compliance-gaps` — manage detected compliance gaps and recommendations.
- `GET /v1/audit-sessions` / `POST /v1/audit-sessions` — create and monitor audit sessions and document links.
- `GET /v1/executive-summary`, `/v1/threat-intelligence`, `/v1/risk-prioritization`, `/v1/target-audience` — AI-generated compliance summaries and insights for stakeholders.
Upload a PDF (requires a JWT bearer token and an `Idempotency-Key` header):

```bash
curl -X POST http://localhost:8000/v1/ingestions/upload \
  -H "Authorization: Bearer $TOKEN" \
  -H "Idempotency-Key: $(uuidgen)" \
  -F "file=@/path/to/policy.pdf" \
  -F "compliance_domain=ISO27001" \
  -F "document_tags=reference_document,current"
```

Ask a compliance question (non-streaming):
```bash
curl -X POST http://localhost:8000/v1/rag/query \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
    "question": "What controls cover incident response?",
    "compliance_domain": "ISO27001",
    "match_count": 5
  }'
```

Ask a question with streaming output (capture the `x-conversation-id` response header):
```bash
curl -N -i -X POST http://localhost:8000/v1/rag/query-stream \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "question": "Summarize open compliance gaps",
    "conversation_id": null,
    "match_threshold": 0.7
  }'
```
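Because the `Idempotency-Key` is generated client-side, a retry with the same key must not duplicate work. A minimal sketch of the server-side semantics, using a hypothetical in-memory cache (the service's actual implementation and storage backend are not shown in this README):

```python
import uuid

# Hypothetical cache; a real service would use durable storage with a TTL.
_idempotency_cache: dict[str, dict] = {}

def handle_upload(idempotency_key: str, filename: str) -> dict:
    """Return the cached result for a repeated key instead of re-ingesting."""
    if idempotency_key in _idempotency_cache:
        return _idempotency_cache[idempotency_key]
    result = {"ingestion_id": str(uuid.uuid4()), "filename": filename}
    _idempotency_cache[idempotency_key] = result
    return result

key = str(uuid.uuid4())
first = handle_upload(key, "policy.pdf")
retry = handle_upload(key, "policy.pdf")  # same key, so same ingestion_id
```

This is why the curl examples call `uuidgen` per request: reuse a key only when you intend a retry of the same operation.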
```text
├── app.py            # FastAPI application, middleware, router wiring
├── api/              # Route definitions (auth, rag, ingestion, audit, compliance, etc.)
├── auth/             # JWT handling, decorators, and token schemas
├── common/           # Logging, exceptions, validation, responses, middleware
├── config/           # Pydantic settings, CORS configuration, tagging metadata
├── db/               # Supabase client factory
├── entities/         # Domain models used across services and repositories
├── repositories/     # Supabase-backed repositories for each resource
├── services/         # Business logic (RAG, ingestion, audit, summaries, risk, reports)
├── adapters/         # Integration helpers and external service adapters
├── tools/            # CLI utilities and maintenance scripts
├── Makefile          # Convenience commands (e.g., `make run`)
├── pyproject.toml    # Project metadata and dependency definitions
└── README.md
```
- CORS origins: update the allowed origins list in `config/cors.py` for your frontend domains.
- Chunking & embeddings: adjust chunk size/overlap and embedding settings in `services/ingestion_service.py` and `services/vector_store.py`.
- Model selection: override `OPENAI_MODEL` / `EMBEDDING_MODEL` via `.env`; ensure the embedding dimension matches the database schema.
- Retrieval hyperparameters: tweak `TOP_K`, `match_threshold`, `match_count`, and tag filters per use case.
- Table overrides: change Supabase table names via the corresponding `SUPABASE_TABLE_*` environment variables.
- Security policies: align Supabase RLS policies with the API's role model; a service-role key is recommended for server-side usage.
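Chunk size and overlap trade retrieval precision against context continuity across chunk boundaries. A simplified sketch of overlap-based chunking; the actual splitter in `services/ingestion_service.py` may differ (e.g., a LangChain recursive character splitter that respects sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks, each overlapping the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each chunk's start advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
    return chunks
```

Larger overlap means a fact near a boundary appears in two chunks (better recall), at the cost of more stored vectors and embedding tokens.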
- Add new endpoints: create routers under `api/` and include them in `app.py`; define request/response models in `services/schemas.py`.
- Business logic: extend services in `services/` and inject them via dependency providers in `dependencies.py`.
- Repositories: implement Supabase calls in `repositories/`; leverage the base repository for filtering and pagination.
- Migrations: adopt Alembic or Supabase migration tooling to evolve tables; track schema changes alongside application code.
- Testing: factor logic into services; write async-friendly tests with pytest, using mocked Supabase/OpenAI clients.
- Observability: enrich structured logs in `common/logging.py` or hook in tracing/metrics (e.g., OpenTelemetry) as needed.
- Supabase connectivity: run `services.db_check.check_database_connection()` (exposed via the health endpoint) to confirm API credentials and table names.
- Vector search mismatches: ensure stored embeddings and query embeddings share the same model/dimension; re-index after schema changes.
- Upload failures: verify ClamAV (`clamd`) is accessible, keep PDFs under 50 MB, and provide a valid `Idempotency-Key` header.
- Rate limit hits: SlowAPI enforces defaults (`200/minute` overall, `10/minute` on RAG); tune `RATE_LIMIT_*` env vars or provide Redis storage for clusters.
- RLS denials: service-role keys bypass RLS; otherwise add Supabase policies allowing the API role to read/write the required tables.
- Streaming stalls: check reverse proxies and ensure clients use `curl -N` or equivalent to keep the connection open.
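Rate limits like `10/minute` are easiest to reason about as a token bucket: each request spends a token, and tokens refill continuously. This sketch shows the mechanism only; SlowAPI's internals differ, and the numbers here are illustrative:

```python
import time

class TokenBucket:
    """Allow `capacity` requests per `period` seconds, refilled continuously."""

    def __init__(self, capacity: int, period: float):
        self.capacity = capacity
        self.refill_rate = capacity / period  # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller should return HTTP 429

# Roughly the "10/minute" default on the RAG endpoints
bucket = TokenBucket(capacity=10, period=60.0)
```

In a multi-worker deployment each process would hold its own bucket, which is why the troubleshooting note above suggests Redis-backed limiter storage for clusters.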