ClaraVector

Lightweight vector document storage and semantic search API. Designed for edge deployment (Raspberry Pi, small VPS) with NVIDIA NIM embeddings.

Features

Multi-user - Isolated user data with notebooks
Any Document - PDF, DOCX, PPTX, HTML, TXT, MD, CSV, JSON
Semantic Search - Query by meaning, not keywords
Rate-Limited Queue - Handles NIM's 40 RPM limit gracefully
Low Resource - Runs on 2GB RAM, ~2GB storage for 100 users

Quick Start

Docker (Recommended)

# Clone and configure
git clone https://github.com/claraverse-space/ClaraVector.git
cd ClaraVector
cp .env.example .env

# Add your NVIDIA NIM API key to .env
# Get one at: https://build.nvidia.com/

# Run
docker compose up -d

# Verify
curl http://localhost:8000/api/v1/health

Manual

python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your NIM API key
python scripts/init_db.py
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000

Usage

# 1. Create user
curl -X POST http://localhost:8000/api/v1/users \
  -H "Content-Type: application/json" \
  -d '{"user_id": "demo"}'

# 2. Create notebook
curl -X POST http://localhost:8000/api/v1/users/demo/notebooks \
  -H "Content-Type: application/json" \
  -d '{"name": "Research"}'
# Returns: {"notebook_id": "abc-123", ...}

# 3. Upload document
curl -X POST http://localhost:8000/api/v1/notebooks/abc-123/documents \
  -F "[email protected]"
# Returns: {"document_id": "xyz-789", "status": "pending", ...}

# 4. Check processing (poll until "completed")
curl http://localhost:8000/api/v1/documents/xyz-789/status

# 5. Search
curl -X POST http://localhost:8000/api/v1/notebooks/abc-123/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is machine learning?", "top_k": 5}'

Deploy

Railway / Render / Fly.io

# Uses included Dockerfile
# Set environment variable: NIM_API_KEY=your-key

See railway.json for Railway-specific config.

Raspberry Pi

# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER

# Run
docker compose up -d

Documentation

Doc	Description
API Reference	Complete endpoint documentation
Agent Integration	Guide for AI/LLM integration
Swagger UI	Interactive API explorer

Tech Stack

Component	Choice	Why
API	FastAPI	Async, lightweight
Vectors	LanceDB	Embedded, no server needed
Metadata	SQLite	Zero config
Embeddings	NVIDIA NIM	1024-dim, high quality
Parsing	PyMuPDF, python-docx	Robust extraction

Configuration

Environment variables (.env):

# Required
NIM_API_KEY=nvapi-xxx

# Optional (defaults shown)
NIM_BASE_URL=https://integrate.api.nvidia.com/v1
NIM_MODEL=nvidia/nv-embedqa-e5-v5
NIM_RPM_LIMIT=40
HOST=0.0.0.0
PORT=8000
MAX_FILE_SIZE_MB=10

# CORS - Configure for production!
CORS_ORIGINS=*                    # Or: https://myapp.com,https://admin.myapp.com
CORS_ALLOW_CREDENTIALS=true
CORS_ALLOW_METHODS=*              # Or: GET,POST,DELETE
CORS_ALLOW_HEADERS=*

CORS Examples

# Development (allow all)
CORS_ORIGINS=*

# Single frontend
CORS_ORIGINS=https://myapp.com

# Multiple origins
CORS_ORIGINS=https://myapp.com,https://admin.myapp.com,http://localhost:3000

# Restricted methods
CORS_ALLOW_METHODS=GET,POST,DELETE

Resource Usage

Users	Docs	Vectors	Storage
10	100	~3K	~200MB
100	2000	~60K	~2GB
500	10000	~300K	~8GB

Memory: ~500MB idle, ~1GB under load

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
app		app
docs		docs
scripts		scripts
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
docker-compose.yml		docker-compose.yml
fly.toml		fly.toml
nginx.conf		nginx.conf
railway.json		railway.json
render.yaml		render.yaml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

ClaraVector

Features

Quick Start

Docker (Recommended)

Manual

Usage

Deploy

Railway / Render / Fly.io

Raspberry Pi

Documentation

Tech Stack

Configuration

CORS Examples

Resource Usage

License

About

Uh oh!

Releases

Packages

Languages

claraverse-space/ClaraVector

Folders and files

Latest commit

History

Repository files navigation

ClaraVector

Features

Quick Start

Docker (Recommended)

Manual

Usage

Deploy

Railway / Render / Fly.io

Raspberry Pi

Documentation

Tech Stack

Configuration

CORS Examples

Resource Usage

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages