A scalable document intelligence system with multi-tenant support, built with FastAPI, Kafka, PostgreSQL, and vector search capabilities.
- Multi-Tenant Architecture: Isolated tenant data and operations
- Document Processing: Upload, chunk, and embed documents
- Vector Search: Semantic search using embeddings (Gemini + Pinecone)
- Event-Driven: Kafka-based asynchronous processing
- Rate Limiting: Redis-based rate limiting per tenant
- Monitoring: Prometheus metrics and Grafana dashboards
- RESTful API: FastAPI with automatic documentation
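The document-processing and vector-search features above hinge on a chunk → embed → index → query flow: each uploaded document is split into chunks, each chunk is embedded, and the vectors are stored per tenant for semantic retrieval. The sketch below shows the general shape of that flow, assuming the `google-generativeai` and `pinecone` client libraries; the embedding model, index name, and per-tenant namespace scheme are illustrative assumptions, not this repo's actual implementation.

```python
# Sketch of the chunk -> embed -> upsert -> query flow.
# Assumptions: google-generativeai + pinecone clients, text-embedding-004,
# and one Pinecone namespace per tenant (the repo's real code may differ).
import os
import google.generativeai as genai
from pinecone import Pinecone

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index(os.environ["PINECONE_INDEX_NAME"])

def embed(text: str) -> list[float]:
    # Gemini embedding call; returns a dense vector for one chunk.
    return genai.embed_content(model="models/text-embedding-004", content=text)["embedding"]

def ingest(tenant_id: str, doc_id: str, chunks: list[str]) -> None:
    # Store each chunk's vector in a tenant-scoped namespace for isolation.
    vectors = [
        {"id": f"{doc_id}-{i}", "values": embed(c), "metadata": {"doc_id": doc_id, "text": c}}
        for i, c in enumerate(chunks)
    ]
    index.upsert(vectors=vectors, namespace=tenant_id)

def search(tenant_id: str, query: str, top_k: int = 5):
    # Semantic search restricted to the tenant's own namespace.
    return index.query(vector=embed(query), top_k=top_k,
                       namespace=tenant_id, include_metadata=True)
```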
- Backend: FastAPI, Python 3.11+
- Database: PostgreSQL (with asyncpg)
- Cache: Redis
- Message Queue: Apache Kafka
- Vector Database: Pinecone
- Embeddings: Google Gemini API
- Monitoring: Prometheus + Grafana
- Containerization: Docker & Docker Compose
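Redis does double duty here: caching and the per-tenant rate limiting listed under the features above. A fixed-window limiter keyed by tenant is easy to sketch with the `redis` client; the key format, limit, and window below are assumptions for illustration, not the repo's actual policy.

```python
# Minimal fixed-window rate limiter keyed by tenant (illustrative only).
import os
import redis

r = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379/0"))

def allow_request(tenant_id: str, limit: int = 100, window_s: int = 60) -> bool:
    key = f"ratelimit:{tenant_id}"   # assumed key scheme
    count = r.incr(key)              # atomic per-window counter
    if count == 1:
        r.expire(key, window_s)      # start the window on the first hit
    return count <= limit
```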
- Docker Desktop or Docker Engine
- Docker Compose (compose file format 3.8+)
- API keys for Gemini and Pinecone
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd Multi-Tenant-Document-Intelligence
  ```

- Create the environment file:

  ```bash
  cd docker
  cp env.example .env
  ```

- Edit the `.env` file and add your API keys:

  ```bash
  GEMINI_API_KEY=your_gemini_api_key
  PINECONE_API_KEY=your_pinecone_api_key
  PINECONE_INDEX_NAME=your_index_name
  ```

- Start all services:

  ```bash
  make up
  # Or: docker-compose up -d --build
  ```

- Check service status:

  ```bash
  make status
  # Or: docker-compose ps
  ```

- View logs:

  ```bash
  make logs
  # Or: docker-compose logs -f
  ```
- API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Grafana: http://localhost:3000 (admin/admin)
- Prometheus: http://localhost:9090
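To confirm the API is actually responding once the containers are up, a quick check against the health endpoint from Python (a minimal sketch using `httpx`, assuming the default host and port above):

```python
# Sanity check against the running stack (default port assumed).
import httpx

resp = httpx.get("http://localhost:8000/api/v1/health", timeout=5.0)
print(resp.status_code, resp.json())
```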
For detailed Docker setup instructions, see DOCKER_SETUP.md
- Python 3.11+
- PostgreSQL 15+
- Redis 7+
- Apache Kafka 2.8+
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables:

  ```bash
  cp env.example .env
  # Edit .env with your configuration
  ```

- Start PostgreSQL, Redis, and Kafka (using Docker or locally):

  ```bash
  docker-compose up -d db redis kafka zookeeper
  ```

- Run migrations:

  ```bash
  alembic upgrade head
  ```

- Start the application:

  ```bash
  uvicorn app.main:app --reload
  ```

- Start the worker (in another terminal):

  ```bash
  python -m app.workers.v2.worker
  ```
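The worker started above consumes document-processing events from Kafka. The repo's actual consumer isn't reproduced here; the sketch below only shows the general shape of such a loop, assuming the `aiokafka` library and invented topic, group, and handler names.

```python
# Sketch of an async Kafka consume loop (aiokafka assumed; names invented).
import asyncio
import json
from aiokafka import AIOKafkaConsumer

async def handle_event(event: dict) -> None:
    # Placeholder: the real worker would chunk, embed, and index the document.
    print("processing", event)

async def main() -> None:
    consumer = AIOKafkaConsumer(
        "document-events",                       # hypothetical topic name
        bootstrap_servers="localhost:9092",
        group_id="document-workers",             # hypothetical consumer group
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    await consumer.start()
    try:
        async for msg in consumer:               # one message per produced event
            await handle_event(msg.value)
    finally:
        await consumer.stop()

if __name__ == "__main__":
    asyncio.run(main())
```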
```
Multi-Tenant-Document-Intelligence/
├── app/
│   ├── api/              # API routes
│   ├── core/             # Core configuration and settings
│   ├── db/               # Database models and sessions
│   ├── services/         # Business logic services
│   ├── utils/            # Utility functions
│   ├── workers/          # Kafka workers
│   └── main.py           # FastAPI application
├── uploads/              # Uploaded documents
├── logs/                 # Application logs
├── docker/               # Docker Compose, Dockerfile, and .env configuration
├── Makefile              # Useful make commands
├── requirements.txt      # Python dependencies
└── alembic.ini           # Database migration config
```
```bash
make help         # Show all available commands
make up           # Start all services
make down         # Stop all services
make restart      # Restart all services
make logs         # View logs
make logs-app     # View app logs only
make logs-worker  # View worker logs only
make migrate      # Run database migrations
make shell-app    # Access app container shell
make shell-db     # Access database shell
make clean        # Stop and remove everything
make rebuild      # Rebuild and restart
```

- `POST /api/v1/tenants` - Create a new tenant
- `GET /api/v1/tenants/{tenant_id}` - Get tenant details
- `POST /api/v1/uploads` - Upload a document for ingestion
- `GET /api/v1/uploads/get` - List documents for a tenant
- `POST /api/v1/search/semantic` - Semantic search across documents
- `GET /api/v1/health` - Health check
- `GET /metrics` - Prometheus metrics
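A quick way to exercise these endpoints end to end from Python (a sketch using `httpx`; the request field names, form fields, and payload shapes below are guesses for illustration only — check the interactive docs at http://localhost:8000/docs for the actual request and response schemas):

```python
# Illustrative client walkthrough: create a tenant, upload a document, search.
# Field names and payloads are assumptions; see /docs for the real schemas.
import httpx

BASE = "http://localhost:8000/api/v1"

with httpx.Client(base_url=BASE, timeout=30.0) as client:
    # Create a tenant (payload fields assumed).
    tenant = client.post("/tenants", json={"name": "acme"}).json()
    tenant_id = tenant.get("id", "acme")

    # Upload a document for ingestion (multipart form; field names assumed).
    with open("example.pdf", "rb") as f:
        client.post("/uploads", files={"file": f}, data={"tenant_id": tenant_id})

    # Semantic search across the tenant's documents (body fields assumed).
    results = client.post("/search/semantic",
                          json={"tenant_id": tenant_id, "query": "termination clause"})
    print(results.json())
```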
Access Prometheus at http://localhost:9090
- URL: http://localhost:3000
- Username: `admin`
- Password: `admin`
Pre-configured dashboards:
- FastAPI application metrics
- Worker performance metrics
- Kafka consumer metrics
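These dashboards are fed by the application's `/metrics` endpoint. For reference, the sketch below shows the common pattern for exposing custom metrics from a FastAPI service with `prometheus_client`; the metric names and labels are invented, and the repo's actual instrumentation may be wired differently.

```python
# Sketch: exposing Prometheus metrics from FastAPI (metric names invented).
import time
from fastapi import FastAPI, Request, Response
from prometheus_client import Counter, Histogram, generate_latest, CONTENT_TYPE_LATEST

app = FastAPI()
REQUESTS = Counter("http_requests_total", "HTTP requests", ["method", "path", "status"])
LATENCY = Histogram("http_request_duration_seconds", "Request latency", ["path"])

@app.middleware("http")
async def record_metrics(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    LATENCY.labels(path=request.url.path).observe(time.perf_counter() - start)
    REQUESTS.labels(request.method, request.url.path, str(response.status_code)).inc()
    return response

@app.get("/metrics")
def metrics() -> Response:
    # Prometheus scrapes this endpoint on its configured interval.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```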
See env.example for all available environment variables:
- `DATABASE_URL` - PostgreSQL connection string
- `REDIS_URL` - Redis connection string
- `KAFKA_BROKER` - Kafka broker address
- `GEMINI_API_KEY` - Google Gemini API key (required)
- `PINECONE_API_KEY` - Pinecone API key (required)
- `PINECONE_INDEX_NAME` - Pinecone index name
- `SECRET_KEY` - JWT secret key
- `CHUNKING_STRATEGY` - Document chunking strategy
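Settings like these are typically loaded once at startup and injected where needed. A minimal sketch of that pattern with `pydantic-settings` follows; the repo's actual settings module under `app/core/` may differ in names, and the defaults shown are invented.

```python
# Sketch of env-driven settings (field names mirror the variables above; defaults invented).
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str = "postgresql+asyncpg://postgres:postgres@localhost:5432/app"
    redis_url: str = "redis://localhost:6379/0"
    kafka_broker: str = "localhost:9092"
    gemini_api_key: str                       # required, no default
    pinecone_api_key: str                     # required, no default
    pinecone_index_name: str = "documents"    # hypothetical default
    secret_key: str = "change-me"
    chunking_strategy: str = "fixed"          # hypothetical default

settings = Settings()  # reads the environment (and .env) once at import time
```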
Run migrations with Alembic:
```bash
# Create a new migration
alembic revision --autogenerate -m "description"

# Apply migrations
alembic upgrade head

# Rollback one migration
alembic downgrade -1
```

With Docker:

```bash
make migrate
# Or: docker-compose exec app alembic upgrade head
```

If a port is already in use:

```bash
# Find and kill process
lsof -i :8000
kill -9 <PID>
```

Ensure PostgreSQL is running:

```bash
docker-compose logs db
```

Check Kafka status:

```bash
docker-compose logs kafka
docker-compose exec kafka kafka-topics --list --bootstrap-server localhost:9092
```