| Tool | Version | Check |
|---|---|---|
| Python | 3.12+ | `python3 --version` |
| Node.js | 20+ | `node --version` |
| npm | 10+ | `npm --version` |
| Git | 2.30+ | `git --version` |
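To confirm everything is in place, the checks from the table can be run in one pass:

```bash
# Verify all prerequisite versions at once
python3 --version && node --version && npm --version && git --version
```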
```bash
# Clone the repository
git clone https://github.com/CheckMyData-AI/checkmydata-ai.git
cd checkmydata-ai

# Install everything (Python venv, Node deps, .env, migrations)
make setup

# Start development servers
make dev
```

This gives you:

- Backend API at http://localhost:8000 (a quick check is sketched below)
- Frontend at http://localhost:3100
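Since the backend is launched with `uvicorn app.main:app`, it is presumably a FastAPI app, which serves interactive API docs at `/docs` by default; that path is an assumption here, not confirmed from the repository:

```bash
# Quick sanity check of the backend (the /docs path is the FastAPI default, assumed here)
curl -I http://localhost:8000/docs
```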
```bash
cd backend

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate    # macOS/Linux
# .venv\Scripts\activate     # Windows

# Install dependencies (including dev tools)
pip install -e ".[dev]"
```

```bash
cd frontend
npm install
```

```bash
# Copy example env file
cp backend/.env.example backend/.env
```

Edit `backend/.env` and set required values:
```bash
# Required: encryption key for stored credentials
# Generate: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
MASTER_ENCRYPTION_KEY=<your-key>

# Required: at least one LLM provider API key
OPENAI_API_KEY=sk-...
# Or: ANTHROPIC_API_KEY=sk-ant-...
# Or: OPENROUTER_API_KEY=sk-or-...

# Optional: Google OAuth for "Sign in with Google"
GOOGLE_CLIENT_ID=<your-client-id>.apps.googleusercontent.com

# Optional: change JWT secret for production
JWT_SECRET=<random-secret>
```

Apply the database migrations:

```bash
cd backend
PYTHONPATH=. alembic upgrade head
```

Or use the Makefile:

```bash
make migrate
```

Start the development servers:

```bash
make dev
```

Or start individually:
```bash
# Backend
cd backend && uvicorn app.main:app --reload --port 8000

# Frontend (separate terminal)
cd frontend && npm run dev -- --port 3100
```

```bash
# Start with Docker Compose
docker compose up -d

# Or use the helper script
make docker-up
```

The `docker-compose.yml` runs both backend and frontend containers.
Copy `backend/.env.example` to `backend/.env`. All available variables:

| Variable | Required | Description |
|---|---|---|
| `MASTER_ENCRYPTION_KEY` | Yes | Fernet key for encrypting stored credentials. Generate: `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"` |
| `JWT_SECRET` | Yes (prod) | Secret for signing JWT tokens. Generate: `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
| `DATABASE_URL` | No | Default: `sqlite+aiosqlite:///./data/agent.db`. For production: `postgresql+asyncpg://...` |
| `OPENAI_API_KEY` | One of three | OpenAI API key (for GPT-4o, etc.) |
| `ANTHROPIC_API_KEY` | One of three | Anthropic API key (for Claude) |
| `OPENROUTER_API_KEY` | One of three | OpenRouter API key (multi-model proxy) |
| `GOOGLE_CLIENT_ID` | No | Google OAuth Client ID from Google Cloud Console. Enables "Sign in with Google". |
| `NEXT_PUBLIC_GOOGLE_CLIENT_ID` | No | Same value as above, set in `frontend/.env.local` for the GIS JavaScript SDK. |
| `RESEND_API_KEY` | No | Resend API key for transactional emails (welcome, invite, acceptance). If empty, emails are silently skipped. |
| `RESEND_FROM_EMAIL` | No | Sender address for emails (default: `CheckMyData <noreply@checkmydata.ai>`). Must match a verified domain in Resend. |
| `APP_URL` | No | Frontend URL for email links (default: `http://localhost:3000`). Set to the production URL in prod. |
| `CORS_ORIGINS` | No | JSON array of allowed origins |
| `ENVIRONMENT` | No | Set to `production` for strict secret validation |
| `REDIS_URL` | No | Enables shared cache + ARQ task queue. Empty = in-process fallback. |
| `JWT_EXPIRE_MINUTES` | No | Token expiry (default: 1440 = 24h) |
| `CHROMA_SERVER_URL` | No | Remote ChromaDB server URL. If empty, uses the embedded PersistentClient. |
| `MAX_HISTORY_TOKENS` | No | Token budget for chat history before summarization (default: 4000) |
| `MAX_CONTEXT_TOKENS` | No | Total context window budget (default: 16000) |
| `LOG_FORMAT` | No | `text` (default) or `json` (for production log aggregation) |
| `LOG_LEVEL` | No | `DEBUG`, `INFO` (default), `WARNING`, `ERROR` |
See `backend/.env.example` for the complete list of optional/advanced settings (ChromaDB, session rotation, orchestrator limits, DB index, pipeline, streaming, backup, token budgets, connection pool).
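For orientation, a production-leaning `.env` might combine these settings roughly as follows; every value is a placeholder and the hostnames are assumptions, not defaults from the repository:

```bash
# Sketch of a production backend/.env (placeholders only; hostnames are assumptions)
ENVIRONMENT=production
MASTER_ENCRYPTION_KEY=<fernet-key>
JWT_SECRET=<random-secret>
DATABASE_URL=postgresql+asyncpg://user:password@db-host:5432/checkmydata
OPENAI_API_KEY=sk-...
CORS_ORIGINS='["https://your-domain.com"]'
REDIS_URL=redis://redis-host:6379/0
LOG_FORMAT=json
```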
The production environment runs on Heroku as two Docker container apps with Heroku Postgres.
Auto-deploy (CI/CD): Every push to `main` triggers deployment via GitHub Actions (`.github/workflows/deploy.yml`): CI runs → builds Docker images → pushes to Heroku Container Registry → releases → health checks.
Manual deploy: Use `./scripts/deploy-heroku.sh` when CI/CD is unavailable. The script requires the `HEROKU_API_KEY` and `GOOGLE_CLIENT_ID` environment variables.
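For example, those variables could be exported before invoking the script (both values are placeholders):

```bash
# Required by the manual deploy script; placeholder values
export HEROKU_API_KEY=<heroku-api-key>
export GOOGLE_CLIENT_ID=<your-client-id>.apps.googleusercontent.com
```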
```bash
./scripts/deploy-heroku.sh                   # Deploy both
./scripts/deploy-heroku.sh --backend-only
./scripts/deploy-heroku.sh --frontend-only
```

Setting up a new Heroku deployment:
```bash
heroku create <your-api-app> --stack container
heroku create <your-web-app> --stack container
heroku addons:create heroku-postgresql:essential-0 --app <your-api-app>

heroku config:set \
  MASTER_ENCRYPTION_KEY="$(python3 -c 'from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())')" \
  JWT_SECRET="$(python3 -c 'import secrets; print(secrets.token_urlsafe(32))')" \
  DEFAULT_LLM_PROVIDER=openai \
  OPENAI_API_KEY=sk-... \
  CORS_ORIGINS='["https://your-domain.com"]' \
  --app <your-api-app>
```

Heroku notes:
- Heroku provides `DATABASE_URL` automatically via the Postgres addon; `config.py` converts `postgres://` to `postgresql+asyncpg://`
- Alembic migrations run automatically on every container startup
- Frontend `NEXT_PUBLIC_*` vars must be passed as `--build-arg` since Next.js bakes them at build time (see the sketch below)
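As a minimal illustration of that last point, a frontend image build might pass the public client ID at build time; the image tag and the `ARG` declaration in the Dockerfile are assumptions, not taken from this repository:

```bash
# Hypothetical build command; assumes the frontend Dockerfile declares
#   ARG NEXT_PUBLIC_GOOGLE_CLIENT_ID
docker build \
  --build-arg NEXT_PUBLIC_GOOGLE_CLIENT_ID="<your-client-id>.apps.googleusercontent.com" \
  -t checkmydata-frontend \
  ./frontend
```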
```bash
docker compose up --build
```

Both services are containerized with health checks. The backend runs Alembic migrations before starting.
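To confirm the health checks pass after startup, standard Compose commands are enough; the service name `backend` below is an assumption about `docker-compose.yml`:

```bash
# "healthy" should appear in the STATUS column once checks pass
docker compose ps
# Watch the backend run its Alembic migrations on startup (service name assumed)
docker compose logs -f backend
```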
App spec at `.do/app.yaml`. Set secrets (`MASTER_ENCRYPTION_KEY`, `JWT_SECRET`, API keys) in the DigitalOcean dashboard.
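One way to create the app from that spec is DigitalOcean's `doctl` CLI; using it (rather than the dashboard) is an assumption, not a requirement of this project:

```bash
# Create the App Platform app from the committed spec (doctl usage is optional)
doctl apps create --spec .do/app.yaml
```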
Two GitHub Actions workflows in `.github/workflows/`:
| Workflow | Trigger | What it does |
|---|---|---|
| `ci.yml` | Every push/PR to `main` | Backend lint, unit + integration tests, frontend type check + build |
| `deploy.yml` | After CI passes on `main` | Builds Docker images, pushes to Heroku, releases, health check |
The system includes automated daily backups (configurable via `BACKUP_ENABLED`, `BACKUP_HOUR`, `BACKUP_RETENTION_DAYS`, `BACKUP_DIR`). Backup APIs: `POST /api/backup/trigger`, `GET /api/backup/list`, `GET /api/backup/history`.
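For example, a backup could be triggered over the API roughly like this; the host, port, and bearer-token header are assumptions rather than documented behavior:

```bash
# Hypothetical calls to the backup endpoints (host/port and auth scheme are assumptions)
curl -X POST http://localhost:8000/api/backup/trigger \
  -H "Authorization: Bearer $ACCESS_TOKEN"
curl http://localhost:8000/api/backup/list \
  -H "Authorization: Bearer $ACCESS_TOKEN"
```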
Manual backup:
```bash
# SQLite (development)
cp backend/data/agent.db backend/data/agent.db.bak

# PostgreSQL (production)
pg_dump -Fc "$DATABASE_URL" > backup_$(date +%Y%m%d_%H%M%S).dump
```

Restore:

```bash
# SQLite
cp backend/data/agent.db.bak backend/data/agent.db

# PostgreSQL
pg_restore --clean --if-exists -d "$DATABASE_URL" backup.dump
```

Alembic migrations:

```bash
cd backend && alembic upgrade head      # Apply all pending
cd backend && alembic current           # Check current state
cd backend && alembic downgrade -1      # Rollback last
```

ChromaDB: Data stored in `backend/data/chroma/`. To reset: `rm -rf backend/data/chroma/` and restart.
See FAQ.md for common setup issues and solutions.