| title | Operations |
|---|---|
| description | Monitoring, error tracking, logs, database backups, health checks, and container validation. |
## Monitoring

All Tale services expose a Prometheus /metrics endpoint on the internal Docker network. To enable access from outside, set a bearer token in your .env file:

```
METRICS_BEARER_TOKEN=your-secret-token-here
```

Metrics are then available at these endpoints:
| Service | Metrics endpoint |
|---|---|
| Crawler | https://yourdomain.com/metrics/crawler |
| RAG | https://yourdomain.com/metrics/rag |
| Platform (Bun) | https://yourdomain.com/metrics/platform |
| Convex | https://yourdomain.com/metrics/convex |
Note: The Convex backend exposes over 260 built-in metrics covering query latency, mutation throughput, and scheduler performance.
When the token is unset, all /metrics/* endpoints return 401.
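A quick way to verify the token is wired up is to scrape one endpoint with curl. This is a sketch; the domain and the small URL helper are illustrative, not part of Tale itself:

```shell
#!/usr/bin/env bash
# Build the external metrics URL for a service (domain is illustrative).
metrics_url() {
  printf 'https://yourdomain.com/metrics/%s' "$1"
}

# With the token: expect 200 and Prometheus text output.
curl -fsS --max-time 5 \
  -H "Authorization: Bearer ${METRICS_BEARER_TOKEN}" \
  "$(metrics_url crawler)" \
  || echo "request failed (check the token and domain)"

# Without the token: expect HTTP 401.
curl -s --max-time 5 -o /dev/null -w '%{http_code}\n' "$(metrics_url crawler)"
```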
An example Prometheus scrape configuration:

```yaml
scrape_configs:
  - job_name: tale-crawler
    scheme: https
    metrics_path: /metrics/crawler
    authorization:
      credentials: your-secret-token-here
    static_configs:
      - targets: ['your-tale-host.com']
  # Repeat for: tale-rag, tale-platform, tale-convex,
  # changing metrics_path accordingly
```

## Error tracking

Tale supports Sentry and compatible alternatives such as GlitchTip for error tracking. Set your DSN in .env:
```
SENTRY_DSN=https://your-key@your-sentry-host/project-id
```

If SENTRY_DSN is not set, error tracking is off and errors only appear in Docker logs.
## Logs

All service logs go to Docker stdout with automatic rotation at 10 MB per file, keeping 3 files per service.
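That rotation corresponds to Docker's json-file logging driver options. A per-service compose fragment with the same limits might look like this (a sketch; the service name is illustrative):

```yaml
services:
  rag:
    logging:
      driver: json-file
      options:
        max-size: "10m"   # rotate at 10 MB per file
        max-file: "3"     # keep 3 rotated files
```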
```shell
# Stream all service logs
docker compose logs -f

# Stream logs for a specific service
docker compose logs -f rag

# View recent logs without streaming
docker compose logs --tail=100 platform
```

## Database backups

To create a database snapshot:
```shell
docker exec tale-db pg_dump -U tale tale > backup-$(date +%Y%m%d).sql
```

To restore from a backup:
```shell
docker exec -i tale-db psql -U tale tale < backup-20260101.sql
```

## Health checks

Each service has a health check endpoint:
| Endpoint | What it checks |
|---|---|
| GET /health | Proxy is running and listening |
| GET /api/health | Platform is up and Convex backend is reachable |
| http://localhost:8001/health | RAG service is running and database pool is connected |
| http://localhost:8002/health | Crawler service and browser engine are ready |
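A simple script can curl each endpoint in turn. This is a sketch: the public-facing URLs assume your proxy domain, and the localhost ports are the ones listed above; adjust for your deployment.

```shell
#!/usr/bin/env bash
# Health endpoints as documented; the domain is illustrative.
ENDPOINTS=(
  "https://yourdomain.com/health"
  "https://yourdomain.com/api/health"
  "http://localhost:8001/health"
  "http://localhost:8002/health"
)

# Probe each endpoint and report pass/fail without aborting the loop.
for url in "${ENDPOINTS[@]}"; do
  if curl -fsS --max-time 5 "$url" > /dev/null 2>&1; then
    echo "OK   $url"
  else
    echo "FAIL $url"
  fi
done
```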
## Container validation

To validate that all containers are healthy after a deployment or configuration change, run the container smoke test:
```shell
bun run docker:test
```

This builds all images, starts them on non-conflicting ports, validates health endpoints and inter-service connectivity, then tears down. It is the same test that runs in CI on every pull request.
For image-level validation (OCI labels, no secrets, size budgets):
```shell
bun run docker:test:image
```

Each container image has a size budget enforced by CI. Current sizes and budgets:
| Service | Current size | Budget |
|---|---|---|
| Crawler | ~1.85 GB | 2.1 GB |
| RAG | ~515 MB | 600 MB |
| Platform | ~2.58 GB | 2.9 GB |
| DB | ~1.06 GB | 1.2 GB |
| Proxy | ~88 MB | 100 MB |
If an image exceeds its budget after a change, bun run docker:test:image will fail. See the container architecture page for details on multi-stage build strategies that keep images lean.
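To eyeball local image sizes before CI runs, something like the following works. The image names are assumptions; match them to whatever your compose project actually tags.

```shell
#!/usr/bin/env bash
# Convert an image size in bytes to whole megabytes (decimal, as Docker reports).
bytes_to_mb() {
  echo $(( $1 / 1000000 ))
}

# Report each image's size; image names are illustrative.
for image in tale-crawler tale-rag tale-platform tale-db tale-proxy; do
  bytes=$(docker image inspect --format '{{.Size}}' "$image" 2>/dev/null) \
    || { echo "missing: $image"; continue; }
  echo "$image: $(bytes_to_mb "$bytes") MB"
done
```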