Commit 63ceae6

docs: add infrastructure documentation and security guidelines
1 parent f2655cf commit 63ceae6

6 files changed, +267 -18 lines changed

README.md

Lines changed: 94 additions & 18 deletions
@@ -1,27 +1,103 @@
-# spamguard-mlops
+# SpamGuard MLOps

-## References
+Production-minded, didactic MLOps template for spam detection. Service-oriented structure with clear seams to evolve from a baseline to production.

-- [Elysia](https://elysiajs.com/)
-- [Drizzle](https://drizzle-orm.com/)
-- [Mantine](https://mantine.dev/)
-- [Recharts](https://recharts.org/)
-- [Bun](https://bun.sh/)
-- [Fly.io](https://fly.io/)
-- [GitHub Actions](https://github.com/features/actions)
+## Start here

-## Tools
+- Executive summary (1‑min): `infra/ONE-PAGER.md`
+- Quickstart: see the next section below

-- [DVC](https://github.com/iterative/dvc)
-- [MLflow](https://github.com/mlflow/mlflow)
+## Value proposition

-## Fly.io
+- Fast: minutes from dataset to serving a model
+- Clear: simple stack for onboarding and interviews
+- Extensible: upgrade DB, swap model, add observability without rewrites
+- Portable: Bun-native dev, Docker/K8s-ready

-```bash
-fly auth login
-fly apps create spamguard-mlflow
+## Repository structure
+
+```
+artifacts/ # Centralized model artifacts (JSON)
+data/raw/dataset.csv # Sample dataset (CSV)
+dashboard/ # React + Mantine UI (demo/dashboard)
+inference/ # API (Bun + Elysia) + Drizzle ORM + SQLite
+training/ # Training job (Natural Naive Bayes)
+infra/ # Dockerfiles, Compose, K8s manifests, docs
+```
+
+## Quickstart (local)
+
+Prereq: Bun installed (https://bun.sh/)
+
+Install dependencies (workspaces):
+
+```sh
+bun install
+```
+
+Run DB migrations (if needed):
+
+```sh
+bun --cwd inference run db:generate
+bun --cwd inference run db:migrate
+```
+
+Train and promote a model:
+
+```sh
+bun --cwd training run train
+ls -l artifacts
+```
+
+Start services (inference API + dashboard):
+
+```sh
+bun run dev
+# API: http://localhost:3001
+# UI: http://localhost:5173 (Vite default) or as configured in dashboard
 ```

-```bash
-fly deploy
+## Quickstart (Docker Compose)
+
+```sh
+docker compose -f infra/docker-compose.yml up -d --build inference dashboard
+# Optional on-demand training job
+docker compose -f infra/docker-compose.yml --profile training run --rm training
 ```
+
+## API (short)
+
+- POST `/predict` -> `{ prediction: [{ label, value }, ...] }`
+- GET `/dashboard` -> list of runs with metrics
+Details: see `infra/API.md`.
+
+## CI/CD
+
+GitHub Actions pipeline (`.github/workflows/ci-cd.yml`):
+
+1. Install deps
+2. Generate + apply DB migrations
+3. Train and (if better) promote model, saving artifact to `artifacts/`
+4. Build dashboard and upload artifact (deploy step is a placeholder)
+
+## Tech stack
+
+- Runtime: Bun (JavaScript ESM)
+- API: Elysia, @elysiajs/cors
+- ORM/DB: Drizzle ORM + SQLite
+- ML: Natural (Naive Bayes)
+- UI: React, Mantine, Recharts, Vite
+- CI: GitHub Actions
+- Infra: Docker, Docker Compose, Kubernetes manifests
+
+## Architecture & Ops
+
+- One-pager (simple): `infra/ONE-PAGER.md`
+- API reference: `infra/API.md`
+- Runbook (local, Compose, K8s, troubleshooting): `infra/RUNBOOK.md`
+- Security checklist: `infra/SECURITY.md`
+- Infra index: `infra/README.md`
+
+## Notes
+
+- This is an educational template. For production, upgrade DB (e.g., Postgres), add auth/rate limiting, observability, and persistent volumes.

infra/API.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+# API
+
+Base URL (local): http://localhost:3001
+
+## POST /predict
+- Description: Returns ranked spam/ham classifications for a given message.
+- Request
+  - Headers: `Content-Type: application/json`
+  - Body:
+    ```json
+    { "message": "congratulations! you won a prize" }
+    ```
+- Response (200)
+  ```json
+  {
+    "prediction": [
+      { "label": "spam", "value": 0.92 },
+      { "label": "ham", "value": 0.08 }
+    ]
+  }
+  ```
+- Error
+  - 400: invalid body
+  - 500: internal error
+
+- cURL
+  ```bash
+  curl -s -X POST http://localhost:3001/predict \
+    -H "Content-Type: application/json" \
+    -d '{"message":"congratulations! you won a prize"}' | jq
+  ```
+
+## GET /dashboard
+- Description: Returns a list of training runs (newest first) with metrics.
+- Response (200)
+  ```json
+  [
+    {
+      "id": 1739999999999,
+      "experimentId": 1,
+      "createdAt": "2025-08-19T12:34:56.000Z",
+      "gitCommit": "a1b2c3d",
+      "metrics": { "accuracy": 0.90, "f1Score": 0.90 },
+      "modelArtifactPath": "artifacts/model_1739999999999.json",
+      "isProduction": true
+    }
+  ]
+  ```
+
+## CORS
+- CORS is enabled in the inference service (`@elysiajs/cors`). The dashboard (Vite) can call the API directly in local dev.
+
+## Notes
+- If you get `"Model is not loaded"`, run a training job: `bun --cwd training run train`.
+- Artifacts are JSON files saved under `artifacts/` and referenced from the DB.
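
The `/predict` contract documented in `infra/API.md` can be exercised from a small client. The following is a minimal sketch, assuming only the request and response shapes shown in the diff; the helper names `buildPredictBody`, `topPrediction`, and `classify` are hypothetical and do not appear in the commit.

```javascript
// Hypothetical client helpers for the documented /predict contract.
// None of these functions exist in the repository; names are illustrative.

// Build the JSON body for POST /predict. Rejecting bad input on the client
// avoids a round trip that would end in the documented 400 response.
function buildPredictBody(message) {
  if (typeof message !== "string" || message.trim() === "") {
    throw new TypeError("message must be a non-empty string");
  }
  return JSON.stringify({ message });
}

// Pick the highest-scoring entry from a /predict response.
// The documented example is already ranked, but sorting a copy
// avoids relying on that order.
function topPrediction(response) {
  const entries = Array.isArray(response?.prediction) ? response.prediction : [];
  if (entries.length === 0) {
    throw new Error("response has no predictions");
  }
  return [...entries].sort((a, b) => b.value - a.value)[0];
}

// Example wiring against the local base URL from the docs (an assumption
// for any other environment).
async function classify(message, baseUrl = "http://localhost:3001") {
  const res = await fetch(`${baseUrl}/predict`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildPredictBody(message),
  });
  if (!res.ok) {
    throw new Error(`predict failed with HTTP ${res.status}`);
  }
  return topPrediction(await res.json());
}
```

Calling `classify("congratulations! you won a prize")` against a running local API would resolve to a single `{ label, value }` entry; the exact scores depend on the trained model.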

infra/ONE-PAGER.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+# SpamGuard MLOps — Executive Summary (1‑minute read)
+
+## What it is
+
+A minimal, production‑minded template for spam detection with three services:
+
+- Inference API (Bun + Elysia) serving a Naive Bayes model
+- Training job that promotes the best model automatically
+- Dashboard (React + Mantine) to view runs/metrics
+
+State is simple and local by default:
+
+- SQLite DB at `inference/main.db` via Drizzle ORM
+- Model artifacts (JSON) in `artifacts/`
+
+## Why it matters
+
+- Ship fast: minutes from dataset → trained → served
+- Stay clear: small, readable stack for onboarding and demos
+- Evolve safely: promotion only when metrics improve
+
+## Run in 60 seconds
+
+```sh
+bun install
+bun --cwd inference run db:generate && bun --cwd inference run db:migrate
+bun --cwd training run train
+bun run dev # API: http://localhost:3001 UI: http://localhost:5173
+```
+
+## KPIs
+
+- Accuracy vs. dataset (proxy for F1)
+- Promotion rule: only promote if accuracy improves
+- Lead time to change: single CI run from train → promote → serve
+
+## Upgrade paths
+
+- DB: SQLite → Postgres
+- Tracking: add MLflow (experiments/artifacts)
+- Ops: add auth, rate limiting, and observability (Prometheus/Grafana)
+- Deploy: use provided Docker/K8s scaffolding and your registry
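
The KPI section states the promotion rule in one line ("only promote if accuracy improves"). A minimal sketch of what such a gate and its metrics could look like follows. The commit does not show the training code, so the function names, the treatment of ties, and promoting the first run when no production model exists are all assumptions.

```javascript
// Hypothetical promotion gate and metric helpers; the repository's
// actual training job may compute and compare metrics differently.

// Fraction of positions where the predicted label equals the expected one.
function accuracy(predicted, expected) {
  if (predicted.length !== expected.length || expected.length === 0) {
    throw new RangeError("label arrays must be non-empty and the same length");
  }
  let correct = 0;
  for (let i = 0; i < expected.length; i++) {
    if (predicted[i] === expected[i]) correct++;
  }
  return correct / expected.length;
}

// F1 for one class (default "spam"): 2*TP / (2*TP + FP + FN).
// Returns 0 when the class is never predicted and never expected.
function f1Score(predicted, expected, positive = "spam") {
  let tp = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < expected.length; i++) {
    const p = predicted[i] === positive;
    const e = expected[i] === positive;
    if (p && e) tp++;
    else if (p && !e) fp++;
    else if (!p && e) fn++;
  }
  const denom = 2 * tp + fp + fn;
  return denom === 0 ? 0 : (2 * tp) / denom;
}

// Promote only on strict accuracy improvement. Assumption: with no
// production model yet, the first run is promoted; ties keep the current one.
function shouldPromote(candidateMetrics, productionMetrics) {
  if (!productionMetrics) return true;
  return candidateMetrics.accuracy > productionMetrics.accuracy;
}
```

Computing F1 alongside accuracy matters on imbalanced spam datasets, where accuracy alone can look good for a model that rarely flags spam.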

infra/README.md

Lines changed: 11 additions & 0 deletions
@@ -3,6 +3,7 @@
 This directory contains infrastructure artifacts to run the monorepo in containerized or orchestrated environments.

 Structure:
+
 - docker/
   - Dockerfile.inference: image for the inference service (Elysia/Bun)
   - Dockerfile.dashboard: image for the dashboard (Vite -> Nginx)
@@ -13,6 +14,16 @@ Structure:
   - dashboard-deployment.yaml, dashboard-service.yaml

 Notes:
+
 - Model artifacts are centralized under `artifacts/` at the repo root and are mounted into containers.
 - The SQLite database (`inference/main.db`) is shared between training and inference. In compose, the DB is ephemeral for simplicity; you can bind-mount if needed.
 - Images in the k8s directory are placeholders; update them with your registry.
+
+## Docs index
+
+### Start here (essential)
+
+- Executive summary (simple): `infra/ONE-PAGER.md`
+- API reference (quick check): `infra/API.md`
+- Security checklist: `infra/SECURITY.md`
+- Quickstart: see root `README.md` (Local and Docker Compose)

infra/RUNBOOK.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+# Runbook
+
+Operational guidance to run, observe, and troubleshoot the system.
+
+## Quick run (local)
+- Install deps:
+```sh
+bun install
+```
+- Migrate DB (first run or schema changes):
+```sh
+bun --cwd inference run db:generate && bun --cwd inference run db:migrate
+```
+- Start API + dashboard:
+```sh
+bun run dev
+# API: http://localhost:3001 UI: http://localhost:5173
+```
+- Train and promote model:
+```sh
+bun --cwd training run train && ls -l artifacts
+```
+
+## Docker Compose
+- Up API + dashboard:
+```sh
+docker compose -f infra/docker-compose.yml up -d --build inference dashboard
+```
+- One-off training:
+```sh
+docker compose -f infra/docker-compose.yml --profile training run --rm training
+```
+- Down:
+```sh
+docker compose -f infra/docker-compose.yml down
+```
+
+## Kubernetes (optional)
+- Update images in `infra/k8s/*.yaml` and apply manifests. Replace `emptyDir` with PVCs for persistence.
+
+## Health
+- API: `POST /predict` returns classifications; `GET /dashboard` returns runs
+- UI: open the dashboard URL and check charts load
+
+## Troubleshooting
+- Missing model: run training; verify `isProduction` and artifact path in DB
+- CORS: ensure API at http://localhost:3001 and CORS enabled in `inference/src/index.js`
+- Migrations: rerun generate/migrate; inspect `inference/drizzle/` and `inference/main.db`
+- Artifacts in containers: ensure `../artifacts:/app/artifacts` volume or PVC
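
The "Missing model" troubleshooting step (verify `isProduction` and the artifact path) can also be checked from the `GET /dashboard` payload instead of the database. This is a sketch under the assumption that the payload matches the example in `infra/API.md`; `diagnoseRuns` is a hypothetical helper, not something the commit adds.

```javascript
// Hypothetical diagnostic over the GET /dashboard payload documented in
// infra/API.md. It reports why a production model might be missing.
function diagnoseRuns(runs) {
  if (!Array.isArray(runs) || runs.length === 0) {
    return { ok: false, reason: "no runs recorded; run a training job" };
  }
  // The docs say runs are listed newest first, so the first match is
  // the most recent run flagged as production.
  const production = runs.find((run) => run.isProduction === true);
  if (!production) {
    return { ok: false, reason: "no run is marked isProduction" };
  }
  const path = production.modelArtifactPath;
  if (typeof path !== "string" || path === "") {
    return { ok: false, reason: "production run has no modelArtifactPath" };
  }
  return { ok: true, run: production };
}
```

A caller could fetch `http://localhost:3001/dashboard`, pass the parsed JSON to `diagnoseRuns`, and then confirm that the returned `modelArtifactPath` exists under `artifacts/`.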

infra/SECURITY.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Security (checklist)
+
+Quick checklist for minimal hardening. Deep-dive: `infra/deep-dive/SECURITY.DEEP.md`.
+
+- AuthN/Z: Protect inference API (API key/JWT). Limit roles for dashboard.
+- Rate limiting: Prevent abuse on `/predict`.
+- Secrets: Never commit. Store in CI/CD secrets/manager.
+- TLS: Encrypt in transit (HTTPS/ingress). Avoid plain HTTP over public networks.
+- Data at rest: If using managed DB/object storage, enable encryption + backups.
+- Logs: Redact PII/tokens. Use structured logs.
+- CORS: Restrict origins in production.
+- Dependencies: Pin/update; enable Dependabot/Renovate.
+- Containers: Minimal images; add `.dockerignore`; scan images.
+- CI/CD: Branch protections, PR reviews, SAST/dep/container scans.
+- Observability: Metrics + alerts for latency, errors, model load failures.
+- Privacy: Document retention/deletion if handling user data.
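
The checklist item on rate limiting `/predict` names the goal but not a mechanism. One common option is a fixed-window counter per client key. The sketch below is framework-agnostic and makes no claim about how the inference service or Elysia would wire it in; `createRateLimiter` and its options are hypothetical names.

```javascript
// Minimal in-memory fixed-window rate limiter (illustrative only).
// Assumptions: one process, a client key such as an API key or IP address,
// and an injectable clock so the behavior can be tested deterministically.
function createRateLimiter({ limit, windowMs, now = () => Date.now() }) {
  const windows = new Map(); // key -> { start, count }

  // Returns true if the request identified by `key` is allowed.
  return function allow(key) {
    const t = now();
    let entry = windows.get(key);
    // Start a fresh window on first sight or once the old one has elapsed.
    if (!entry || t - entry.start >= windowMs) {
      entry = { start: t, count: 0 };
      windows.set(key, entry);
    }
    if (entry.count >= limit) return false;
    entry.count += 1;
    return true;
  };
}
```

A request handler would call `allow(clientKey)` before running inference and answer with HTTP 429 when it returns `false`. Because the state lives in one process and the map is never pruned, a multi-replica or long-running deployment would need a shared store and eviction.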
