| 1 | +# CI/CD Pipeline |
| 2 | + |
| 3 | +The project uses GitHub Actions to automate code quality checks, security scanning, testing, and documentation |
| 4 | +deployment. Every push to `main` or `dev` and every pull request against those branches triggers the pipeline, and |
| 5 | +the workflows run in parallel to provide fast feedback. |
| 6 | + |
| 7 | +## Pipeline overview |
| 8 | + |
| 9 | +```mermaid |
| 10 | +graph LR |
| 11 | + subgraph "Code Quality" |
| 12 | + Ruff["Ruff Linting"] |
| 13 | + MyPy["MyPy Type Check"] |
| 14 | + end |
| 15 | + |
| 16 | + subgraph "Security" |
| 17 | + Bandit["Bandit SAST"] |
| 18 | + Trivy["Trivy Container Scan"] |
| 19 | + end |
| 20 | + |
| 21 | + subgraph "Testing" |
| 22 | + Integration["Integration Tests"] |
| 23 | + end |
| 24 | + |
| 25 | + subgraph "Documentation" |
| 26 | + Docs["MkDocs Build"] |
| 27 | + Pages["GitHub Pages"] |
| 28 | + end |
| 29 | + |
| 30 | + Push["Push / PR"] --> Ruff |
| 31 | + Push --> MyPy |
| 32 | + Push --> Bandit |
| 33 | + Push --> Trivy |
| 34 | + Push --> Integration |
| 35 | + Push --> Docs |
| 36 | + Docs -->|main only| Pages |
| 37 | +``` |
| 38 | + |
| 39 | +All workflows run on pushes to the `main` and `dev` branches and on pull requests against those branches, and each |
| 40 | +can also be triggered manually via `workflow_dispatch`. The documentation workflow additionally filters on path |
| 41 | +changes to avoid unnecessary rebuilds. |
| 42 | + |
| 43 | +## Linting and type checking |
| 44 | + |
| 45 | +Two lightweight workflows typically finish first and catch obvious issues quickly. |
| 46 | + |
| 47 | +The linting workflow installs dependencies with [uv](https://docs.astral.sh/uv/) and |
| 48 | +runs [Ruff](https://docs.astral.sh/ruff/) against the backend codebase. Ruff checks for style violations, import |
| 49 | +ordering, and common bugs in a single pass. The configuration lives in `pyproject.toml` under `[tool.ruff]`, selecting |
| 50 | +rules from the E, F, B, I, and W categories. |
| 51 | + |
| 52 | +The type checking workflow runs [mypy](https://mypy.readthedocs.io/) with strict settings. It catches type mismatches, |
| 53 | +missing return types, and incorrect function signatures before they reach production. Both workflows use uv's dependency |
| 54 | +caching to skip reinstallation when the lockfile hasn't changed. |
| 55 | + |
| 56 | +## Security scanning |
| 57 | + |
| 58 | +Security scanning happens in two workflows. The security workflow uses [Bandit](https://bandit.readthedocs.io/) to |
| 59 | +perform static analysis on Python source files, flagging issues like hardcoded credentials, SQL injection patterns, |
| 60 | +and unsafe deserialization. It excludes the test directory and reports only findings of medium severity or higher. |
| 61 | + |
| 62 | +The Docker workflow builds the backend image and scans it with [Trivy](https://trivy.dev/). Trivy checks the image |
| 63 | +layers for known vulnerabilities in OS packages and Python dependencies, failing the build if it finds any critical- or |
| 64 | +high-severity issues that have available fixes. This catches supply chain problems that static analysis would miss. |
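
The gating rule above can be sketched in Python as a filter over a Trivy JSON report. This only illustrates the policy and is not the workflow's actual configuration: the function name `blocking_findings`, the sample report, and the placeholder CVE IDs are hypothetical, and the `Results`, `Vulnerabilities`, `Severity`, and `FixedVersion` keys are assumed to follow Trivy's JSON report layout.

```python
from typing import Any

# Severities that should fail the build, per the policy described above.
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


def blocking_findings(report: dict[str, Any]) -> list[str]:
    """Return IDs of critical/high vulnerabilities that have a fix available."""
    blocking: list[str] = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" may be missing or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            fixable = bool(vuln.get("FixedVersion"))
            if vuln.get("Severity") in BLOCKING_SEVERITIES and fixable:
                blocking.append(vuln["VulnerabilityID"])
    return blocking


# Hypothetical report with placeholder IDs: only the fixable HIGH finding blocks.
sample_report = {
    "Results": [
        {
            "Target": "backend:ci",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH", "FixedVersion": "1.2.3"},
                {"VulnerabilityID": "CVE-0000-0002", "Severity": "CRITICAL", "FixedVersion": ""},
                {"VulnerabilityID": "CVE-0000-0003", "Severity": "LOW", "FixedVersion": "2.0.0"},
            ],
        }
    ]
}

build_should_fail = bool(blocking_findings(sample_report))
```

The unfixed critical finding is deliberately ignored here, which mirrors the "available fixes" condition: the build fails only on issues that a dependency or base image update could resolve.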
| 65 | + |
| 66 | +## Docker build |
| 67 | + |
| 68 | +The Docker workflow builds images using a two-stage approach to optimize layer caching. First it builds a shared base |
| 69 | +image (`Dockerfile.base`) containing Python, system dependencies, and all Python packages. Then it builds the main backend |
| 70 | +image on top of that base, copying only the application code. This separation means dependency changes rebuild the base |
| 71 | +layer while code changes rebuild only the thin application layer. |
| 72 | + |
| 73 | +The base image includes gcc, curl, and compression libraries needed by some Python packages. It |
| 74 | +uses [uv](https://docs.astral.sh/uv/) to install dependencies from the lockfile, ensuring reproducible builds across |
| 75 | +environments. The pinned uv version (currently 0.9.17) prevents unexpected behavior from upstream changes. |
| 76 | + |
| 77 | +## Integration tests |
| 78 | + |
| 79 | +The integration test workflow is the most complex. It spins up the entire stack on a GitHub Actions runner to verify |
| 80 | +that services work together correctly. |
| 81 | + |
| 82 | +```mermaid |
| 83 | +sequenceDiagram |
| 84 | + participant GHA as GitHub Actions |
| 85 | + participant K3s as K3s Cluster |
| 86 | + participant Docker as Docker Compose |
| 87 | + participant Tests as pytest |
| 88 | + |
| 89 | + GHA->>K3s: Install k3s |
| 90 | + GHA->>Docker: Pre-pull base images |
| 91 | + GHA->>Docker: Build services (bake) |
| 92 | + GHA->>Docker: Start compose stack |
| 93 | + Docker->>Docker: Wait for health checks |
| 94 | + GHA->>Tests: Run pytest |
| 95 | + Tests->>Docker: HTTP requests |
| 96 | + Tests-->>GHA: Coverage report |
| 97 | + GHA->>GHA: Upload to Codecov |
| 98 | +``` |
| 99 | + |
| 100 | +The workflow starts by installing [k3s](https://k3s.io/), a lightweight Kubernetes distribution, so the backend can |
| 101 | +interact with a real cluster during tests. It pre-pulls container images in parallel to avoid cold-start delays during |
| 102 | +the build step. |
| 103 | + |
| 104 | +Before building, the workflow modifies `docker-compose.yaml` using [yq](https://github.com/mikefarah/yq) to create a |
| 105 | +CI-specific configuration. These modifications disable SASL authentication on Kafka and ZooKeeper (unnecessary for |
| 106 | +isolated CI), remove volume mounts that cause permission conflicts, inject test credentials for MongoDB, and disable |
| 107 | +OpenTelemetry export to avoid connection errors. The result is a `docker-compose.ci.yaml` that works reliably in the |
| 108 | +ephemeral CI environment. |
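
To make the shape of that transformation concrete, here is a rough Python model of it operating on an already-parsed compose document. The real workflow uses yq, and everything specific below is an assumption for illustration: the function name `make_ci_compose`, the choice to drop every `volumes` entry (the workflow removes only the problematic mounts), the mapping-style `environment` section, and the use of the standard `OTEL_SDK_DISABLED` variable as the OpenTelemetry switch.

```python
import copy
from typing import Any


def make_ci_compose(compose: dict[str, Any]) -> dict[str, Any]:
    """Return a CI-friendly copy of a parsed compose document (illustrative only)."""
    ci = copy.deepcopy(compose)  # never mutate the original document
    for service in ci.get("services", {}).values():
        # Simplification: drop all mounts instead of only the conflicting ones.
        service.pop("volumes", None)
        # Assumes the mapping form of "environment", not the list form.
        env = service.setdefault("environment", {})
        # Assumed switch for disabling OpenTelemetry export in CI.
        env["OTEL_SDK_DISABLED"] = "true"
    return ci


# Hypothetical input resembling one service from docker-compose.yaml.
original = {
    "services": {
        "backend": {
            "volumes": ["./certs:/app/certs"],
            "environment": {"LOG_LEVEL": "info"},
        }
    }
}
ci_compose = make_ci_compose(original)
```

Working on a copy matches how the workflow writes a separate `docker-compose.ci.yaml` rather than editing the checked-in file.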
| 109 | + |
| 110 | +The [docker/bake-action](https://github.com/docker/bake-action) builds all services with GitHub Actions cache support. |
| 111 | +It reads cache layers from previous runs and writes new layers back, so unchanged dependencies don't rebuild. The cache |
| 112 | +scopes are branch-specific with a fallback to main, meaning feature branches benefit from the main branch cache even on |
| 113 | +their first run. |
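
The branch-with-fallback scoping can be sketched as follows. The `type=gha`, `scope`, and `mode=max` parameters belong to Buildx's GitHub Actions cache backend, but the `<service>-<branch>` naming scheme and the helper name `cache_settings` are assumptions made for this example, not values read from the workflow.

```python
def cache_settings(service: str, branch: str, default_branch: str = "main") -> dict[str, list[str]]:
    """Build cache-from/cache-to entries with a fallback to the default branch."""
    own_scope = f"{service}-{branch}"
    fallback_scope = f"{service}-{default_branch}"

    # Read from the branch's own cache first, then from the default branch.
    cache_from = [f"type=gha,scope={own_scope}"]
    if fallback_scope != own_scope:
        cache_from.append(f"type=gha,scope={fallback_scope}")

    # Write only to the branch's own scope so branches do not overwrite main.
    cache_to = [f"type=gha,mode=max,scope={own_scope}"]
    return {"cache-from": cache_from, "cache-to": cache_to}


settings = cache_settings("backend", "feature-x")
```

Reading from two scopes but writing to one is what lets a new branch start warm without polluting the main branch's cache.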
| 114 | + |
| 115 | +Once images are built, `docker compose up` starts the stack and waits for health checks. The backend needs MongoDB, |
| 116 | +Redis, Kafka, Schema Registry, and the cert-generator to be ready before it can serve requests. After stabilization, the |
| 117 | +workflow runs pytest against the integration and unit test suites with coverage reporting. Test isolation uses |
| 118 | +per-worker database names and schema registry prefixes to avoid conflicts when pytest-xdist runs tests in parallel. |
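
A minimal sketch of the per-worker naming idea, assuming isolation is keyed on the `PYTEST_XDIST_WORKER` environment variable that pytest-xdist sets in each worker process. The helper name `worker_scoped_name` and the base names are hypothetical; the project's actual fixtures may differ.

```python
import os
from typing import Optional


def worker_scoped_name(base: str, worker_id: Optional[str] = None) -> str:
    """Suffix a resource name with the pytest-xdist worker ID, if there is one."""
    if worker_id is None:
        # pytest-xdist exports PYTEST_XDIST_WORKER (e.g. "gw0") in each worker;
        # the variable is unset when tests run in a single process.
        worker_id = os.environ.get("PYTEST_XDIST_WORKER", "")
    return f"{base}_{worker_id}" if worker_id else base


# Hypothetical base names for a test database and a schema registry prefix.
database_name = worker_scoped_name("app_test", "gw1")
subject_prefix = worker_scoped_name("test_subjects", "gw1")
```

Because each worker derives distinct names, parallel tests never write to the same database or register schemas under the same prefix.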
| 119 | + |
| 120 | +Coverage reports go to [Codecov](https://codecov.io/) for tracking over time. The workflow always collects container |
| 121 | +logs and Kubernetes events as artifacts, which helps debug failures without reproducing them locally. |
| 122 | + |
| 123 | +## Documentation |
| 124 | + |
| 125 | +The docs workflow builds this documentation site using [MkDocs](https://www.mkdocs.org/) with |
| 126 | +the [Material theme](https://squidfunk.github.io/mkdocs-material/). It triggers only when files under `docs/`, |
| 127 | +`mkdocs.yml`, or the workflow itself change, avoiding rebuilds for unrelated commits. |
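
The effect of the path filter can be approximated in Python with `fnmatch`. This is a sketch of the decision only: GitHub's own glob matching differs in detail from `fnmatch`, the `docs/**` pattern is an assumed spelling of "files under `docs/`", and the helper name `touches_docs` is hypothetical.

```python
from fnmatch import fnmatch

# Paths that should trigger the docs workflow, per the description above.
DOC_PATTERNS = ["docs/**", "mkdocs.yml", ".github/workflows/docs.yml"]


def touches_docs(changed_files: list[str]) -> bool:
    """Return True if any changed path matches a documentation pattern."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in DOC_PATTERNS
    )


docs_change = touches_docs(["docs/operations/cicd.md"])
backend_only_change = touches_docs(["backend/app/main.py"])
```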
| 128 | + |
| 129 | +Before building, the workflow fetches the current OpenAPI spec from the production API and injects it into the docs |
| 130 | +directory. The [swagger-ui-tag](https://github.com/blueswen/mkdocs-swagger-ui-tag) plugin renders this spec as an |
| 131 | +interactive API reference. |
| 132 | + |
| 133 | +On pushes to main, the workflow deploys the built site to GitHub Pages. Pull requests only build without deploying, so |
| 134 | +you can verify the build succeeds before merging. The deployment uses GitHub's native Pages action with artifact |
| 135 | +uploads, which handles cache invalidation and atomic deployments automatically. |
| 136 | + |
| 137 | +## Running locally |
| 138 | + |
| 139 | +You can run most checks locally before pushing. |
| 140 | + |
| 141 | +```bash |
| 142 | +cd backend |
| 143 | + |
| 144 | +# Linting |
| 145 | +uv run ruff check . |
| 146 | + |
| 147 | +# Type checking |
| 148 | +uv run mypy . |
| 149 | + |
| 150 | +# Security scan |
| 151 | +uv tool run bandit -r . -x tests/ -ll |
| 152 | + |
| 153 | +# Unit tests only (fast) |
| 154 | +uv run pytest tests/unit -v |
| 155 | + |
| 156 | +# Integration and unit tests (requires a running stack: docker compose up -d) |
| 157 | +uv run pytest tests/integration tests/unit -v |
| 158 | +``` |
| 159 | + |
| 160 | +To run the full integration suite, start the stack with `docker compose up -d`, wait for services to stabilize, |
| 161 | +then run pytest. The CI workflow's yq modifications aren't necessary locally, since your environment likely already |
| 162 | +has the expected configuration. |
| 163 | + |
| 164 | +## Workflow files |
| 165 | + |
| 166 | +| Workflow | File | Purpose | |
| 167 | +|---------------------|----------------------------------|------------------------------| |
| 168 | +| Ruff Linting | `.github/workflows/ruff.yml` | Code style and import checks | |
| 169 | +| MyPy Type Checking | `.github/workflows/mypy.yml` | Static type analysis | |
| 170 | +| Security Scanning | `.github/workflows/security.yml` | Bandit SAST | |
| 171 | +| Docker Build & Scan | `.github/workflows/docker.yml` | Image build and Trivy scan | |
| 172 | +| Integration Tests | `.github/workflows/tests.yml` | Full stack testing | |
| 173 | +| Documentation | `.github/workflows/docs.yml` | MkDocs build and deploy | |
| 174 | + |
| 175 | +All workflows use [uv](https://docs.astral.sh/uv/) for Python dependency management, with caching enabled via |
| 176 | +`astral-sh/setup-uv`. The lockfile at `backend/uv.lock` ensures reproducible installs across CI runs. |