Backend service for Hiilikartta / climate-map that calculates vegetation + soil carbon estimates for zoning-plan areas.
The service exposes a FastAPI HTTP API that accepts a zipped vector dataset (polygons), runs a PostGIS-backed calculation asynchronously, and stores results to a “state” Postgres database. The heavy spatial work (rasters, segment aggregation) is done against a separate PostGIS GIS database that already contains the required datasets.
- API (`app/main.py`, FastAPI): HTTP endpoints, plan persistence, serves results.
- Worker (`app/saq_worker.py`, SAQ): background jobs; calculates per-feature results and updates the state DB.
- Redis: SAQ job queue + distributed GIS throttling semaphore.
- State DB (Postgres): stores uploaded plans + calculation outputs (JSONB).
- GIS DB (external PostGIS): provides rasters/segments/regions needed by the calculation.
docker-compose.*.yml spins up everything except the GIS DB.
- Web/API: FastAPI, Uvicorn (dev) / Gunicorn+UvicornWorker (prod)
- Async DB: SQLAlchemy 2.x (async) + `asyncpg`
- Migrations: Alembic (+ `alembic-postgresql-enum`)
- Geo: GeoPandas + Shapely + Rasterio stack
- Async jobs: SAQ + Redis
- Auth: Zitadel token introspection via Authlib + Requests
Large-payload endpoints (`GET /calculation`, `GET /plan`, `GET /plan/external`) return gzip-compressed bodies (`Content-Encoding: gzip`). Many HTTP clients handle this automatically; with curl use `--compressed`.
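If your client does not decompress automatically, you must gunzip the body yourself before parsing it. A minimal stdlib sketch (the payload below is invented for illustration):

```python
import gzip
import json

# Simulate a gzip-compressed JSON body, as returned by the large-payload
# endpoints with "Content-Encoding: gzip".
payload = {"id": "demo", "calculation_status": "FINISHED"}
compressed = gzip.compress(json.dumps(payload).encode("utf-8"))

# Clients without automatic decompression (unlike `curl --compressed` or
# Python's requests) must gunzip before parsing.
decoded = json.loads(gzip.decompress(compressed).decode("utf-8"))
assert decoded["calculation_status"] == "FINISHED"
```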
`POST /calculation?id=<uuid>&visible_id=<string>&name=<string>`
- Body: `multipart/form-data` with field `file` (a zipped dataset readable by GeoPandas).
- Creates/updates a plan and enqueues a background job (`calculate_piece`).
- Auth is optional; if a valid token is provided, the plan is associated with that user.

`GET /calculation?id=<uuid>`
- Returns `202` while processing, `200` when finished, `206` if the plan ended in an error state.
- When finished: returns `data.totals` and `data.areas` (GeoJSON stored in DB).
`GET /plan/external?id=<uuid>`
- Public “share” endpoint: returns `{id, name, report_data?}`.
Authenticated (Zitadel) endpoints:
- `PUT /plan?id=<uuid>&visible_id=<string>&name=<string>` (upload/replace plan data)
- `GET /plan?id=<uuid>` (fetch a user’s plan)
- `DELETE /plan?id=<uuid>`
- `GET /user/plans`
FastAPI docs: GET /docs
- Upload a plan to `POST /calculation ...`
- Poll `GET /calculation?id=...` until `calculation_status == FINISHED`
- Parse `data.totals` + `data.areas`
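The poll step can be sketched as a small helper. This is illustrative only: the helper name and the `fetch` callable are assumptions of the sketch, and `fetch` must return the status code plus an already-decompressed body:

```python
import json
import time
from typing import Callable

def poll_calculation(
    fetch: Callable[[], tuple[int, bytes]],
    interval_s: float = 2.0,
    max_polls: int = 100,
) -> dict:
    """Poll until GET /calculation stops returning 202 (still processing).

    `fetch` performs one GET and returns (status_code, decompressed_body);
    200 means finished, 206 means the plan ended in an error state.
    """
    for _ in range(max_polls):
        status, body = fetch()
        if status != 202:
            return {"status": status, **json.loads(body)}
        time.sleep(interval_s)
    raise TimeoutError("calculation did not finish in time")

# Usage with a stub standing in for the real HTTP call:
responses = iter([(202, b"{}"), (200, b'{"calculation_status": "FINISHED"}')])
result = poll_calculation(lambda: next(responses), interval_s=0.0)
assert result["calculation_status"] == "FINISHED"
```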
Example (dev):

```shell
PLAN_ID=$(python -c 'import uuid; print(uuid.uuid4())')
curl --compressed -F "file=@tests/data/test-data-small-polygon.zip" \
  "http://localhost:8000/calculation?id=$PLAN_ID&visible_id=demo&name=Demo"
curl --compressed "http://localhost:8000/calculation?id=$PLAN_ID"
```

The current implementation is documented in `documentation/calculation.md` (“new model”). In short, for each polygon the calculator produces:
- vegetation + soil base stocks from PostGIS rasters (converted from tC/ha to tCO2),
- future deltas on existing land from segment variables + curve series,
- future deltas on changed land from annual sequestration coefficients (CSV),
- outputs for `nochange` vs `planned` scenarios for `current_year` and 2030..2095 (5-year steps).
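Converting a carbon stock from tC to tCO2 is ordinarily done with the stoichiometric molar-mass ratio; the exact constant the service applies lives in the calculator code and `documentation/calculation.md`. An illustrative sketch of the conversion:

```python
# Standard stoichiometric conversion from tonnes of carbon to tonnes of
# CO2: one mole of CO2 (44.01 g) contains one mole of C (12.01 g).
C_TO_CO2 = 44.01 / 12.01  # ~3.664

def tc_per_ha_to_tco2(tc_per_ha: float, area_ha: float) -> float:
    """Convert a raster stock in tC/ha over a polygon area to total tCO2."""
    return tc_per_ha * area_ha * C_TO_CO2

# 100 tC/ha over 2 ha -> roughly 733 tCO2
assert round(tc_per_ha_to_tco2(100.0, 2.0), 1) == 732.9
```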
From documentation/calculation.md:
- geometry: polygon/multipolygon (input assumed EPSG:4326; reprojected for area math)
- `zoning_code`: land-use code used for coefficient lookup
- optional land-cover shares (percentages) and soil-change factor; see the doc for defaults and accepted aliases
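To illustrate the expected per-feature shape, here is a hypothetical GeoJSON feature. The `zoning_code` value "AK" is made up; consult `documentation/calculation.md` for real codes, optional fields, and accepted aliases:

```python
import json

# Illustrative input feature; field names follow the spec above.
feature = {
    "type": "Feature",
    "geometry": {  # EPSG:4326 lon/lat; reprojected server-side for area math
        "type": "Polygon",
        "coordinates": [[[24.93, 60.17], [24.94, 60.17],
                         [24.94, 60.18], [24.93, 60.18], [24.93, 60.17]]],
    },
    "properties": {
        "zoning_code": "AK",  # land-use code for coefficient lookup (made up)
    },
}

# The feature round-trips through JSON, as it would inside a GeoJSON file.
assert json.loads(json.dumps(feature))["properties"]["zoning_code"] == "AK"
```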
GIS DB (PostGIS) tables/rasters (see documentation/calculation.md for details):
- `hiilikartta_kasvillisuudenhiili_2021_tcha`
- `hiilikartta_maaperanhiili_2023_tcha`
- `luke_mvmisegmentit_id_kokomaa`
- `luke_mvmisegmentit_muuttujat_kokomaa`
- `maakunta` (`geom`, `natcode`)
Repo data files (loaded on API startup via app/utils/data_loader.py):
- `data/BiomassCurves.txt`
- `data/SoilCurves.txt`
- `data/aluekertoimet.csv`
- `data/Hiilikartta_Kasvillisuuden_ja_maaperan_hiilensidonta_kayttotarkoitusluokittain.csv`
GIS operations are intentionally throttled to protect the GIS DB:
- Local (per-process) semaphore: `GIS_LOCAL_MAX_CONCURRENT`
- Distributed (cross-process) semaphore via Redis: `GIS_DISTRIBUTED_MAX_CONCURRENT`, `GIS_SLOT_TTL`
- Postgres `statement_timeout`: `GIS_STATEMENT_TIMEOUT_SECONDS`
When the GIS DB is at capacity, jobs are re-enqueued for a later attempt (`GisRetryLaterError`). If a single feature times out, the worker skips that feature and continues.
- Docker Engine / Docker Desktop + Compose v2 (`docker compose`)
- Access to a PostGIS GIS DB with the required datasets
One-time: create the external Docker network used by `docker-compose.dev.yml`:

```shell
docker network create climate-map-network
```

Create your local env file:

```shell
cp .env.template .env
```

At minimum you must set the GIS connection values: `GIS_PG_USER`, `GIS_PG_PASSWORD`, `GIS_PG_DB`, `GIS_PG_HOST`, `GIS_PG_PORT`.
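A hypothetical `.env` fragment (all values are placeholders; use your own GIS DB credentials):

```shell
# GIS DB connection (placeholder values)
GIS_PG_HOST=gis.example.org
GIS_PG_PORT=5432
GIS_PG_DB=gis
GIS_PG_USER=hiilikartta_ro
GIS_PG_PASSWORD=change-me
```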
Optional tuning:
- `SAQ_WORKERS_COUNT` (worker process count; the dev default is 3)
Safety rails:
- Dev containers refuse to start unless `STATE_PG_DB` contains `dev`.
- Tests refuse to run unless `STATE_PG_TEST_DB` contains `test` (tests run Alembic downgrade/upgrade against the test DB).
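The dev-DB rail amounts to a substring check before startup. A hypothetical sketch of the rule (the real guard lives in the entrypoint scripts, and the function name here is made up):

```python
def assert_dev_state_db(env: dict[str, str]) -> None:
    """Refuse to start against anything that does not look like a dev DB."""
    db = env.get("STATE_PG_DB", "")
    if "dev" not in db:
        raise SystemExit(f"refusing to start: STATE_PG_DB={db!r} is not a dev DB")

assert_dev_state_db({"STATE_PG_DB": "hiilikartta_dev"})  # passes silently
try:
    assert_dev_state_db({"STATE_PG_DB": "hiilikartta_prod"})
except SystemExit:
    pass  # guard fired, as intended
else:
    raise AssertionError("guard should have fired")
```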
For authenticated endpoints, also set:
`ZITADEL_DOMAIN`, `ZITADEL_CLIENT_ID`, `ZITADEL_CLIENT_SECRET`
```shell
docker compose up --build
```

Default URLs (from `.env.template`):
- API: `http://localhost:${APP_PORT}` (docs at `/docs`)
- Jupyter: `http://localhost:${NOTEBOOK_PORT}` (token: `NOTEBOOK_TOKEN`)
- SAQ Web UI: `http://localhost:${SAQ_WEB_PORT}`
The state DB schema is managed via Alembic (`alembic/`). Migrations will attempt to create the required `pgcrypto` extension (for `gen_random_uuid()`). If your DB role cannot create extensions, enable it once manually:
```shell
docker compose exec state-db-dev sh -lc \
  'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "CREATE EXTENSION IF NOT EXISTS \"pgcrypto\";"'
```

Then run migrations:

```shell
docker compose exec app-dev poetry run alembic upgrade head
```

`docker-compose.prod.yml` runs the API + worker + Redis and is designed to be attached to an existing reverse-proxy network (`proxy-net`) with Traefik.
To support running multiple stacks on the same Docker host (e.g. prod + test) without Redis cross-talk, the worker + Redis live on an internal per-stack network (`app-net`), and only the API is attached to `proxy-net`.

By default, prod containers refuse to start if the state DB is not at the latest Alembic revision. To run migrations automatically on API startup, set `STATE_DB_MIGRATION_MODE=upgrade` (the worker is check-only and never runs migrations).
Key env vars:
- `DOMAIN` (Traefik host rule)
- `APP_PORT` (host port for the API container)
- `REDIS_DATA_PATH` (Redis persistence path for prod)
- `STATE_DB_MIGRATION_MODE` (`check` to refuse start; `upgrade` to run `alembic upgrade head` on API startup)
- `app/`: application code
- `app/main.py`: FastAPI app + routes
- `app/saq_worker.py`: SAQ queue + worker functions
- `app/calculator/`: calculation implementation
- `app/db/`: async DB engines, GIS queries, state DB access, throttling
- `app/auth/`: Zitadel token introspection
- `alembic/`: Alembic migrations for the state DB
- `data/`: lookup tables + curve inputs used by the calculator
- `documentation/calculation.md`: authoritative calculation spec
- `docker-compose.*.yml`, `docker-entrypoint*.sh`: local/prod wiring
- `tests/`: integration/smoke tests
- `sql/`: reference SQL snippets (not the migration source of truth)
- Python: 3.11 + Poetry (`pyproject.toml`)
- Formatting: `poetry run black .`
- Types: keep/extend existing type hints; avoid introducing untyped public APIs where practical
- GIS DB safety: use the throttled helpers in `app/db/gis.py` (don’t open raw GIS sessions without a good reason)
`.devcontainer/devcontainer.json` uses `docker-compose.dev.yml` and starts: `app-dev`, `worker-dev`, `redis-dev`, `state-db-dev`, `state-db-test`.
It also configures common VS Code extensions for Python, Jupyter, Docker, and formatting.
Tests are few but cover the most important API flows:
- `tests/api/main_test.py`: calculation lifecycle + output checks + retry/timeout behaviors
- `tests/modules/db/test_gis.py`: smoke tests for GIS query helpers
Running tests requires:
- a running `state-db-test` (started by `docker-compose.dev.yml`), and
- a reachable GIS DB containing the required datasets (tests execute real GIS queries).
Run:

```shell
docker compose exec app-dev poetry run pytest
```