This document provides an overview of the SDEP (Single Digital Entry Point) technical architecture:
- Overview
- Technology Stack
- Directory Structure
- Backend Architecture
- Request Flow
- Key Endpoints
- Security
- Transaction Management
- Bulk updates
- Exception Handling
- Development Workflow
- Testing Strategy
- SQLite vs PostgreSQL
- Key Configuration Files
SDEP is a FastAPI-based REST API that enables:
- Competent Authorities (CA) to register regulated areas with geospatial data
- Short-Term Rental platforms (STR) to query regulated areas and submit rental activities
- Competent Authorities (CA) to query rental activities
- Compliance with EU Regulation 2024/1028
Production (NL): https://sdep.gov.nl/api/v0/docs
- This is the reference implementation for this repo
- Python: 3.13+
- Framework: FastAPI 0.115+
- ORM: SQLAlchemy 2.0+ (async)
- Migrations: Alembic
- Validation: Pydantic 2.10+
- Authentication: OAuth2 Client Credentials via Keycloak
- Server: Uvicorn
- Container Platform: Docker + Docker Compose
- Identity Provider: Keycloak (OAuth2/OIDC)
- Database: PostgreSQL 15+
- Package Manager: uv (Python)
- Linting: Ruff
- Type Checking: Pyright
- Testing: pytest (with pytest-asyncio, pytest-xdist for parallel execution)
- Pre-commit: Hooks for code quality
- CI/CD: GitLab CI or otherwise (out of scope for this project)
sdep-app/
├── backend/ # Python FastAPI application
│ ├── app/ # Application code
│ │ ├── api/ # API layer (routers, endpoints)
│ │ │ ├── common/ # Shared API components (routers, openapi, security)
│ │ │ │ ├── routers/ # API routers
│ │ │ │ │ ├── auth.py # Authentication router
│ │ │ │ │ ├── ca_activities.py # CA activity endpoints
│ │ │ │ │ ├── ca_areas.py # CA area endpoints
│ │ │ │ │ ├── health.py # Health check router
│ │ │ │ │ ├── ping.py # Ping endpoint
│ │ │ │ │ ├── str_activities.py # STR activity endpoints
│ │ │ │ │ └── str_areas.py # STR area endpoints
│ │ │ │ ├── exception_handlers.py
│ │ │ │ ├── openapi.py
│ │ │ │ └── security.py
│ │ │ ├── common_app.py # Version-independent sub-app (health)
│ │ │ └── v0/ # API version 0
│ │ │ ├── main.py # API v0 entry point
│ │ │ └── security.py # v0 security configuration
│ │ ├── crud/ # Database operations (CRUD)
│ │ │ ├── activity.py
│ │ │ ├── area.py
│ │ │ ├── competent_authority.py
│ │ │ └── platform.py
│ │ ├── db/ # Database configuration
│ │ │ └── config.py # Database session management
│ │ ├── exceptions/ # Custom exceptions
│ │ │ ├── auth.py # Authentication exceptions
│ │ │ ├── base.py # Base exception classes
│ │ │ ├── business.py # Business logic exceptions
│ │ │ ├── handlers.py # Exception handlers
│ │ │ ├── infrastructure.py # Infrastructure exceptions (DB, auth server)
│ │ │ └── validation.py # Validation exceptions
│ │ ├── models/ # SQLAlchemy ORM models
│ │ │ ├── activity.py
│ │ │ ├── address.py
│ │ │ ├── area.py
│ │ │ ├── audit_log.py # Audit log record
│ │ │ ├── competent_authority.py
│ │ │ ├── platform.py
│ │ │ └── temporal.py
│ │ ├── schemas/ # Pydantic schemas (request/response)
│ │ │ ├── activity.py
│ │ │ ├── area.py
│ │ │ ├── auth.py
│ │ │ ├── error.py
│ │ │ ├── health.py
│ │ │ └── validation.py
│ │ ├── security/ # Security utilities
│ │ │ ├── audit.py # Audit logging middleware
│ │ │ ├── audit_retention.py # Background audit log cleanup
│ │ │ ├── bearer.py # Bearer token handling
│ │ │ └── headers.py # Security headers
│ │ ├── services/ # Business logic layer
│ │ │ ├── activity.py
│ │ │ └── area.py
│ │ ├── config.py # Application configuration
│ │ └── main.py # Application entry point
│ ├── alembic/ # Database migrations
│ │ ├── env.py # Alembic environment config
│ │ └── versions/ # Migration scripts
│ │ ├── 001_initial.py # Initial migration
│ │ └── 002_audit_log.py # Audit log table
│ ├── tests/ # Unit tests (mirrors app/ structure)
│ │ ├── api/ # API layer tests
│ │ ├── crud/ # CRUD layer tests
│ │ ├── fixtures/ # Test fixtures and factories
│ │ ├── security/ # Security tests
│ │ ├── services/ # Service layer tests
│ │ └── conftest.py # pytest configuration
│ ├── alembic.ini # Alembic configuration
│ ├── Dockerfile # Backend container image
│ ├── Makefile # Backend-specific make targets
│ ├── pyproject.toml # Python project configuration (uv)
│ └── uv.lock # Locked dependencies
│
├── tests/ # Integration tests + performance tests
│ ├── lib/ # Test library utilities
│ │ └── create_fixture_areas.sh # Area fixture creation
│ ├── perf/ # Performance tests (Locust)
│ │ └── locustfile.py # Bulk activity load test
│ ├── test_auth_client.sh # OAuth2 token acquisition utility
│ ├── test_auth_credentials.sh # Test client credentials flow
│ ├── test_auth_headers.sh # Security headers compliance
│ ├── test_auth_unauthorized.sh # Test unauthorized access rejection
│ ├── test_ca_activities.sh # Test CA activity endpoints
│ ├── test_ca_areas.sh # Test CA area submission
│ ├── test_health_ping.sh # Health check tests
│ ├── test_str_activities.sh # Test STR activity submission
│ ├── test_str_activities_bulk.sh # Test STR bulk activity submission
│ └── test_str_areas.sh # Test STR area query endpoints
│
├── keycloak/ # Keycloak config
│ ├── add-realm-admin.sh # Create realm admin user
│ ├── add-realm-machine-clients.sh # Configure OAuth2 machine clients
│ ├── add-realm-roles.sh # Configure roles
│ ├── add-realm.sh # Initialize realm
│ ├── get-client-secret.sh # Retrieve client secret
│ ├── machine-clients.yaml # Machine client definitions (CA, STR)
│ ├── roles.yaml # Role definitions
│ └── wait.sh # Wait for Keycloak startup
│
├── postgres/ # PostgreSQL initialization
│ ├── clean-app.sql # Database cleanup
│ ├── clean-testrun.sql # Test run cleanup
│ ├── count-app.sql # Row count queries
│ ├── init-keycloak.sql # Keycloak database setup
│ └── init-app.sql # SDEP database setup
│
├── test-data/ # Test data for integration tests
│ ├── shapefiles/ # Shapefile test data (zipped)
│ ├── 01-competent-authority.sql # Competent authority fixtures
│ ├── 02-area-generated.sql # Generated area data
│ └── generate-area-sql.sh # Area data generator script
│
├── docs/ # Documentation
│ ├── API.md # API documentation
│ ├── ARCHITECTURE_TECH.md # Architecture overview (this file)
│ ├── DATAMODEL.md # Data model documentation
│ ├── INTEGRATION_TESTS.md # Integration test documentation
│ ├── LISTING_ACTIVITY.md # Activity listing documentation
│ ├── PERFORMANCE_TESTS.md # Performance test documentation
│ ├── PRE.md # Pre-conditions documentation
│ ├── SECURITY.md # Security documentation
│ ├── WOW.md # Ways of working
│ └── diagrams/ # Architecture and data model diagrams
│ ├── ACTIVITY.excalidraw
│ ├── ACTIVITY.svg
│ ├── DATAMODEL.excalidraw
│ ├── DATAMODEL.svg
│ ├── LISTING.excalidraw
│ └── LISTING.svg
│
├── scripts/ # Utility scripts
│ ├── run-tests.sh # Integration test runner
│ └── run-tests-perf.sh # Performance test runner (Locust)
│
├── .env # Environment variables
├── .gitignore # Git ignore rules
├── .gitlab-ci.yml # GitLab CI/CD configuration
├── CHANGELOG.md # Changelog
├── docker-compose.yml # Multi-container orchestration
├── LICENSE.md # EUPL License
├── Makefile # Root-level make targets
└── README.md # Quick start guide
The backend follows a layered architecture pattern:
- HTTP request/response handling
- Route definitions and parameter validation
- Authentication/authorization enforcement
- Transaction boundary via
get_async_dbdependency (auto-commit on success, rollback on exception)
- Pydantic models for request/response validation
- Data serialization/deserialization
- camelCase aliases for JSON API (e.g.
activityId,areaId,postCode) - Validation (Layer 1: type/format validation)
- Business logic implementation
- Validation (Layer 2: business rules, e.g. area exists, platform lookup/creation)
- Raises
ApplicationValidationErrorfor domain-level errors (e.g. area not found) - No transaction management (delegated to API layer)
- Database operations (Create, Read, Update, Delete)
- Data access abstraction
- SQLAlchemy query construction
- Uses flush (not commit) - defers transaction control to upper layers
- SQLAlchemy ORM models
- Database table definitions
- Relationships and constraints
- Includes
audit_log.pyfor the audit trail (see Security > Audit Logging)
For key patterns, see also Datamodel, Security, and API.
POST /str/activities (single JSON body)
│
├── API Layer (str_activities.py)
│ ├── verify_bearer_token() → auth checks (roles, claims)
│ ├── ActivityRequest (Pydantic) → syntax validation
│ ├── activity.to_service_dict(platform_id, platform_name)
│ └── get_async_db → auto-commit/rollback transaction
│
├── Service Layer (activity.py)
│ ├── create_activity(session, activity_data)
│ ├── Validate area exists → ApplicationValidationError if not
│ ├── Lookup/create platform from JWT claims
│ └── Create activity via CRUD
│
├── CRUD Layer (activity.py)
│ └── flush (not commit)
│
└── Response: 201 + ActivityResponse (camelCase JSON)
POST /str/activities/bulk (JSON body with activities array)
│
├── API Layer (str_activities_bulk.py)
│ ├── verify_bearer_token() → auth checks (roles, claims)
│ ├── BulkActivityRequest (Pydantic) → validates wrapper (min 1, max 1000)
│ └── get_async_db → auto-commit/rollback transaction
│
├── Service Layer (activity_bulk.py) — Application-First Validation
│ ├── Step 1: Per-item Pydantic validation via TypeAdapter
│ ├── Platform resolution (once per batch, version on name change only)
│ ├── Intra-batch dedup (last-wins)
│ ├── Step 2: RI check → single SELECT for area IDs → Python dict
│ ├── Activity versioning → batch UPDATE (mark-as-ended)
│ ├── Step 3: Bulk INSERT (single multi-row INSERT)
│ └── Step 4: Build per-item OK/NOK feedback
│
├── CRUD Layer (activity.py, area.py)
│ └── flush (not commit)
│
└── Response: 201 (all OK) / 200 (partial) / 422 (all NOK)
+ BulkActivityResponse (camelCase JSON)
POST /ca/areas (multipart/form-data: file + optional areaId, areaName)
│
├── API Layer (ca_areas.py)
│ ├── verify_bearer_token() → auth checks (roles, claims)
│ ├── File validation (max 1 MiB)
│ ├── areaId/areaName validation (pattern, length)
│ └── get_async_db → auto-commit/rollback transaction
│
├── Service Layer (area.py)
│ ├── create_area(session, area_id, area_name, filename, filedata, ca_id, ca_name)
│ ├── Lookup/create competent authority from JWT claims
│ └── Create area via CRUD
│
├── CRUD Layer (area.py)
│ └── flush (not commit)
│
└── Response: 201 + AreaResponse (camelCase JSON)
POST /api/v0/auth/token- OAuth2 token endpoint
POST /api/v0/ca/areas- Submit a single area (multipart/form-data: file + optional areaId, areaName)GET /api/v0/ca/areas- List own areas (pagination: offset, limit)GET /api/v0/ca/areas/count- Count own areasGET /api/v0/ca/areas/{areaId}- Download shapefile for own areaDELETE /api/v0/ca/areas/{areaId}- Delete (deactivate) an own areaGET /api/v0/ca/activities- Query rental activities (pagination: offset, limit)GET /api/v0/ca/activities/count- Count activities
GET /api/v0/str/areas- List regulated areas (pagination: offset, limit)GET /api/v0/str/areas/count- Count areasGET /api/v0/str/areas/{areaId}- Download shapefile for areaPOST /api/v0/str/activities- Submit a single activity (JSON body)POST /api/v0/str/activities/bulk- Submit up to 1000 activities in bulk (JSON body) — see API.md § Bulk endpoint
GET /api/health- Health check (unauthenticated)GET /api/v0/ping- Ping endpoint (authenticated)
- Protocol: OAuth2 Client Credentials flow
- Identity Provider: Keycloak
- Token Type: JWT Bearer tokens
- Roles:
sdep_ca- Competent Authority accesssdep_str- STR Platform accesssdep_read- Read operationssdep_write- Write operations
- JWT Claims:
client_id- Maps to platform/competent authority functional IDclient_name- Maps to platform/competent authority namerealm_access.roles- Role-based authorization
AuditLogMiddleware (security/audit.py) tracks all API requests to the audit_log table:
- Records: request ID, client ID, etc.
- Skips low-value paths (health, docs, root)
- Extracts JWT claims without verification (auth happens in route dependencies)
- Writes records asynchronously to avoid blocking responses; audit failures never break the request
Audit retention (security/audit_retention.py) runs a background cleanup loop (started via lifespan) that periodically deletes audit log rows older than the configured retention period (AUDITLOG_RETENTION setting), processing in batches of 1000.
For details, see Security.
SecurityHeadersMiddleware (security/headers.py) adds OWASP-recommended security headers to all responses:
X-Frame-Options: DENY— clickjacking protectionX-Content-Type-Options: nosniff— MIME-sniffing protectionContent-Security-Policy— XSS protection (optional, configurable)Strict-Transport-Security— HTTPS enforcement (optional, usually handled by reverse proxy)Cross-Origin-Opener-Policy,Cross-Origin-Resource-Policy— cross-origin isolationPermissions-Policy— restrict browser featuresReferrer-Policy: no-referrer— prevent information leakage- Strict
Cache-Controlfor sensitive endpoints (auth, activities, areas)
Starlette processes middleware LIFO (last added = outermost = runs first). In main.py:
- SecurityHeadersMiddleware (outermost) — added last, runs first
- AuditLogMiddleware (inner) — added first, runs inside security headers
Two session factories handle different operation types:
| Dependency | Session Type | Transaction | Used by |
|---|---|---|---|
get_async_db |
Write (autoflush=True) | Auto-commit on success, rollback on exception | POST endpoints |
get_async_db_read_only |
Read-only (autoflush=False) | No transaction overhead | GET endpoints |
POST endpoints use get_async_db which wraps the entire request in a single transaction. If any error occurs, the entire operation is rolled back. On success, the transaction is committed automatically.
Next to the single-record POST /str/activities, the bulk endpoint POST /str/activities/bulk is a complement for high-volume STR platforms.
At high volumes (500K–4M records/day, ~6–46 records/second average), PostgreSQL is not the bottleneck — a standard Postgres instance can process thousands of transactions per second. The actual bottlenecks are:
- Network latency — solved by batching 500–1000 items per API call
- Disk I/O (WAL pressure) — solved by multi-row
INSERT ... VALUESinstead of individual inserts
Five implementation strategies were evaluated:
| Option | Strategy | Validation | Mechanism | Verdict |
|---|---|---|---|---|
| 1 | Single, Sync | Direct | 1 request = 1 insert. Enormous network overhead. | Not recommended |
| 2 | Single, Async | Direct | await session.add(). No bulk advantage, high WAL pressure. |
N/A |
| 3 | Bulk, Sync | Direct (App) | API validates batch in Python. Writes "clean" data to DB in one go. | Best for direct feedback |
| 4a | Bulk, Async (Staging) | Deferred | API writes raw JSON to an unlogged Postgres table. Worker validates later. | Best without extra infra |
| 4b | Bulk, Async (Queue) | Deferred | API puts batch on Redis/Kafka. Workers validate and write. | Best for scalability |
Option 3 (Bulk, Sync) is the chosen approach. At this volume, the two async alternatives solve problems that do not apply here:
- Async with staging table (4a): defers validation to a background worker, which means the client does not get per-item OK/NOK feedback in the HTTP response. Adds operational complexity (worker process, polling/callback for results) without a performance need.
- Async with queue (4b): introduces additional infrastructure (Redis/Kafka + consumer workers). Justified only for extreme peak-absorption or cross-service fan-out, neither of which applies at this volume.
Synchronous bulk gives the client immediate, per-item feedback (OK/NOK with error reasons) in the same HTTP response, requires no extra infrastructure beyond the API and database, and keeps the architecture simple — validation and insert happen in one transaction with no background workers or message brokers.
Instead of having the database check each record via savepoints, errors are caught in the application layer:
- Application-level Pydantic validation is many times faster than database savepoints
- Horizontally scalable: add more API nodes under load
- Single reference query per batch (not per record)
- Only "clean" (validated) data reaches the database
- No savepoints or nested transactions needed, which avoids the overhead of extra database round-trips.
| Step | What | How |
|---|---|---|
| 1. Pydantic Check | Validate each item individually against the ActivityRequest schema. Mark failed items as NOK with the error reason. | TypeAdapter(ActivityRequest).validate_python() per item |
| 2. Referential Integrity Check | Fetch all referenced area IDs in a single query. Store in a Python dict for O(1) lookup. Items with unknown areaId → NOK. |
SELECT area_id, id FROM area WHERE area_id IN (...) |
| 3. Bulk Insert | Insert all remaining OK records in a single database operation. | session.execute(insert(Activity), list_of_valid_dicts) |
| 4. Feedback | Return per-item OK/NOK response preserving original order, enriched with status and errorMessage. |
JSON response with summary counts |
| HTTP Status | When |
|---|---|
| 201 Created | All items created successfully (failed == 0) |
| 200 OK | Partial success: some OK, some NOK (succeeded > 0 AND failed > 0) |
| 422 Unprocessable Content | All items failed validation (succeeded == 0) |
| # | Decision | Rationale |
|---|---|---|
| D1 | Per-item Pydantic validation — the request accepts raw dicts, each validated individually in the service layer | One invalid item should not block the other (999) items in the batch. If one item has a missing field, the rest are still processed. |
| D2 | Intra-batch duplicates: last-wins — when the same activityId appears multiple times in a single batch, only the last occurrence is processed; earlier occurrences receive NOK |
Deterministic and predictable for clients. Avoids ambiguity about which version "wins". |
| D3 | Versioning: batch UPDATE before INSERT — existing current versions in the database are marked as ended via a single batch UPDATE ... WHERE activity_id IN (...) before the bulk INSERT creates new versions |
Consistent with single-endpoint versioning semantics, but uses batch operations (1 UPDATE + 1 INSERT) instead of per-item queries. |
| D4 | Platform resolution: version only on name change — platform is resolved once per batch; a new version is only created if the JWT claim (client_name) has changed |
Avoids unnecessary versioning churn when the same platform submits many batches with unchanged credentials. |
| D5 | Deactivated entities rejected — if an activityId has been deactivated (all versions have endedAt set), submitting it again is rejected (NOK) |
Prevents "resurrecting" soft-deleted entities. Consistent with single endpoint behavior. |
| D6 | No ON CONFLICT DO NOTHING — SDEP uses explicit versioning (mark-as-ended + new insert) instead of database-level upsert |
ON CONFLICT DO NOTHING is a general best practice for idempotency in bulk inserts. However, SDEP's data model requires explicit versioning with endedAt timestamps. |
| D7 | Single transaction scope — the entire bulk operation runs in a single transaction; if the bulk INSERT fails, all changes roll back | No partial database state. Consistent with the single endpoint's get_async_db auto-commit/rollback model. |
| D8 | SQLite compatibility — the bulk INSERT and all queries work on both PostgreSQL and SQLite | Unit tests run on SQLite in-memory without requiring PostgreSQL. The StringArray TypeDecorator handles dialect differences. |
All exceptions are handled by global exception handlers defined in app/exceptions/handlers.py and registered in app/api/common/exception_handlers.py. For the complete list of HTTP status codes used by the API, see HTTP status codes.
The table below shows how application exceptions map to HTTP status codes:
| HTTP Status | Exception | Description |
|---|---|---|
| 400 | RequestValidationError |
Invalid query parameters on a GET request (e.g. offset=-1 or limit=abc) |
| 400 / 401 / 403 / 404 / 422 | HTTPException |
Missing/invalid token claims, missing roles, missing credentials, inline input validation, resource not found |
| 401 | InvalidTokenError |
Invalid token (subtype of AuthenticationError) |
| 401 | AuthenticationError |
Invalid or expired token |
| 403 | AuthorizationError |
Insufficient permissions |
| 404 | ResourceNotFoundError |
Resource not found |
| 409 | DuplicateResourceError |
Duplicate resource conflict |
| 422 | RequestValidationError |
Invalid request body on a POST request (e.g. missing required field or wrong value type) |
| 422 | ApplicationValidationError |
Business rule violations (e.g. start time later than end time is NOK ) |
| 500 | Exception |
Catch-all (unexpected code failure) |
| 503 | DatabaseOperationalError |
Database temporarily unavailable |
| 503 | AuthorizationServerOperationalError |
Authorization server temporarily unavailable |
See makefile help
make
- pytest with parallel execution (
-n auto) - Async test support
- Fixtures for database and authentication
- Code coverage tracking
- Run:
cd backend && make test
- Shell scripts using curl
- Test OAuth2 flows
- Test API endpoints with single-item and bulk POST payloads
- Test security headers (OWASP compliance)
- Test validation (Pydantic + business logic)
- Run:
make test - See Integration Tests for detailed test documentation
- Locust-based load testing for the bulk activity endpoint (
POST /str/activities/bulk) - Measures throughput (activities/sec), extrapolates capacity (activities/day), compares against configurable target
- Uses isolated test data (
sdep-test-perf-*prefix) by default; optionally keeps data in database (PERF_KEEP_DATA=true) - Configurable:
PERF_ACTIVITIES_TARGET(total target volume),PERF_USERS(concurrent users to reach target),PERF_MAX_DURATION_SECONDS,PERF_BATCH_SIZE,PERF_KEEP_DATA - Run:
make test-perf(or e.g.make test-perf PERF_USERS=5 PERF_ACTIVITIES_TARGET=1000000 PERF_MAX_DURATION_SECONDS=10) - See Performance Tests for detailed documentation
Unit tests (backend/tests/) automatically switch to an in-memory SQLite database (sqlite+aiosqlite:///:memory:) when no DATABASE_URL environment variable is set. This lets developers run unit tests without PostgreSQL installed or running.
Integration tests (tests/) and Production both use PostgreSQL (postgresql+asyncpg) configured via environment variables (POSTGRES_HOST, POSTGRES_PORT, POSTGRES_DB_NAME, etc.).
| Production | Integration tests (tests/) |
Unit tests (backend/tests/) |
|
|---|---|---|---|
| Database | PostgreSQL | PostgreSQL | SQLite (in-memory) |
| Trigger | always | always | DATABASE_URL not set |
| Persistence | persistent | persistent | ephemeral (per test) |
| Dependency | asyncpg |
asyncpg |
aiosqlite (dev only) |
Because SQLite lacks some PostgreSQL features, the models include dialect adaptors:
StringArray(backend/app/models/activity.py) — usesARRAY(String)on PostgreSQL and JSON-serializedTexton SQLiteCheckConstraint— marked.ddl_if(dialect="postgresql")so they are only applied to PostgreSQL
.env- Environment variables (database, keycloak, backend config)docker-compose.yml- Container orchestrationbackend/pyproject.toml- Python dependencies and tool configurationbackend/alembic.ini- Database migration configurationkeycloak/machine-clients.yaml- Test machine client definitions (OAuth2)keycloak/roles.yaml- Test role definitionsMakefile- Development automation