|
1 | 1 | # EventGate |
2 | | -Python lambda for sending well-defined messages to confluent kafka |
3 | | -assumes AWS Deployment with API Gateway exposure of endpoint |
| 2 | + |
| 3 | +Python AWS Lambda that exposes a simple HTTP API (via API Gateway) for validating and forwarding well-defined JSON messages to multiple backends (Kafka, EventBridge, Postgres). Designed for centralized, schema-governed event ingestion with pluggable writers. |
| 4 | + |
| 5 | +> Status: Internal prototype / early version |
4 | 6 |
|
5 | 7 | <!-- toc --> |
6 | | -- [Lambda itself](#lambda-itself) |
| 8 | +- [Overview](#overview) |
| 9 | +- [Features](#features) |
| 10 | +- [Architecture](#architecture) |
7 | 11 | - [API](#api) |
8 | | -- [Config](#config) |
9 | | -- [Terraform Deplyoment](#terraform-deplyoment) |
| 12 | +- [Configuration](#configuration) |
| 13 | +- [Deployment](#deployment) |
| 14 | + - [Zip Lambda Package](#zip-lambda-package) |
| 15 | + - [Container Image Lambda](#container-image-lambda) |
| 16 | +- [Local Development & Testing](#local-development--testing) |
| 17 | +- [Security & Authorization](#security--authorization) |
| 18 | +- [Writers](#writers) |
| 19 | + - [Kafka Writer](#kafka-writer) |
| 20 | + - [EventBridge Writer](#eventbridge-writer) |
| 21 | + - [Postgres Writer](#postgres-writer) |
10 | 22 | - [Scripts](#scripts) |
| 23 | +- [Troubleshooting](#troubleshooting) |
| 24 | +- [License](#license) |
11 | 25 | <!-- tocstop --> |
12 | 26 |
|
13 | | -## Lambda itself |
14 | | -Hearth of the solution lives in the Src folder |
| 27 | +## Overview |
| 28 | +EventGate receives JSON payloads for registered topics, authorizes the caller via JWT, validates the payload against a JSON Schema, and then forwards the payload to one or more configured sinks (Kafka, EventBridge, Postgres). Schemas and access control are externally configurable (local file or S3) to allow runtime evolution without code changes. |
| 29 | + |
| 30 | +## Features |
| 31 | +- Topic registry with per-topic JSON Schema validation |
| 32 | +- Multiple parallel writers (Kafka / EventBridge / Postgres) — failure in one does not block the others; aggregated error reporting |
| 33 | +- JWT-based per-topic authorization (RS256 public key fetched remotely) |
| 34 | +- Runtime-configurable access rules (local or S3) |
| 35 | +- API-discoverable schema catalogue |
| 36 | +- Pluggable writer initialization via `config.json` |
| 37 | +- Terraform IaC examples for AWS deployment (API Gateway + Lambda) |
| 38 | +- Supports both Zip-based and Container Image Lambda packaging (Container path enables custom `librdkafka` / SASL_SSL / Kerberos builds) |
| 39 | + |
| 40 | +## Architecture |
| 41 | +High-level flow: |
| 42 | +1. Client requests a JWT from an external token provider (link exposed via `/token`). |
| 43 | +2. Client submits `POST /topics/{topicName}` with `Authorization: Bearer <JWT>` header and JSON body. |
| 44 | +3. Lambda resolves topic schema, validates payload, authorizes subject (`sub`) against access map. |
| 45 | +4. Writers invoked (Kafka, EventBridge, Postgres). Each returns success/failure. |
| 46 | +5. Aggregated response returned: `202 Accepted` if all succeed; `500` with per-writer error list otherwise. |
| 47 | + |
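Steps 4 and 5 of the flow can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the `Writer` signature, the `dispatch` function, and the two sample writers are hypothetical names introduced here, and the writers are called one after another rather than in parallel.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical writer signature: takes (topic, message), returns (ok, error_text).
Writer = Callable[[str, dict], Tuple[bool, str]]


def dispatch(topic: str, message: dict, writers: Dict[str, Writer]) -> dict:
    """Call every writer, collect failures, and build one aggregated response."""
    errors: List[dict] = []
    for name, write in writers.items():
        try:
            ok, error = write(topic, message)
        except Exception as exc:  # a crashing writer must not block the others
            ok, error = False, str(exc)
        if not ok:
            errors.append({"writer": name, "message": error})
    if errors:
        return {"statusCode": 500, "errors": errors}
    return {"statusCode": 202}


# Stand-in writers used only to exercise the sketch.
def ok_writer(topic: str, message: dict) -> Tuple[bool, str]:
    return True, ""


def failing_writer(topic: str, message: dict) -> Tuple[bool, str]:
    return False, "broker unreachable"
```

Because every writer is attempted before the response is built, a caller receiving `500` can see exactly which sinks failed, while the remaining sinks may already have accepted the message.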
| 48 | +Key files: |
| 49 | +- `src/event_gate_lambda.py` – main Lambda handler and routing |
| 50 | +- `conf/*.json` – configuration and topic schemas |
| 51 | +- `conf/api.yaml` – OpenAPI 3 definition served at `/api` |
| 52 | +- `writer_*.py` – individual sink implementations |
15 | 53 |
|
16 | 54 | ## API |
17 | | -POST 🔒 method is guarded by JWT token in standard header "bearer" |
18 | | - |
19 | | -| Method | Endpoint | Info | |
20 | | -|---------|-----------------------|------------------------------------------------------------------------------| |
21 | | -| GET | `/api` | OpenAPI 3 definition | |
22 | | -| GET | `/token` | forwards (HTTP303) caller to where to obtain JWT token for posting to topic | |
23 | | -| GET | `/topics` | lists available topics | |
24 | | -| GET | `/topics/{topicName}` | schema for given topic | |
25 | | -| POST 🔒 | `/topics/{topicName}` | posts payload (after authorization and schema validation) into kafka topic | |
26 | | -| POST | `terminate` | kills lambda - useful for when forcing config reload is desired | |
27 | | - |
28 | | - |
29 | | -## Config |
30 | | -There are 3 configs for this solution (in conf folder) |
31 | | - |
32 | | - - config.json |
33 | | - - this one needs to live in the conf folder |
34 | | - - defines where are other resources/configs |
35 | | - - for SASL_SSL also points to required secrets |
36 | | - - access.json |
37 | | - - this one could be local or in AWS S3 |
38 | | - - defines who has access to post to individual topics |
39 | | - - topics.json |
40 | | - - this one could be local or in AWS S3 |
41 | | - - defines schema of the topics, as well as enumerates those |
42 | | - |
43 | | -## Terraform Deplyoment |
44 | | - |
45 | | -Whole solution expects to be deployed as lambda in AWS, |
46 | | -there are prepared terraform scripts to make initial deplyoment, and can be found in "terraform" folder |
47 | | - |
48 | | -### Zip lambda |
49 | | - |
50 | | -Designated for use without authentication towards kafka |
51 | | - |
52 | | - - create **zip** archive `scripts/prepare.deplyoment.sh` |
53 | | - - upload **zip** to **S3** |
54 | | - - provide terraform variables with tfvars |
55 | | - - `aws_region` |
56 | | - - `vpc_id` |
57 | | - - `vpc_endpoint` |
58 | | - - `resource prefix` |
59 | | - - all terraform resources would be prefixed my this |
60 | | - - `lambda_role_arn ` |
61 | | - - the role for the lambda |
62 | | - - should be able to make HTTP calls to wherever kafka server lives |
63 | | - - `lambda_vpc_subnet_ids` |
64 | | - - `lambda_package_type` |
65 | | - - `Zip` |
66 | | - - `lambda_src_s3_bucket ` |
67 | | - - the bucket where **zip** is already uploaded |
68 | | - - `lambda_src_s3_key` |
69 | | - - name of already uploaded **zip** |
70 | | - - `lambda_src_ecr_image` |
71 | | - - ignored |
72 | | - - `terraform apply` |
73 | | - |
74 | | -### Containerized lambda |
75 | | - |
76 | | -Designated for use with kerberizes SASL_SSL authentication towards kafka, as it requires custom librdkafka compilation |
77 | | - |
78 | | - - build docker (**[follow comments at the top of Dockerfile](./Dockerfile)**) |
79 | | - - upload docker **image** to **ECR** |
80 | | - - provide terraform variables with tfvars |
81 | | - - `aws_region` |
82 | | - - `vpc_id` |
83 | | - - `vpc_endpoint` |
84 | | - - `resource prefix` |
85 | | - - all terraform resources would be prefixed my this |
86 | | - - `lambda_role_arn ` |
87 | | - - the role for the lambda |
88 | | - - should be able to make HTTP calls to wherever kafka server lives |
89 | | - - `lambda_vpc_subnet_ids` |
90 | | - - `lambda_package_type` |
91 | | - - `Image` |
92 | | - - `lambda_src_s3_bucket ` |
93 | | - - ignored |
94 | | - - `lambda_src_s3_key` |
95 | | - - ignored |
96 | | - - `lambda_src_ecr_image` |
97 | | - - already uploaded **image** in **ECR** |
98 | | - - `terraform apply` |
| 55 | +All responses are JSON unless otherwise noted. `POST /topics/{topicName}` requires a valid JWT. |
| 56 | + |
| 57 | +| Method | Endpoint | Auth | Description | |
| 58 | +|--------|----------|------|-------------| |
| 59 | +| GET | `/api` | none | Returns OpenAPI 3 definition (raw YAML) | |
| 60 | +| GET | `/token` | none | 303 redirect to external token provider | |
| 61 | +| GET | `/topics` | none | Lists available topic names | |
| 62 | +| GET | `/topics/{topicName}` | none | Returns JSON Schema for the topic | |
| 63 | +| POST | `/topics/{topicName}` | JWT | Validates + forwards message to configured sinks | |
| 64 | +| POST | `/terminate` | (internal) | Forces Lambda process exit (used to trigger cold start & config reload) | |
| 65 | + |
| 66 | +Status codes: |
| 67 | +- 202 – Accepted (all writers succeeded) |
| 68 | +- 400 – Schema validation failure |
| 69 | +- 401 – Token missing/invalid |
| 70 | +- 403 – Subject unauthorized for topic |
| 71 | +- 404 – Unknown topic or route |
| 72 | +- 500 – One or more writers failed / internal error |
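As an illustration of the client side, the following sketch prepares a `POST /topics/{topicName}` request with the Python standard library. The base URL, topic name, token, and payload are placeholders, and `build_post_request` is a hypothetical helper, not part of EventGate.

```python
import json
import urllib.request


def build_post_request(base_url: str, topic: str, token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST /topics/{topicName} request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/topics/{topic}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with a real endpoint, topic, and token.
request = build_post_request("https://api.example.invalid", "my.topic", "<JWT>", {"id": 1})
# To send: urllib.request.urlopen(request) returns on 2xx and raises
# urllib.error.HTTPError for the 4xx/5xx status codes listed above.
```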
| 73 | + |
| 74 | +## Configuration |
| 75 | +All core runtime configuration is driven by JSON files located in `conf/` unless S3 paths are specified. |
| 76 | + |
| 77 | +Primary file: `conf/config.json` |
| 78 | +Example (sanitized): |
| 79 | +```json |
| 80 | +{ |
| 81 | + "access_config": "s3://<bucket>/access.json", |
| 82 | + "token_provider_url": "https://<token-ui.example>", |
| 83 | + "token_public_key_url": "https://<token-api.example>/public-key", |
| 84 | + "kafka_bootstrap_server": "broker1:9092,broker2:9092", |
| 85 | + "event_bus_arn": "arn:aws:events:region:acct:event-bus/your-bus" |
| 86 | +} |
| 87 | +``` |
| 88 | +Supporting configs: |
| 89 | +- `access.json` – map: topicName -> array of authorized subjects (JWT `sub`). May reside locally or at S3 path referenced by `access_config`. |
| 90 | +- `topic_*.json` – each file contains a JSON Schema for a topic. In the current code these are explicitly loaded inside `event_gate_lambda.py`. (Future enhancement: auto-discover or index file.) |
| 91 | +- `api.yaml` – OpenAPI spec served verbatim at runtime. |
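Since `access_config` may point either at a local file or at an S3 object, a loader has to tell the two apart. The sketch below shows one way to do that; `split_config_location` is a hypothetical helper, and the actual loading logic in `event_gate_lambda.py` may differ.

```python
from urllib.parse import urlparse


def split_config_location(location: str) -> tuple:
    """Classify a config reference as ("s3", bucket, key) or ("local", path)."""
    parsed = urlparse(location)
    if parsed.scheme == "s3":
        # s3://bucket/some/key -> bucket in netloc, key in path without the leading slash
        return ("s3", parsed.netloc, parsed.path.lstrip("/"))
    return ("local", location)


remote = split_config_location("s3://my-bucket/conf/access.json")
local = split_config_location("conf/access.json")
```

The S3 branch would then typically fetch the object with an AWS SDK call, while the local branch opens the file directly.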
| 92 | + |
| 93 | +Environment variables: |
| 94 | +- `LOG_LEVEL` (optional) – defaults to `INFO`. |
| 95 | + |
| 96 | +## Deployment |
| 97 | +Infrastructure-as-Code examples live in the `terraform/` directory. Variables are supplied via a `*.tfvars` file or CLI. |
| 98 | + |
| 99 | +### Zip Lambda Package |
| 100 | +Use when no custom native libraries are needed. |
| 101 | +1. Run packaging script: `scripts/prepare.deplyoment.sh` (downloads deps + zips sources & config) |
| 102 | +2. Upload resulting zip to S3 |
| 103 | +3. Provide Terraform variables: |
| 104 | + - `aws_region` |
| 105 | + - `vpc_id` |
| 106 | + - `vpc_endpoint` |
| 107 | + - `resource_prefix` (prepended to created resource names) |
| 108 | + - `lambda_role_arn` |
| 109 | + - `lambda_vpc_subnet_ids` |
| 110 | + - `lambda_package_type = "Zip"` |
| 111 | + - `lambda_src_s3_bucket` |
| 112 | + - `lambda_src_s3_key` |
| 113 | +4. `terraform apply` |
| 114 | + |
| 115 | +### Container Image Lambda |
| 116 | +Use when Kafka access needs Kerberos / SASL_SSL or custom `librdkafka` build. |
| 117 | +1. Build image (see comments at top of `Dockerfile`) |
| 118 | +2. Push to ECR |
| 119 | +3. Terraform variables: |
| 120 | + - Same networking / role vars as above |
| 121 | + - `lambda_package_type = "Image"` |
| 122 | + - `lambda_src_ecr_image` (ECR image reference) |
| 123 | +4. `terraform apply` |
| 124 | + |
| 125 | +## Local Development & Testing |
| 126 | +Install Python tooling (Python 3.11 suggested), then run: |
| 127 | +```shell |
| 128 | +python -m venv .venv |
| 129 | +source .venv/bin/activate |
| 130 | +pip install -r requirements.txt |
| 131 | +pytest -q |
| 132 | +``` |
| 133 | +To invoke handler manually: |
| 134 | +```python |
| 135 | +from src import event_gate_lambda as m |
| 136 | +# craft an event similar to API Gateway proxy integration |
| 137 | +``` |
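A test event can be assembled along these lines. The field names follow the API Gateway REST proxy integration event format; which of them the handler actually reads is an assumption, as is the handler name in the final comment. `make_proxy_event` is a hypothetical helper.

```python
import json


def make_proxy_event(method, resource, topic=None, body=None, token=None):
    """Build a minimal dict shaped like an API Gateway REST proxy event."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return {
        "httpMethod": method,
        "resource": resource,
        "path": resource.replace("{topicName}", topic or ""),
        "headers": headers,
        "pathParameters": {"topicName": topic} if topic else None,
        # API Gateway delivers the request body as a string, not a parsed object.
        "body": json.dumps(body) if body is not None else None,
        "isBase64Encoded": False,
    }


event = make_proxy_event("POST", "/topics/{topicName}", topic="my.topic", body={"id": 1}, token="<JWT>")
# response = m.lambda_handler(event, None)  # handler name is an assumption
```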
| 138 | + |
| 139 | +## Security & Authorization |
| 140 | +- JWTs must be RS256-signed; the public key is fetched at cold start from `token_public_key_url` (base64-encoded DER inside a JSON body of the form `{ "key": "..." }`). |
| 141 | +- Subject claim (`sub`) is matched against `ACCESS[topicName]`. |
| 142 | +- Authorization header forms accepted: |
| 143 | + - `Authorization: Bearer <token>` (preferred) |
| 144 | + - Legacy: `bearer: <token>` custom header |
| 145 | +- No token introspection beyond signature & standard claim extraction. |
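The header handling and the subject check described above might look roughly like this. Both functions are hypothetical sketches rather than the project's code, and signature verification (RS256 against the fetched public key) is deliberately left out.

```python
from typing import Optional


def extract_token(headers: dict) -> Optional[str]:
    """Return the raw JWT from either accepted header form, or None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    scheme, _, credentials = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and credentials.strip():
        return credentials.strip()
    # Legacy form: the token is carried directly in a custom `bearer` header.
    legacy = lowered.get("bearer", "").strip()
    return legacy or None


def is_authorized(access: dict, topic: str, subject: str) -> bool:
    """access maps topicName -> list of allowed JWT `sub` values."""
    return subject in access.get(topic, [])
```

A missing token would map to `401`, and a verified token whose `sub` is not listed for the topic would map to `403`.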
| 146 | + |
| 147 | +## Writers |
| 148 | +Each writer is initialized during cold start. Failures are isolated: if any writer fails, the errors from all failed writers are aggregated and returned in a single `500` response. |
| 149 | + |
| 150 | +### Kafka Writer |
| 151 | +Configured via `kafka_bootstrap_server`. (Future: support auth properties / TLS configuration.) |
| 152 | + |
| 153 | +### EventBridge Writer |
| 154 | +Publishes events to the configured `event_bus_arn` using the EventBridge `PutEvents` API. |
| 155 | + |
| 156 | +### Postgres Writer |
| 157 | +Example writer demonstrating the extensibility pattern (currently a placeholder when no DSN is configured). |
99 | 158 |
|
100 | 159 | ## Scripts |
101 | | -Useful scripts for dev and Deployment |
| 160 | +- `scripts/prepare.deplyoment.sh` – build Zip artifact for Lambda (typo in name retained for now; may rename later) |
| 161 | +- `scripts/notebook.ipynb` – exploratory invocation cells per endpoint |
| 162 | +- `scripts/get_token.http` – sample HTTP request for tooling (e.g., VSCode REST client) |
| 163 | + |
| 164 | +## Troubleshooting |
| 165 | +| Symptom | Possible Cause | Action | |
| 166 | +|---------|----------------|--------| |
| 167 | +| 401 Unauthorized | Missing / malformed token header | Ensure the `Authorization: Bearer <token>` header is present | |
| 168 | +| 403 Forbidden | Subject not listed in access map | Update `access.json` and redeploy / reload | |
| 169 | +| 404 Topic not found | Wrong casing or not loaded in code | Verify loaded topics & file names | |
| 170 | +| 500 Writer failure | Downstream (Kafka / EventBridge / DB) unreachable | Check network / VPC endpoints / security groups | |
| 171 | +| Lambda keeps old config | Warm container | Call `/terminate` (internal) to force cold start | |
| 172 | + |
| 173 | +## License |
| 174 | +Licensed under the Apache License, Version 2.0. See the [LICENSE](./LICENSE) file for full text. |
| 175 | + |
| 176 | +Copyright 2025 ABSA Group Limited. |
102 | 177 |
|
103 | | -### Notebook |
104 | | -Jupyter notebook, with one cell for lambda initialization and one cell per method, for testing purposes |
105 | | -Obviously using it requires correct configs to be in place (PUBLIC key is being loaded during initilization) |
| 178 | +You may not use this project except in compliance with the License. Unless required by law or agreed in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. |
106 | 179 |
|
107 | | -### Preapare Deployment |
108 | | -Shell script for fetching python requirements and ziping it together with sources and config into lambda archive |
109 | | -it needs to be uploaded to s3 bucket first before running the terraform. |
| 180 | +--- |
| 181 | +Contributions & enhancements welcome (subject to project guidelines). |