This project is a demonstration ETL pipeline built with Prefect 3, using PostgreSQL as the target database, MinIO (S3-compatible object storage) for intermediate data, and caching for API calls.
The pipeline fetches weather data from a public API, stores the raw payload in S3, transforms it into a structured DataFrame, and finally loads it into PostgreSQL.
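The transform step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the payload shape (an Open-Meteo-style `hourly` block) and the field names are assumptions.

```python
from datetime import datetime

def transform_weather(raw: dict) -> list[dict]:
    """Flatten an hourly weather payload into row dicts ready to be
    loaded into a DataFrame / PostgreSQL table.
    (Payload shape and field names are assumptions, not this repo's schema.)"""
    hourly = raw["hourly"]
    rows = []
    for ts, temp in zip(hourly["time"], hourly["temperature_2m"]):
        rows.append({
            "observed_at": datetime.fromisoformat(ts),
            "temperature_c": temp,
        })
    return rows

# Example usage with a tiny fake payload:
raw = {"hourly": {"time": ["2024-01-01T00:00", "2024-01-01T01:00"],
                  "temperature_2m": [1.5, 1.2]}}
rows = transform_weather(raw)
```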
- Prefect 3 orchestration for tasks and flow management
- S3 (MinIO) integration for intermediate data storage
- SQLAlchemy connection to PostgreSQL
- Caching of API calls to avoid redundant requests
- Automated credentials seeding and deployment creation
- Dockerized for local or remote execution
- Docker + Docker Compose
- Python 3.12 + uv (for local runs)
- Create a `.env` file (see the example in the repo).
- Start all containers and the worker:
  ```bash
  make up
  ```
- Access:
  - Prefect UI → http://localhost:4200
  - MinIO Console → http://localhost:9001
- Seed Prefect blocks:
  ```bash
  uv run seed_credentials.py
  ```
- Deploy the flow:
  ```bash
  uv run deploy.py
  ```
- Create the work pool (if it does not exist yet):
  ```bash
  prefect work-pool create 'basic-pipe' --type process
  ```
- Start the worker:
  ```bash
  prefect worker start --pool "basic-pipe"
  ```
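The seeding and deployment scripts read their settings from `.env`. A minimal sketch of how such settings can be assembled from environment variables (the variable names and defaults below are assumptions — check the `.env` example in the repo):

```python
import os
from dataclasses import dataclass

@dataclass
class MinioSettings:
    endpoint_url: str
    access_key: str
    secret_key: str
    bucket: str

def load_minio_settings(env=os.environ) -> MinioSettings:
    # Variable names are illustrative, not necessarily the ones this repo uses.
    return MinioSettings(
        endpoint_url=env.get("MINIO_ENDPOINT_URL", "http://localhost:9000"),
        access_key=env.get("MINIO_ROOT_USER", "minioadmin"),
        secret_key=env.get("MINIO_ROOT_PASSWORD", "minioadmin"),
        bucket=env.get("MINIO_BUCKET", "etl-raw"),
    )

# Example: no overrides, so every field falls back to its default.
settings = load_minio_settings({})
```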
- The MinIO bucket defined in `.env` is created automatically on startup.
- ‼️ Before the first run, open the Prefect UI → Blocks → “AWS Credentials”, edit the block, and manually set the `endpoint_url` for MinIO.
- Environment variables (API URL, credentials, DB, etc.) are managed via `.env` and loaded by `docker-compose`.
- The Prefect API URL inside Docker is `http://prefect-server:4200/api`.
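Because the Prefect API URL differs between the compose network (`http://prefect-server:4200/api`) and the host (`http://localhost:4200/api`), a script that runs in both places can derive it from the environment. A hedged sketch — the `IN_DOCKER` flag is an assumption for illustration; in this repo the value is supplied via `.env` and `docker-compose`:

```python
import os

def prefect_api_url(env=os.environ) -> str:
    """Pick the Prefect API URL for the current environment.
    Prefers an explicit PREFECT_API_URL; otherwise falls back based on an
    illustrative IN_DOCKER flag (an assumption, not part of the repo)."""
    explicit = env.get("PREFECT_API_URL")
    if explicit:
        return explicit
    if env.get("IN_DOCKER") == "1":
        return "http://prefect-server:4200/api"
    return "http://localhost:4200/api"

# Example: inside the compose network
url = prefect_api_url({"IN_DOCKER": "1"})
```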
| Command | Description |
|---|---|
| `make up` | Start all containers (Prefect + worker) |
| `make down` | Stop and remove all containers |
| `make seed` | Seed Prefect credentials locally |
| `make deploy` | Create Prefect deployment |
| `make worker` | Run local Prefect worker |
| `make pool` | Create Prefect work pool if missing |
- Prefect UI: http://localhost:4200
- MinIO Console: http://localhost:9001
- MinIO API Endpoint (local): http://localhost:9000
- MinIO API Endpoint (docker): http://minio:9000
