inner-outer-space/de-zoomcamp-2024

DataTalks Data Engineering ZoomCamp

This repo contains my notes and homework assignments for the 2024 DataTalks Data Engineering Zoomcamp, plus additional notes on orchestrators from other years: Airflow (2022), Prefect (2023), and Kestra (2025). The final project for the 2024 course can be found in this repo: NYC Collisions Analytics.

NOTES

MODULE 1: Docker & SQL, Terraform & GCP
MODULE 2A: Orchestration with Mage
MODULE 2B: Orchestration with Airflow
MODULE 2C: Orchestration with Prefect
MODULE 2D: Orchestration with Kestra
MODULE 3: Data Warehouses with BigQuery
MODULE 4: Analytics Engineering with dbt
MODULE 5: Batch with Spark
MODULE 6: Streaming with Kafka

RESOURCES

GIT
Terraform
Docker
PostgreSQL
  • Postgres PostgreSQL official documentation
PGAdmin
Visualization
Google Cloud Platform
Mage
dbt - data build tool
dlt - data loader tool
DuckDB
Batch Processing - Apache Spark
Streaming - Kafka
Potential Project Data Sources
