Additional stream processing examples from previous course years. These are not part of the main workshop but may be useful as reference material.
Python Kafka examples by Irem Erturk, using various libraries.
- json_example/ - producer and consumer using kafka-python with JSON serialization
- avro_example/ - producer and consumer using confluent-kafka with Avro serialization and Schema Registry
- redpanda_example/ - same as the JSON example but running against Redpanda instead of Kafka, with a local docker-compose setup
- streams-example/faust/ - stream processing with Faust, a Python stream processing library modeled on Kafka Streams. Includes windowing, branching, and counting examples.
- streams-example/pyspark/ - Spark Structured Streaming consuming from Kafka, with a Jupyter notebook
- streams-example/redpanda/ - same as the PySpark example but using Redpanda as the broker
- docker/ - Docker Compose files for running Kafka and Spark clusters locally
- resources/ - sample data (rides.csv) and Avro schemas
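As a rough illustration of the JSON serialization used by the producer and consumer examples above, here is a minimal sketch. It is not taken from json_example/. The function names and the ride fields are hypothetical, and the broker address `localhost:9092` is an assumption. The only kafka-python call used is `KafkaProducer` with its `bootstrap_servers` and `value_serializer` arguments.

```python
import json


def serialize_ride(ride: dict) -> bytes:
    """Encode a record as UTF-8 JSON bytes, the form a Kafka producer sends."""
    return json.dumps(ride).encode("utf-8")


def deserialize_ride(payload: bytes) -> dict:
    """Decode UTF-8 JSON bytes read by a consumer back into a dict."""
    return json.loads(payload.decode("utf-8"))


def make_producer(bootstrap_servers: str = "localhost:9092"):
    """Build a producer that applies serialize_ride to every message value.

    The broker address is an assumption. kafka-python is imported lazily so
    the two helpers above can be used without the library installed.
    """
    from kafka import KafkaProducer

    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=serialize_ride,
    )


if __name__ == "__main__":
    # Hypothetical record; the real columns come from rides.csv.
    ride = {"vendor_id": 1, "trip_distance": 2.5}
    payload = serialize_ride(ride)
    print(deserialize_ride(payload) == ride)
```

On the consumer side, the matching decoder would typically be passed as `value_deserializer` so that each message arrives as a dict.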
PyFlink workshop by Irem Erturk. Uses Apache Flink 1.x with a Makefile-based workflow, PostgreSQL sink, and Docker Compose setup. The 2025 stream with Zach Wilson was rewritten into the current 2026 workshop by Alexey, using Flink 2.2, uv, and a step-by-step README.
commands.md - example ksqlDB queries for creating streams, filtering, grouping, and windowed aggregations over Kafka topics. Companion to the ksqlDB and Connect video in the theory section.
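To make the idea of a windowed aggregation concrete, the sketch below counts events per key in fixed-size (tumbling) windows. It is plain Python, not ksqlDB or Faust syntax, and the function name and event layout are assumptions chosen for illustration.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key).

    events: iterable of (timestamp_seconds, key) pairs.
    Each event falls into exactly one window; the window start is the
    timestamp rounded down to a multiple of window_seconds.
    """
    counts = Counter()
    for timestamp, key in events:
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical events: (seconds since start, key).
    events = [(0, "a"), (30, "a"), (61, "a"), (65, "b")]
    print(tumbling_window_counts(events, 60))
    # {(0, 'a'): 2, (60, 'a'): 1, (60, 'b'): 1}
```

A streaming engine computes the same result incrementally as events arrive, and also has to handle late or out-of-order events, which this sketch ignores.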