This project demonstrates real-time streaming of energy consumption data with Apache Kafka. It simulates the generation of consumption data, streams it through a Kafka topic, and processes it in real time with a Kafka consumer.
- Python
- Apache Kafka
- Kafka Python library
- Pandas
- JSON
├── producer.py # Sends data to Kafka topic
├── consumer.py # Reads and processes data from Kafka
├── requirements.txt # Required Python packages
└── index.html # Shows histogram of sales and prices
Ensure the following are installed on your system:
- Python 3.7+
- Apache Kafka and Zookeeper
- kafka-python:
  pip install kafka-python
- Other dependencies:
  pip install -r requirements.txt
Start Zookeeper:
zookeeper-server-start.sh config/zookeeper.properties
In a new terminal, start Kafka:
kafka-server-start.sh config/server.properties
Make sure both Zookeeper and Kafka servers are running before proceeding.
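One quick way to confirm that both servers are up is to probe their TCP ports. The sketch below uses only the Python standard library and assumes the default ports from the stock config files (2181 for Zookeeper, 9092 for Kafka); the helper names `is_port_open` and `check_services` are illustrative and not part of this project. A successful connection only shows that something is listening on the port, not that it is a healthy broker.

```python
import socket

# Default ports from the stock config files; adjust if yours differ.
DEFAULT_SERVICES = {"zookeeper": 2181, "kafka": 9092}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_services(services, host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in check_services(DEFAULT_SERVICES).items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

If either service reports as not reachable, check the terminal where it was started for errors before creating the topic.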
Create the energy-data topic:
kafka-topics.sh --create --topic energy-data --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
Send the generated data to the Kafka topic:
python producer.py
This script reads data from the file and pushes it to the energy-data topic.
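As a rough illustration of what such a producer might look like, here is a minimal sketch using kafka-python. It is not the project's actual producer.py: the file name `energy_data.csv`, the CSV format, and the helper names `encode_record` and `stream_rows` are assumptions made for the example. It uses the standard `csv` module rather than Pandas to stay self-contained.

```python
import csv
import json
import time


def encode_record(record):
    """Serialize one row (a dict) to UTF-8 JSON bytes for the Kafka payload."""
    return json.dumps(record).encode("utf-8")


def stream_rows(rows, producer, topic="energy-data", delay_s=0.0):
    """Send each row to `topic` and return the number of rows sent.

    `producer` is any object exposing send(topic, value=...) and flush(),
    such as kafka.KafkaProducer.
    """
    sent = 0
    for row in rows:
        producer.send(topic, value=encode_record(row))
        sent += 1
        if delay_s:
            time.sleep(delay_s)  # pace the stream to mimic live readings
    producer.flush()  # block until buffered messages are delivered
    return sent


def main(path="energy_data.csv"):  # hypothetical file name
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    with open(path, newline="") as f:
        count = stream_rows(csv.DictReader(f), producer)
    print(f"sent {count} records to energy-data")
```

With the broker running, calling `main()` streams every row of the file to the topic. Keeping `stream_rows` independent of the Kafka client makes it easy to exercise with a stand-in producer.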
Consume the streamed data:
python consumer.py
This script reads messages from the Kafka topic and processes or prints them to the console.
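The consumer side could be sketched along these lines with kafka-python. This is not the project's actual consumer.py: the message fields `meter_id` and `kwh`, the per-meter total used as the "processing" step, and the helper names are all assumptions chosen for illustration.

```python
import json
from collections import defaultdict


def decode_message(raw):
    """Turn the UTF-8 JSON bytes of a Kafka message back into a dict."""
    return json.loads(raw.decode("utf-8"))


def total_by_meter(readings):
    """Sum consumption per meter (`meter_id` and `kwh` are hypothetical fields)."""
    totals = defaultdict(float)
    for reading in readings:
        totals[reading["meter_id"]] += float(reading["kwh"])
    return dict(totals)


def main():
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "energy-data",
        bootstrap_servers="localhost:9092",
        auto_offset_reset="earliest",  # start from the oldest message if no offset is stored
        consumer_timeout_ms=5000,      # stop iterating after 5 s without new messages
        value_deserializer=decode_message,
    )
    readings = []
    for message in consumer:
        print(message.value)
        readings.append(message.value)
    print(total_by_meter(readings))
```

Calling `main()` while the producer is running prints each message as it arrives, then prints the per-meter totals once the topic has been idle for five seconds.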
The HTML page listed in the project structure displays a histogram of sales and prices.