Smart_City_Microsoft_Fabric_Data_Engineering

🏙️ Smart City Energy Monitoring System

End-to-End Microsoft Fabric Data Engineering Project

📌 Project Overview

This project is an end-to-end data engineering pipeline built on Microsoft Fabric to collect, process, analyze, and visualize Smart City energy data.

The system ingests data from multiple external APIs, processes it through Bronze → Silver → Gold layers using PySpark, and delivers insights through a Power BI Semantic Model and Dashboard.

All steps are fully automated using Fabric Pipelines with scheduled execution.

🏗️ Architecture Overview

Technologies Used

  • Microsoft Fabric (Lakehouse, Notebooks, Pipelines)

  • PySpark

  • Power BI

Pipeline Layers: [Figure: architecture diagram of the pipeline layers]

📂 Notebook Structure

This project is organized into 5 main notebooks:

1️⃣ nb_config

  • Purpose: Centralized configuration

  • Lakehouse paths (Bronze / Silver / Gold)

  • Dataset configurations

  • Environment-independent setup
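A minimal sketch of what such a centralized configuration could look like in plain Python. The path layout, dataset keys, and field names (`LAYER_PATHS`, `DATASETS`, `explode_column`, `key_columns`, `dataset_path`) are assumptions for illustration, not the actual contents of nb_config.

```python
# Hypothetical configuration: every path and key below is an illustrative assumption.
LAYER_PATHS = {
    "bronze": "Files/bronze",
    "silver": "Tables/silver",
    "gold": "Tables/gold",
}

DATASETS = {
    "weather": {"explode_column": "hourly", "key_columns": ["city", "timestamp"]},
    "air_quality": {"explode_column": "measurements", "key_columns": ["city", "timestamp"]},
    "energy_prices": {"explode_column": "prices", "key_columns": ["region", "timestamp"]},
}


def dataset_path(layer: str, dataset: str) -> str:
    """Build the storage path for a dataset in a given layer."""
    if layer not in LAYER_PATHS:
        raise KeyError(f"Unknown layer: {layer}")
    if dataset not in DATASETS:
        raise KeyError(f"Unknown dataset: {dataset}")
    return f"{LAYER_PATHS[layer]}/{dataset}"
```

Keeping paths and per-dataset settings in one place means the processing notebooks only need a layer name and a dataset name, for example `dataset_path("bronze", "weather")`.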

2️⃣ nb_logging

  • Purpose: Operational logging

  • Creates a Delta-based logging table

  • Logs each pipeline step with:

      • Status (STARTED / SUCCESS / FAILED)

      • Notebook name

      • Layer

      • Step name

      • Timestamp

This ensures observability and traceability for all data operations.
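The log fields listed above could be assembled as follows. This is a sketch in plain Python: the function name `build_log_record` and the field names are assumptions, and the step that appends the record to the Delta logging table is left out.

```python
from datetime import datetime, timezone

# The three statuses named in this README.
VALID_STATUSES = {"STARTED", "SUCCESS", "FAILED"}


def build_log_record(status, notebook_name, layer, step_name, now=None):
    """Return one log row as a dict; `now` can be injected for testing."""
    if status not in VALID_STATUSES:
        raise ValueError(f"Invalid status: {status}")
    ts = now or datetime.now(timezone.utc)
    return {
        "status": status,
        "notebook_name": notebook_name,
        "layer": layer,
        "step_name": step_name,
        "timestamp": ts.isoformat(),
    }
```

Rejecting unknown statuses up front keeps the logging table limited to the three documented values, which makes filtering for failed steps straightforward.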

3️⃣ nb_functions

  • Purpose: Reusable transformation toolbox

Contains core generic functions such as:

  • clean_column_names

  • explode_and_flatten

  • generate_surrogate_key

  • select_columns_safe

  • save_to_lakehouse

These functions enable config-driven, reusable, and scalable transformations.
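To make the idea concrete, here is a rough sketch of three of these helpers in plain Python. It is not the code from nb_functions: the signatures, the cleaning rules, and the choice of SHA-256 for the surrogate key are all assumptions, and the helpers operate on column names and values rather than on Spark DataFrames so that they run anywhere.

```python
import hashlib
import re


def clean_column_name(name: str) -> str:
    """Lowercase a name and replace runs of non-alphanumerics with '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def select_columns_safe(available, wanted):
    """Keep only the wanted columns that actually exist, in the wanted order."""
    present = set(available)
    return [c for c in wanted if c in present]


def generate_surrogate_key(*values) -> str:
    """Hash the key values into a deterministic 64-character hex string."""
    joined = "||".join("" if v is None else str(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

For example, `clean_column_name(" Wind Speed (m/s) ")` yields `wind_speed_m_s`, and `select_columns_safe` silently skips columns that a given API response does not contain instead of failing.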

4️⃣ nb_generic_bronze_to_silver

Purpose: Bronze → Silver processing

Key features:

  • Reads raw JSON data from Bronze layer

  • Cleans column names

  • Explodes & flattens nested JSON structures

  • Generates surrogate keys

  • Writes structured Delta tables to Silver layer

  • Logs every step using nb_logging

Handled datasets:

  • Weather

  • Air Quality

  • Energy Prices
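The explode-and-flatten step can be illustrated on a single JSON record. The sketch below uses plain Python dicts instead of PySpark, and the record shape (`city`, `coord`, `hourly`) is a made-up example, not the schema of the actual APIs.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


def explode_and_flatten(record, array_key):
    """Emit one flat row per element of record[array_key]."""
    parent = {k: v for k, v in record.items() if k != array_key}
    rows = []
    for item in record.get(array_key) or []:
        row = flatten(parent)
        if isinstance(item, dict):
            # Prefix child fields with the array name to avoid collisions.
            row.update(flatten(item, array_key))
        else:
            row[array_key] = item
        rows.append(row)
    return rows


# Hypothetical raw record with one nested struct and one nested array.
raw = {
    "city": "Berlin",
    "coord": {"lat": 52.5, "lon": 13.4},
    "hourly": [
        {"time": "00:00", "wind": {"speed": 4.2}},
        {"time": "01:00", "wind": {"speed": 6.1}},
    ],
}
rows = explode_and_flatten(raw, "hourly")
```

Each element of the `hourly` array becomes its own row, and the parent fields (`city`, `coord_lat`, `coord_lon`) are repeated on every row, which is the tabular shape a Silver table needs.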

5️⃣ nb_silver_to_gold

Purpose: Silver → Gold analytical processing

Key logic:

  • Uses Energy data as the main fact table

  • Left joins Weather and Air Quality data

  • Applies window functions:

      • Rolling 3-hour average energy price

Business logic:

  • If the current price is below the average of the last 3 hours and wind speed is high, the row is labeled “OPPORTUNITY”

Saves the final analytical dataset to the Gold layer as a Delta table
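The labeling rule can be sketched in plain Python as shown below. Several details are assumptions, since the README does not specify them: rows are hourly and already sorted by time, the average covers the previous three rows and excludes the current one, "high" wind means at least 8.0 (the `wind_threshold` default), and non-matching rows get the label `NORMAL`. In PySpark, a window of this shape is typically written with `Window.orderBy(...).rowsBetween(-3, -1)`.

```python
def label_opportunities(rows, window=3, wind_threshold=8.0):
    """Add `avg_price_prev` and `label` to each row.

    rows: list of dicts with `price` and `wind_speed`, sorted by time,
    one row per hour (assumption).
    """
    labeled = []
    for i, row in enumerate(rows):
        # Prices of up to `window` preceding rows; fewer at the start.
        previous = [r["price"] for r in rows[max(0, i - window):i]]
        avg_prev = sum(previous) / len(previous) if previous else None
        # wind_speed may be missing because Weather is left-joined.
        wind = row.get("wind_speed")
        is_opportunity = (
            avg_prev is not None
            and wind is not None
            and row["price"] < avg_prev
            and wind >= wind_threshold
        )
        labeled.append({
            **row,
            "avg_price_prev": avg_prev,
            "label": "OPPORTUNITY" if is_opportunity else "NORMAL",
        })
    return labeled
```

Note that the first rows are averaged over fewer than three predecessors, and a row without weather data is never labeled as an opportunity. Both choices are part of this sketch, not documented behavior of nb_silver_to_gold.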

📊 Semantic Model & Power BI

Gold layer tables are used to create a Semantic Model

A Power BI dashboard was built on top of this model

Enables time-based energy price analysis and opportunity detection

⚙️ Automation & Scheduling

All notebooks are orchestrated using Microsoft Fabric Pipelines

Pipeline flow: [Figure: pipeline flow screenshot]

Pipeline is scheduled for automatic execution

Fully hands-free operation after deployment

🎯 Key Data Engineering Concepts Demonstrated

✔ Medallion Architecture (Bronze / Silver / Gold)

✔ Delta Lake & Schema Evolution

✔ Config-driven transformations

✔ Window functions for time-series analysis

✔ Logging & monitoring

✔ Pipeline orchestration

✔ Semantic modeling & BI integration

🚀 Outcome

This project demonstrates a production-style data engineering workflow using modern cloud-native tools and best practices.

About

"Transforming raw urban datasets into actionable insights through Microsoft Fabric. This project showcases cloud-native data orchestration and modern storage strategies within the OneLake environment."
