This repository contains a set of three projects that showcase modern data engineering, reporting, and governance capabilities using Microsoft Fabric. The work is structured to represent the responsibilities of a Senior / Lead Analytics Engineer within a large energy company operating at scale.

Dalbee/energy-analytics-fabric-bi

District Energy Intelligence Platform: From Silos to Strategic Assets

Role: Lead Data Engineer & Strategic Data Analyst

Tech Stack & Architectural Decision: Microsoft Fabric (OneLake, PySpark, SQL Endpoint) and Power BI (Star Schema; DirectQuery mode for the prototype, Import mode for production).

Architectural Rationale: The solution was initially prototyped in DirectQuery to evaluate the potential for real-time telemetry streaming from Microsoft Fabric. However, for the final production suite, I made the strategic decision to migrate to Import Mode. This transition was executed to leverage the VertiPaq in-memory engine, ensuring sub-second interactivity and high-concurrency performance for complex ESG and financial DAX measures that require maximum computational efficiency.

Core Impact: Unified siloed telemetry to recover €4.9M in avoidable efficiency costs.

🔗 View Live Power BI Dashboard


🎯 The Vision: Bridging the Gap Between Waste & Finance

In the modern energy sector, data is often trapped in operational silos—telemetry, emissions tracking, and grid consumption exist in isolation. This platform was engineered within Microsoft Fabric to unify these streams into a Single Source of Truth, enabling data-driven decisions across three critical pillars:

  • Operational Excellence: Optimizing plant load factors to reclaim 23.41M MWh of unused capacity.
  • ESG Leadership: Real-time tracking of carbon intensity, maintaining a fleet average of 0.05 kg/MWh (Target: 0.10).
  • Grid Resilience: Monitoring heat balance to sustain a 102.59% self-sufficiency rating.
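The three pillar KPIs can be sketched in plain Python as follows. The README does not give the exact formulas used in the platform, so the definitions below are the commonly used ones and are an assumption, as are all function and parameter names.

```python
def load_factor(energy_mwh: float, capacity_mw: float, hours: float) -> float:
    """Share of the theoretical maximum output actually produced (0..1).

    Assumed definition: energy produced / (rated capacity * hours in period).
    """
    if capacity_mw <= 0 or hours <= 0:
        raise ValueError("capacity_mw and hours must be positive")
    return energy_mwh / (capacity_mw * hours)


def carbon_intensity(emissions_kg: float, energy_mwh: float) -> float:
    """Emissions per unit of energy produced, in kg/MWh (assumed definition)."""
    if energy_mwh <= 0:
        raise ValueError("energy_mwh must be positive")
    return emissions_kg / energy_mwh


def self_sufficiency_pct(production_mwh: float, demand_mwh: float) -> float:
    """Production as a percentage of demand; above 100 means a surplus."""
    if demand_mwh <= 0:
        raise ValueError("demand_mwh must be positive")
    return 100.0 * production_mwh / demand_mwh
```

With hypothetical inputs, a 100 MW plant that produces 600 MWh in 24 hours has a load factor of 0.25, and a fleet that produces slightly more heat than it consumes reports a self-sufficiency rating above 100%.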

💰 Key Outcome: Financial & Operational Impact

By unifying disparate datasets, I identified critical operational risks that were previously buried in siloed telemetry:

| Metric | Result | Strategic Insight |
| --- | --- | --- |
| Efficiency Risk | 81,744 MWh | Identified recorded excess waste across the fleet. |
| Financial Impact | €4,904,640 | Translated technical waste into a recovery narrative. |
| Primary Driver | Tampere Plant | Pinpointed Nuclear production as the leading risk contributor. |
| Fleet Utilization | 13.2% | Highlighted massive scaling opportunity without further CAPEX. |
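The link between the Efficiency Risk and Financial Impact figures can be checked with a few lines of arithmetic. The README does not state the unit cost that was applied; dividing the two reported figures gives exactly €60 per MWh, so that rate is inferred here rather than documented. The variable and function names are illustrative.

```python
from decimal import Decimal

# Figures reported in the outcome table.
excess_mwh = Decimal("81744")
financial_impact_eur = Decimal("4904640")

# Unit cost implied by the two figures (inferred, not stated in the README).
implied_cost_eur_per_mwh = financial_impact_eur / excess_mwh


def waste_cost_eur(excess_mwh: Decimal, cost_per_mwh: Decimal) -> Decimal:
    """Translate excess energy (MWh) into a monetary figure (EUR)."""
    return excess_mwh * cost_per_mwh


print(implied_cost_eur_per_mwh)  # EUR per MWh implied by the table
```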

🏗️ Technical Implementation (The "How")

The work is divided into three specialized projects to demonstrate full-stack Fabric proficiency:

  • Design: Implemented a robust Star Schema to handle 100M+ rows of sensor data.

  • Innovation: Utilized Calculation Groups for dynamic "Actual vs. Budget" energy comparisons.

  • Key Visuals: Power BI Dashboard

    Star Schema: DirectQuery Mode - Prototype

    Star Schema: Import Mode - Final Production
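As a rough illustration of the Star Schema idea, the sketch below splits flat telemetry rows into a plant dimension and a fact table joined by a surrogate key. It is plain Python rather than the actual Power BI model, and the column names (`plant`, `fuel_type`, `date`, `energy_mwh`) and sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def build_star(rows: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split flat rows into a plant dimension and a fact table keyed by plant_key."""
    plant_keys: Dict[Tuple[str, str], int] = {}
    dim_plant: List[dict] = []
    fact: List[dict] = []
    for row in rows:
        natural_key = (row["plant"], row["fuel_type"])
        if natural_key not in plant_keys:
            # Assign the next surrogate key the first time a plant is seen.
            plant_keys[natural_key] = len(plant_keys) + 1
            dim_plant.append({
                "plant_key": plant_keys[natural_key],
                "plant": row["plant"],
                "fuel_type": row["fuel_type"],
            })
        # Facts keep only the key and the measures, not the descriptive text.
        fact.append({
            "plant_key": plant_keys[natural_key],
            "date": row["date"],
            "energy_mwh": row["energy_mwh"],
        })
    return dim_plant, fact


# Hypothetical sample data.
flat_rows = [
    {"plant": "Plant A", "fuel_type": "Biomass", "date": "2024-01-01", "energy_mwh": 120.0},
    {"plant": "Plant B", "fuel_type": "Waste heat", "date": "2024-01-01", "energy_mwh": 80.0},
    {"plant": "Plant A", "fuel_type": "Biomass", "date": "2024-01-02", "energy_mwh": 110.0},
]
dim_plant, fact_production = build_star(flat_rows)
```

Keeping descriptive attributes in the dimension and only keys and measures in the fact table is what keeps the fact table narrow when the row count grows.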

Executive Summary: An end-to-end Microsoft Fabric solution that refactors siloed plant operations data into strategic assets, translating technical grid waste into actionable financial insights and ESG leadership metrics.

  • Architecture: Multi-layered Medallion (Bronze/Silver/Gold) architecture in OneLake.
  • Transformation: Used PySpark for complex energy unit conversions and validation logic.
  • Key Visuals: Architecture Diagrams

    Validation Logic: Architecture Diagrams
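The kind of unit conversion and validation logic mentioned for the transformation step might look like the sketch below. It is written in plain Python so it runs without a Spark session; in a PySpark notebook the same rules would typically be expressed as column expressions. The supported units, field names, and validation rules are assumptions, not the repository's actual logic.

```python
from typing import List

# Conversion factors to MWh. 1 MWh = 3.6 GJ, so 1 GJ = 1/3.6 MWh.
TO_MWH = {"MWh": 1.0, "kWh": 0.001, "GWh": 1000.0, "GJ": 1.0 / 3.6}


def to_mwh(value: float, unit: str) -> float:
    """Convert an energy value in a supported unit to MWh."""
    try:
        factor = TO_MWH[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None
    return value * factor


def validate_reading(reading: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the reading passes."""
    errors: List[str] = []
    if reading.get("unit") not in TO_MWH:
        errors.append("unknown unit")
    value = reading.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        errors.append("value is not numeric")
    elif value < 0:
        errors.append("negative energy value")
    return errors
```

Returning a list of violations instead of raising on the first one makes it possible to route failing rows to a quarantine table with a reason attached.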

  • Security: Defined Row-Level Security (RLS) so that plant managers see only their own plant's telemetry.
  • DevOps: Established a CI/CD workflow using Fabric Deployment Pipelines and GitHub integration.
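The effect of the RLS rule can be illustrated with a small filter in Python. In Power BI itself, RLS is defined as DAX filter expressions on roles (commonly using `USERPRINCIPALNAME()`), so this is only a conceptual sketch; the user names, plant names, and the `PLANT_ACCESS` mapping are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical mapping of user principal names to the plants they may see.
PLANT_ACCESS: Dict[str, Set[str]] = {
    "manager.a@example.com": {"Plant A"},
    "manager.b@example.com": {"Plant B"},
}


def visible_rows(user: str, rows: List[dict],
                 access: Dict[str, Set[str]] = PLANT_ACCESS) -> List[dict]:
    """Return only the rows whose plant the user is allowed to see."""
    # Users missing from the mapping get an empty set, so they see nothing.
    allowed = access.get(user, set())
    return [row for row in rows if row["plant"] in allowed]


telemetry = [
    {"plant": "Plant A", "energy_mwh": 120.0},
    {"plant": "Plant B", "energy_mwh": 80.0},
]
```

Defaulting unknown users to an empty set mirrors the "Least Privilege" goal: access has to be granted explicitly.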

🛠️ Technology Stack

  • Storage: Microsoft Fabric OneLake & Lakehouse (Delta Lake format).

  • Compute: Spark (PySpark Notebooks) & SQL Analytics Endpoints.

  • Orchestration: Fabric Data Pipelines.

  • Reporting: Power BI (Import Mode for performance, Star Schema architecture).

  • DevOps: GitHub Repository Integration & Fabric Deployment Pipelines.


🎯 Key Objectives

  • Scalability: Design a platform capable of handling millions of rows of operational energy data.

  • Single Source of Truth: Compute complex KPIs upstream in Spark to ensure consistency across all downstream tools.

  • Governance: Implement a "Least Privilege" security model and a structured release process.

  • Performance: Balance the heavy lifting of Spark with the lightning-fast interactivity of Power BI.


📁 Repository Structure

energy-analytics-fabric-bi/
├── docs/                      # Global Architecture & Governance Blueprints
├── projects/
│   ├── project1-energy-bi/    # PBIX metadata, DAX measures, Theme files
│   ├── project2-engineering/  # PySpark Notebooks, Pipeline JSON definitions
│   └── project3-governance/   # CI/CD configs & RLS security documentation
└── README.md                  # Portfolio Home Page

Together, these projects demonstrate a complete, production-aligned Fabric ecosystem including ingestion, transformation, modelling, reporting, governance, operations, and deployment automation.


🚀 Project Summaries

Project 1 (projects/project1-energy-bi): A full business intelligence solution built using Power BI and Fabric semantic models. Includes data modelling using a Star Schema, calculation groups, and energy-sector KPIs (production, demand, emissions).


Project 2 (projects/project2-engineering): An end-to-end data engineering workflow using PySpark and Medallion Architecture. It covers Lakehouse-based ingestion, dimensional modeling, and pipeline orchestration with automated quality gates.
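An automated quality gate of this kind could be as simple as the sketch below: a check that stops the run when too many rows were rejected during a load. The threshold, the exception name, and the function signature are assumptions for illustration and are not taken from the repository's pipeline definitions.

```python
class QualityGateError(Exception):
    """Raised when a load does not meet the agreed quality threshold."""


def quality_gate(total_rows: int, rejected_rows: int,
                 max_reject_ratio: float = 0.01) -> float:
    """Return the reject ratio, or raise if it exceeds the threshold."""
    if total_rows <= 0:
        raise QualityGateError("no rows ingested")
    ratio = rejected_rows / total_rows
    if ratio > max_reject_ratio:
        raise QualityGateError(
            f"reject ratio {ratio:.2%} exceeds limit {max_reject_ratio:.2%}"
        )
    return ratio
```

Raising an exception lets the calling notebook or pipeline activity fail, which prevents downstream layers from being refreshed with low-quality data.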


Project 3 (projects/project3-governance): Illustrates enterprise management of Fabric: Workspace strategy (Dev/Test/Prod), RLS/OLS security models, and CI/CD using deployment pipelines and Git integration.


Documentation

All architecture, governance, and design materials are located in /docs. These documents are written to reflect enterprise standards.


🧭 How to Navigate This Repository

  1. Start with the /docs folder to understand the architecture and governance design.

  2. Explore Engineering: See Project 2 for the Spark ingestion and transformation pipelines.

  3. Explore Analytics: See Project 1 for the BI solution and Star Schema.

  4. Explore Operations: See Project 3 for governance and deployment artefacts.


📧 Contact & Professional Context

This repository is a demonstration of architectural capability in the Microsoft Fabric ecosystem. It reflects a commitment to code-first engineering, rigorous governance, and high-performance data delivery.
