Role: Lead Data Engineer & Strategic Data Analyst
Tech Stack & Architectural Decisions: Microsoft Fabric (OneLake, PySpark, SQL Endpoint) and Power BI (DirectQuery prototype; Import Mode production with a Star Schema).
Architectural Rationale: The solution was initially prototyped in DirectQuery to evaluate the potential for real-time telemetry streaming from Microsoft Fabric. However, for the final production suite, I made the strategic decision to migrate to Import Mode. This transition was executed to leverage the VertiPaq in-memory engine, ensuring sub-second interactivity and high-concurrency performance for complex ESG and financial DAX measures that require maximum computational efficiency.
Core Impact: Unified siloed telemetry to recover €4.9M in avoidable efficiency costs.
🔗 View Live Power BI Dashboard
In the modern energy sector, data is often trapped in operational silos—telemetry, emissions tracking, and grid consumption exist in isolation. This platform was engineered within Microsoft Fabric to unify these streams into a Single Source of Truth, enabling data-driven decisions across three critical pillars:
- Operational Excellence: Optimizing plant load factors to reclaim 23.41M MWh of unused capacity.
- ESG Leadership: Real-time tracking of carbon intensity, maintaining a fleet average of 0.05 kg/MWh (Target: 0.10).
- Grid Resilience: Monitoring heat balance to sustain a 102.59% self-sufficiency rating.
By unifying disparate datasets, I identified critical operational risks that were previously buried in siloed telemetry:
| Metric | Result | Strategic Insight |
|---|---|---|
| Efficiency Risk | 81,744 MWh | Quantified excess waste recorded across the fleet. |
| Financial Impact | €4,904,640 | Translated technical waste into a recovery narrative. |
| Primary Driver | Tampere Plant | Pinpointed Nuclear production as the leading risk contributor. |
| Fleet Utilization | 13.2% | Highlighted massive scaling opportunity without further CAPEX. |
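The financial figure in the table follows directly from the recorded waste. A minimal sketch, assuming an average recovery value of €60/MWh (the rate implied by the two reported numbers, not one stated elsewhere in this repository):

```python
# Translate technical waste (MWh) into the financial recovery figure.
# The EUR 60/MWh rate is an assumption implied by the reported numbers.
WASTE_MWH = 81_744
RATE_EUR_PER_MWH = 60.0

impact = WASTE_MWH * RATE_EUR_PER_MWH
print(f"€{impact:,.0f}")  # €4,904,640
```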
The project is divided into three specialized workstreams to demonstrate full-stack Fabric proficiency:
- Design: Implemented a robust Star Schema to handle 100M+ rows of sensor data.
- Innovation: Utilized Calculation Groups for dynamic "Actual vs. Budget" energy comparisons.
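Conceptually, a Calculation Group applies one set of calculation items (Actual, Budget, Variance, …) to any base measure. A non-DAX sketch of that idea in plain Python, with illustrative item names:

```python
# Illustrative (non-DAX) sketch of the Calculation Group pattern:
# one reusable set of calculation items applied to any base measure,
# instead of hand-writing Actual/Budget/Variance per measure.
def apply_calc_item(item: str, actual: float, budget: float) -> float:
    items = {
        "Actual": lambda: actual,
        "Budget": lambda: budget,
        "Variance": lambda: actual - budget,
        "Variance %": lambda: (actual - budget) / budget,
    }
    return items[item]()

print(apply_calc_item("Variance", actual=120.0, budget=100.0))  # 20.0
```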
Executive Summary: An end-to-end Microsoft Fabric solution that refactors siloed plant operations data into strategic assets, translating technical grid waste into actionable financial insights and ESG leadership metrics.
- Architecture: Multi-layered Medallion (Bronze/Silver/Gold) architecture in OneLake.
- Transformation: Used PySpark for complex energy unit conversions and validation logic.
- Key Visuals: see the live Power BI dashboard linked above.
- Security: Defined Row-Level Security (RLS) to ensure plant managers only see their specific telemetry.
- DevOps: Established a CI/CD workflow using Fabric Deployment Pipelines and GitHub integration.
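The energy unit conversions and validation logic mentioned above can be sketched as follows. Shown as plain Python for portability; in the Fabric notebooks the same logic would run as a PySpark column expression. The function and the rejection rule are illustrative, but the conversion factor is the standard 1 MWh = 3.6 GJ:

```python
# Energy unit conversion with a validation gate (Silver-layer style).
# Illustrative sketch: in a Fabric notebook this would be a PySpark
# column expression rather than a scalar function.
GJ_PER_MWH = 3.6  # exact: 1 MWh = 3.6 GJ

def gj_to_mwh(value_gj: float) -> float:
    """Convert gigajoules to megawatt-hours, rejecting invalid readings."""
    if value_gj is None or value_gj < 0:
        raise ValueError(f"invalid telemetry reading: {value_gj!r}")
    return value_gj / GJ_PER_MWH

print(gj_to_mwh(7.2))  # 2.0
```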
- Storage: Microsoft Fabric OneLake & Lakehouse (Delta Lake format).
- Compute: Spark (PySpark Notebooks) & SQL Analytics Endpoints.
- Orchestration: Fabric Data Pipelines.
- Reporting: Power BI (Import Mode for performance, Star Schema architecture).
- DevOps: GitHub Repository Integration & Fabric Deployment Pipelines.
- Scalability: Design a platform capable of handling millions of rows of operational energy data.
- Single Source of Truth: Compute complex KPIs upstream in Spark to ensure consistency across all downstream tools.
- Governance: Implement a "Least Privilege" security model and a structured release process.
- Performance: Balance the heavy lifting of Spark with the lightning-fast interactivity of Power BI.
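The "Single Source of Truth" goal means KPIs such as carbon intensity are computed once in the engineering layer, not re-derived per report. A minimal sketch with illustrative plant records (the 0.05 kg/MWh figure here is constructed for the example):

```python
# Upstream KPI sketch: fleet carbon intensity (kg CO2 per MWh) is
# aggregated once in Spark so every downstream tool reports the same
# number. Plant records below are illustrative, not real telemetry.
plants = [
    {"plant": "A", "emissions_kg": 500.0, "production_mwh": 20_000.0},
    {"plant": "B", "emissions_kg": 700.0, "production_mwh": 4_000.0},
]

total_kg = sum(p["emissions_kg"] for p in plants)
total_mwh = sum(p["production_mwh"] for p in plants)
fleet_intensity = total_kg / total_mwh  # weighted, not an average of ratios

print(round(fleet_intensity, 2))  # 0.05
```

Note the weighting: summing emissions and production before dividing avoids the bias of averaging per-plant ratios.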
```
energy-analytics-fabric-bi/
├── docs/                      # Global Architecture & Governance Blueprints
├── projects/
│   ├── project1-energy-bi/    # PBIX metadata, DAX measures, Theme files
│   ├── project2-engineering/  # PySpark Notebooks, Pipeline JSON definitions
│   └── project3-governance/   # CI/CD configs & RLS security documentation
└── README.md                  # Portfolio Home Page
```
Together, these projects demonstrate a complete, production-aligned Fabric ecosystem including ingestion, transformation, modelling, reporting, governance, operations, and deployment automation.
Project 1 (Energy BI): A full business intelligence solution built using Power BI and Fabric semantic models. Includes data modelling using a Star Schema, calculation groups, and energy-sector KPIs (production, demand, emissions).
Project 2 (Engineering): An end-to-end data engineering workflow using PySpark and Medallion Architecture. It covers Lakehouse-based ingestion, dimensional modelling, and pipeline orchestration with automated quality gates.
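The "automated quality gates" in that workflow can be sketched as a Bronze-to-Silver promotion that quarantines failing rows instead of silently dropping them. Field names and the validation rule are assumptions for illustration:

```python
# Minimal quality-gate sketch for Bronze -> Silver promotion:
# rows failing validation are quarantined, never silently dropped,
# so data-quality issues remain auditable. Field names are illustrative.
def promote_to_silver(rows):
    silver, quarantine = [], []
    for row in rows:
        ok = row.get("mwh") is not None and row["mwh"] >= 0
        (silver if ok else quarantine).append(row)
    return silver, quarantine

bronze = [{"mwh": 10.0}, {"mwh": -3.0}, {"mwh": None}]
silver, rejected = promote_to_silver(bronze)
print(len(silver), len(rejected))  # 1 2
```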
Project 3 (Governance): Illustrates enterprise management of Fabric: workspace strategy (Dev/Test/Prod), RLS/OLS security models, and CI/CD using deployment pipelines and Git integration.
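The RLS model described above restricts each viewer to the plants they manage. In Power BI this is a DAX table filter on the user principal; the plain-Python mapping below is only an illustration of the rule, with hypothetical user names:

```python
# Illustrative sketch of the Row-Level Security rule: a viewer sees
# only telemetry for plants assigned to them. In Power BI this is a
# DAX filter on USERPRINCIPALNAME(); assignments here are hypothetical.
ASSIGNMENTS = {
    "manager_a@corp.example": {"Tampere"},
    "manager_b@corp.example": {"Helsinki"},
}

def visible_rows(user: str, rows: list[dict]) -> list[dict]:
    allowed = ASSIGNMENTS.get(user, set())  # unknown users see nothing
    return [r for r in rows if r["plant"] in allowed]

rows = [{"plant": "Tampere", "mwh": 5.0}, {"plant": "Helsinki", "mwh": 7.0}]
print([r["plant"] for r in visible_rows("manager_a@corp.example", rows)])  # ['Tampere']
```

Defaulting unknown users to an empty set implements the "Least Privilege" goal stated earlier.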
All architecture, governance, and design materials are located in /docs.
Key documents are written to reflect enterprise standards.
- Start with the /docs folder to understand the architecture and governance design.
- Explore Engineering: See Project 2 for the Spark ingestion and transformation pipelines.
- Explore Analytics: See Project 1 for the BI solution and Star Schema.
- Explore Operations: See Project 3 for governance and deployment artefacts.
This repository is a demonstration of architectural capability in the Microsoft Fabric ecosystem. It reflects a commitment to code-first engineering, rigorous governance, and high-performance data delivery.