Learn how to build a complete real-time analytics solution using Cosmos DB in Microsoft Fabric. This hands-on lab demonstrates how to create an operational data store, implement streaming data pipelines, build cross-database analytics, and deploy personalized recommendations using Reverse ETL patterns.
- Provision and configure Cosmos DB in Microsoft Fabric as an operational data store (see the sketch after this list)
- Implement real-time streaming using Eventstreams and KQL for POS transaction data
- Build cross-database analytics leveraging Cosmos DB's automatic mirroring to OneLake
- Create data warehouses and perform ETL operations from streaming to structured data
- Implement Reverse ETL patterns to update operational systems with analytical insights
- Deploy personalized recommendation models using machine learning and customer behavior data
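To give a concrete feel for the operational data store you will provision in Exercise 1, here is a minimal, hedged sketch of writing a customer profile document to a Cosmos DB NoSQL container with the Python SDK. The endpoint, key, database/container names, and document fields are placeholders; use the connection details and schema from your own Cosmos DB in Fabric artifact and the sample data in this repo.

```python
# Minimal sketch: upsert a customer profile into a Cosmos DB NoSQL container.
# Endpoint, key, database, container, and field names are placeholders.
from azure.cosmos import CosmosClient

client = CosmosClient(url="https://<your-account>.documents.azure.com:443/", credential="<your-key>")
database = client.get_database_client("ecommerce")       # hypothetical database name
container = database.get_container_client("customers")   # hypothetical container name

customer = {
    "id": "cust-1001",
    "customerId": "cust-1001",                            # assumed partition key value
    "name": "Alex Doe",
    "segment": "loyal",
    "recentPurchases": ["sku-204", "sku-877"],
}

container.upsert_item(customer)                           # insert or replace by id + partition key
```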
| Resources | Links | Description |
|---|---|---|
| Lab Instructions | Lab Exercises | Step-by-step hands-on lab exercises |
| Sample Data | Data Files | NoSQL, relational, and streaming sample datasets |
| Notebooks | PySpark Notebooks | ML models for personalization and reverse ETL |
| Source Code | Code Samples | C# streaming applications and data loaders |
| Learn more about Cosmos DB in Fabric | https://learn.microsoft.com/fabric/database/cosmos-db/ | Cosmos DB integration with Microsoft Fabric |
| Cosmos DB in Microsoft Fabric Shorts | https://learn.microsoft.com/fabric/database/cosmos-db/ | Cosmos DB integration with Microsoft Fabric |
This lab implements a modern real-time analytics architecture using Microsoft Fabric:
```mermaid
graph TB
    A[POS Systems] --> B[Eventstream]
    B --> C[Eventhouse/KQL]
    B --> D[Data Warehouse]
    E[Customer Data] --> F[Cosmos DB]
    F --> G[OneLake Mirror]
    G --> H[Cross-DB Analytics]
    D --> I[Reverse ETL]
    I --> F
    F --> J[ML Notebooks]
    J --> K[Personalization Model]
    K --> F
    C --> L[Real-time Dashboard]
    D --> M[BI Reports]
```
- Operational Layer: Cosmos DB stores customer profiles and transaction data
- Streaming Layer: Eventstreams capture real-time POS transactions
- Analytics Layer: Data Warehouse provides structured analytics storage
- Intelligence Layer: ML notebooks generate personalized recommendations
- Reverse ETL: Analytics insights flow back to operational systems (see the sketch below)
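The Reverse ETL flow can be pictured as a short Fabric PySpark notebook step: read per-customer recommendation scores from an analytics table and push them back into the operational Cosmos DB container. This is only a sketch; the table name, column names, and connection details are assumptions, and it presumes the recommendations have already landed in a table visible to the notebook.

```python
# Hedged Reverse ETL sketch for a Fabric PySpark notebook.
# `spark` is the session provided by the notebook; table and column names are illustrative.
from azure.cosmos import CosmosClient

recs = spark.sql("""
    SELECT customerId, recommendedProducts
    FROM gold_customer_recommendations   -- hypothetical analytics table
""").collect()

cosmos = CosmosClient(url="https://<your-account>.documents.azure.com:443/", credential="<your-key>")
container = cosmos.get_database_client("ecommerce").get_container_client("customers")

for row in recs:
    # Read the operational profile, attach the latest recommendations, write it back.
    profile = container.read_item(item=row["customerId"], partition_key=row["customerId"])
    profile["recommendations"] = list(row["recommendedProducts"])
    container.upsert_item(profile)
```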
This hands-on lab consists of an environment setup step followed by 5 progressive exercises:
| Exercise | Focus Area | Duration | Key Technologies |
|---|---|---|---|
| Setup | Fabric Environment Setup | 10 min | Terminal, Microsoft Fabric |
| Exercise 1 | Provisioning Cosmos DB in Fabric | 15 min | Cosmos DB, NoSQL containers |
| Exercise 2 | Cross-Database Analytics | 20 min | SQL Endpoint, OneLake mirroring |
| Exercise 3 | Real-Time Streaming | 25 min | Eventstreams, KQL, Eventhouse |
| Exercise 4 | Reverse ETL & ML | 30 min | Data Warehouse, PySpark, Fabric Notebooks |
| Exercise 5 | Serve Personalized Recommendations from Cosmos DB | 20 min | C#, Cosmos DB in Fabric |
Total Lab Time: ~2 hours
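For Exercise 3, the lab's own transaction generators live under the data/streaming and src folders; as a rough illustration of what they do, the sketch below sends simulated POS events to an Eventstream custom-endpoint source, which exposes an Event Hubs-compatible connection string. The connection string, entity name, and event fields are placeholders, not values from this repo.

```python
# Hedged sketch of a POS transaction generator targeting an Eventstream custom endpoint.
import json
import random
import time
import uuid

from azure.eventhub import EventData, EventHubProducerClient

CONNECTION_STR = "<eventstream-custom-endpoint-connection-string>"  # placeholder
EVENT_HUB_NAME = "<eventstream-entity-name>"                        # placeholder

producer = EventHubProducerClient.from_connection_string(CONNECTION_STR, eventhub_name=EVENT_HUB_NAME)

with producer:
    for _ in range(10):
        transaction = {
            "transactionId": str(uuid.uuid4()),
            "customerId": f"cust-{random.randint(1000, 1999)}",
            "sku": f"sku-{random.randint(100, 999)}",
            "amount": round(random.uniform(5, 250), 2),
            "timestamp": time.time(),
        }
        batch = producer.create_batch()
        batch.add(EventData(json.dumps(transaction)))
        producer.send_batch(batch)   # one small batch per simulated transaction
        time.sleep(1)
```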
- Access to Microsoft Fabric workspace with appropriate permissions
- Basic familiarity with SQL, NoSQL databases, and data analytics concepts
- Understanding of JSON data structures and REST APIs
- Clone this repository to your local development environment
- Access Microsoft Fabric and create a new workspace for this lab
- Start with Exercise 1 by following the Lab Instructions
- Use the provided sample data from the data folder
- Deploy notebooks and code from the src folder as needed
```
├── 📁 data/                  # Sample datasets
│   ├── 📁 nosql/             # Customer data for Cosmos DB
│   ├── 📁 relational/        # Dimensional data for warehouse
│   └── 📁 streaming/         # Real-time transaction generators
├── 📁 lab/                   # Lab exercise instructions
│   └── 📁 instructions/      # Step-by-step exercise guides
├── 📁 src/                   # Source code and notebooks
│   ├── 📁 notebooks/         # PySpark ML notebooks
│   └── 📁 warehouse_setup/   # C# data loading utilities
└── 📁 docs/                  # Additional documentation
```
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit Contributor License Agreements.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.