Merged
60 commits
94ef8db
best practices overview all workloads
brown9804 May 2, 2025
f5f0d14
Merge 94ef8db289d5e61f29720cb0fae27e9f058e5b98 into 390439cde3ad0430b…
brown9804 May 2, 2025
d8fc9d1
missing | table
brown9804 May 2, 2025
b144326
Merge d8fc9d1ad61a0f3d258ee1cccb21f31e236e5abf into 390439cde3ad0430b…
brown9804 May 2, 2025
6e48349
cleaning
brown9804 May 2, 2025
962b839
Merge 6e48349cb54a90987bf3af26cdca0c7f8472874e into 390439cde3ad0430b…
brown9804 May 2, 2025
dee82f1
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
34fe74e
data warehouse best practices init
brown9804 May 2, 2025
f0bf0c8
Merge 34fe74ec2d1f1ae202b53de03febaeb32c740db3 into 390439cde3ad0430b…
brown9804 May 2, 2025
330293f
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
e9aee72
data science best practices init
brown9804 May 2, 2025
ba3389e
Merge e9aee728355c48ffe4b8fa0e41d9a70411a27cda into 390439cde3ad0430b…
brown9804 May 2, 2025
6c8dee5
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
35baf05
real time intelligence init
brown9804 May 2, 2025
388e821
Merge 35baf056450b6ec4b5a61a0c8bed51dc60e97137 into 390439cde3ad0430b…
brown9804 May 2, 2025
a74e2b7
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
0e72d10
general copilot best practices
brown9804 May 2, 2025
53a7526
Merge 0e72d1087b4b9c8e30f0a1e7b09ec13830f702a6 into 390439cde3ad0430b…
brown9804 May 2, 2025
727ba0f
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
f11becd
purview best practices in fabric
brown9804 May 2, 2025
6a7f3b8
Merge f11becd14ddd91f8a63b2ae88fbda8996f326e90 into 390439cde3ad0430b…
brown9804 May 2, 2025
cabaa6c
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
b4b5fb9
onelake best practices
brown9804 May 2, 2025
5e12ca4
Merge b4b5fb9fc769c9c33cd6e2d5ed1dc5ea62256810 into 390439cde3ad0430b…
brown9804 May 2, 2025
d363089
Fix Markdown syntax issues
github-actions[bot] May 2, 2025
2b6381f
mved
brown9804 May 2, 2025
6491baf
moved
brown9804 May 2, 2025
b4d0267
Merge 6491baf000ba76d490035f3bde4c14a5f20c9187 into 390439cde3ad0430b…
brown9804 May 2, 2025
dfb9bc2
purview for fabric
brown9804 May 2, 2025
a5a3744
Merge dfb9bc28f35f6cedc4af7e4cb246e1fd5865fdc3 into 390439cde3ad0430b…
brown9804 May 2, 2025
fec82d4
data warehouse in progres
brown9804 May 2, 2025
7383699
Merge fec82d4910eb8e1b75f07a588ce37992ce92c1ed into 390439cde3ad0430b…
brown9804 May 2, 2025
d40a955
visual guindace of workload
brown9804 May 3, 2025
580ac79
Merge d40a95539cb03c769ae2280595e794f84d902552 into 390439cde3ad0430b…
brown9804 May 3, 2025
907af15
Update last modified date in Markdown files
github-actions[bot] May 3, 2025
1bddb81
adf visual guindace
brown9804 May 3, 2025
553afca
Merge 1bddb81495878fce431c8afd3ad0ed7ca1322c9c into 390439cde3ad0430b…
brown9804 May 3, 2025
0532870
mirroring in progress
brown9804 May 3, 2025
8e000e3
Merge 05328706ef2d815b421068efade89a8e66f36f6b into 390439cde3ad0430b…
brown9804 May 3, 2025
5e4826d
Update last modified date in Markdown files
github-actions[bot] May 3, 2025
710f7fa
Update BestPractices.adding medallion arch
brown9804 May 3, 2025
5dd5ff5
Merge 710f7fa3d63fab916ad832eab4cd0d0e35dca099 into 390439cde3ad0430b…
brown9804 May 3, 2025
47c6815
adding medallion arch
brown9804 May 3, 2025
d10cdff
Merge 47c6815c838d643a807997d2d0780658390b8ec4 into 390439cde3ad0430b…
brown9804 May 3, 2025
1522998
Fix Markdown syntax issues
github-actions[bot] May 3, 2025
5d3b811
je wrong spelling
brown9804 May 3, 2025
b136f12
Merge 5d3b811cf155bd28efb38a52281eb07ce7a5678e into 390439cde3ad0430b…
brown9804 May 3, 2025
d6aaf8d
+ medallion arch
brown9804 May 3, 2025
dcbfd6c
Merge d6aaf8de574e742c3ae28ae739f5e5531993cd92 into 390439cde3ad0430b…
brown9804 May 3, 2025
0a9b886
Update last modified date in Markdown files
github-actions[bot] May 3, 2025
1c78f8c
Fix notebook format issues
github-actions[bot] May 3, 2025
6d5a867
medallion arch added
brown9804 May 3, 2025
3dc1245
Merge 6d5a8678e26a50f2ccba16327435b746658b6bdb into 390439cde3ad0430b…
brown9804 May 3, 2025
0fc5674
ref in place
brown9804 May 3, 2025
8126ea5
Merge 0fc5674496fb54714c604ee88dbec72f56b0534c into 390439cde3ad0430b…
brown9804 May 3, 2025
170b97b
warehouse is ready
brown9804 May 3, 2025
fada8e9
Merge 170b97bac146e3c7f26438e00346bdb0c18aaf02 into 390439cde3ad0430b…
brown9804 May 3, 2025
cdc674d
format pipeline mkd
brown9804 May 3, 2025
defbd5a
Merge cdc674d43601baa366a4fbbd456f03a511bacd27 into 390439cde3ad0430b…
brown9804 May 3, 2025
ef39076
Fix Markdown syntax issues
github-actions[bot] May 3, 2025
16 changes: 8 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ Costa Rica
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-05-02
Last updated: 2025-05-03

------------------------------------------

Expand Down Expand Up @@ -147,7 +147,7 @@ From [Microsoft Documentation](https://learn.microsoft.com/pt-br/fabric/fundamen
4. **You want to empower data consumers** (analysts, scientists, engineers) to discover and understand data assets easily.
5. **You are scaling your data operations** and need consistent governance policies across teams and projects.

Click to read more about [Microsoft Purview for Fabric - Overview](./Purview-Fabric.md).
Click to read more about [Microsoft Purview for Fabric - Overview](./Workloads-Specific/Purview/PurviewforFabric.md).

## Networking

Expand Down Expand Up @@ -201,13 +201,13 @@ Click to read more about [Microsoft Purview for Fabric - Overview](./Purview-Fab

- [Azure Data Factory (ADF) - Best Practices Overview](./Workloads-Specific/DataFactory/BestPractices.md)
- [Data Engineering - Best Practices Overview](./Workloads-Specific/DataEngineering/BestPractices.md)
- [Data Warehouse - Best Practices Overview]() - in progress
- [Data Science - Best Practices Overview]() - in progress
- [Real-Time Intelligence - Best Practices Overview]() - in progress
- [Data Warehouse - Best Practices Overview](./Workloads-Specific/DataWarehouse/BestPractices.md)
- [Data Science - Best Practices Overview](./Workloads-Specific/DataScience/BestPractices.md) - in progress
- [Real-Time Intelligence - Best Practices Overview](./Workloads-Specific/RealTimeIntelligence/BestPractices.md) - in progress
- [Power BI - Best Practices Overview](./Workloads-Specific/PowerBi/BestPractices.md)
- [Copilot - Best Practices Overview]() - in progress
- [Purview - Best Practices Overview]() - in progress
- [OneLake - Best Practices Overview]() - in progress
- [Copilot - Best Practices Overview](./Workloads-Specific/Copilot/BestPractices.md) - in progress
- [Purview - Best Practices Overview](./Workloads-Specific/Purview/BestPractices.md) - in progress
- [OneLake - Best Practices Overview](./Workloads-Specific/OneLake/BestPractices.md) - in progress

<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
Expand Down
21 changes: 21 additions & 0 deletions Workloads-Specific/Copilot/BestPractices.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# Copilot - Best Practices Overview

Costa Rica

[![GitHub](https://badgen.net/badge/icon/github?icon=github&label)](https://github.com)
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-05-03

----------

<details>
<summary><b>List of References</b> (Click to expand)</summary>

</details>

<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>
4 changes: 2 additions & 2 deletions Workloads-Specific/DataEngineering/BestPractices.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ Costa Rica
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-05-02
Last updated: 2025-05-03

----------

Expand Down Expand Up @@ -61,7 +61,7 @@ Last updated: 2025-05-02
- **Comprehensive Schema Documentation:** Create detailed, auto-generated documentation for every endpoint; include sample queries, expected responses, and precise error messages to aid developer understanding.
- **Robust Error Handling:** Implement consistent, informative error responses and integrate thorough test suites to guarantee smooth operation and backward compatibility as the API evolves.

https://github.com/user-attachments/assets/8971651d-9aff-4b41-94ca-9a35b9241f22
<https://github.com/user-attachments/assets/8971651d-9aff-4b41-94ca-9a35b9241f22>

<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
Expand Down
6 changes: 5 additions & 1 deletion Workloads-Specific/DataFactory/BestPractices.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ Costa Rica
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-05-02
Last updated: 2025-05-03

----------

Expand Down Expand Up @@ -56,6 +56,9 @@ Last updated: 2025-05-02

</details>

<div align="center">
<img src="https://github.com/user-attachments/assets/658689cd-f045-491f-996c-e64e4008acd1" alt="Centered Image" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>

## Clear Pipeline Structure

Expand Down Expand Up @@ -364,6 +367,7 @@ graph TD
## Source Control

> Benefits of Git Integration: <br/>
>
> - **Version Control**: Track and audit changes, and revert to previous versions if needed. <br/>
> - **Collaboration**: Multiple team members can work on the same project simultaneously. <br/>
> - **Incremental Saves**: Save partial changes without publishing them live. <br/>
Expand Down
21 changes: 21 additions & 0 deletions Workloads-Specific/DataScience/BestPractices.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# Data Science - Best Practices Overview

Costa Rica

[![GitHub](https://badgen.net/badge/icon/github?icon=github&label)](https://github.com)
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-05-03

----------

<details>
<summary><b>List of References</b> (Click to expand)</summary>

</details>

<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>
89 changes: 89 additions & 0 deletions Workloads-Specific/DataWarehouse/BestPractices.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# Data Warehouse - Best Practices Overview

Costa Rica

[![GitHub](https://badgen.net/badge/icon/github?icon=github&label)](https://github.com)
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-05-03

----------

> Ensure that your data warehouse solution is engineered for scalability, resilience, and efficient integration of diverse data sources. Every component (from the core warehouse to mirrored databases) should adhere to strict best practices for structure, documentation, and management, ensuring long-term maintainability and robust disaster recovery.

<details>
<summary><b>List of References</b> (Click to expand)</summary>

- [Ingest data into the Warehouse](https://learn.microsoft.com/en-us/fabric/data-warehouse/ingest-data)
- [Performance guidelines in Fabric Data Warehouse](https://learn.microsoft.com/en-us/fabric/data-warehouse/guidelines-warehouse-performance)

</details>

<details>
<summary><b>Table of Contents</b> (Click to expand)</summary>

- [Sample Warehouse Environment](#sample-warehouse-environment)
- [Structured Warehouse Implementation](#structured-warehouse-implementation)
- [Interactive Notebooks for Data Warehousing](#interactive-notebooks-for-data-warehousing)
- [Using Mirroring to Your Benefit](#using-mirroring-to-your-benefit)

</details>

<div align="center">
<img src="https://github.com/user-attachments/assets/47c01e2a-48aa-4bc5-9a0f-fd2630618687" alt="Centered Image" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>

## Sample Warehouse Environment

> Develop an isolated sample warehouse to prototype, test, and train on the data warehouse structure. This environment mimics the production warehouse architecture but contains a representative subset of data. Its purpose is to validate new queries, ETL routines, and performance tuning while insulating production operations from potential disruptions. You can deploy a sample warehouse using anonymized or synthetic data. For example, use a smaller, mirrored version of the production warehouse structure to experiment with SQL queries, develop new ETL pipelines, or train team members without impacting live data and processes.

<https://github.com/user-attachments/assets/acaecdd1-e81c-4e3a-b14a-db054f700f3e>
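As a rough illustration of the "anonymized subset" idea above, the following sketch draws a small random sample from a set of rows and replaces an identifying column with a hashed token. Everything here is hypothetical: the column names (`order_id`, `customer_email`, `amount`), the `demo-salt` value, and the in-memory list standing in for a production table are assumptions, not part of any Fabric API. A salted hash is only a simple stand-in for whatever anonymization policy your organization requires.

```python
import hashlib
import random


def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    """Replace an identifier with a stable 12-character token (illustrative only)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def build_sample(rows, sample_size=10, seed=42):
    """Pick a reproducible random subset and mask the email column."""
    rng = random.Random(seed)  # fixed seed so the sample can be rebuilt
    chosen = rng.sample(rows, min(sample_size, len(rows)))
    # Build new dicts so the source rows are left untouched.
    return [
        {**row, "customer_email": pseudonymize(row["customer_email"])}
        for row in chosen
    ]


# Hypothetical stand-in for rows read from the production warehouse.
production_rows = [
    {"order_id": i, "customer_email": f"user{i}@example.com", "amount": 10.0 * i}
    for i in range(1, 101)
]

sample_rows = build_sample(production_rows)
print(len(sample_rows), "rows in the sample warehouse extract")
```

The fixed seed keeps the sample reproducible, which helps when the same subset must be reloaded after the sample warehouse is rebuilt.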

## Structured Warehouse Implementation

> Build a robust, centralized data warehouse that organizes data into well-defined layers (often referred to as Bronze, Silver, and Gold). Layering the data warehouse ensures fast query performance, streamlined management, and strong governance. Leverage proper indexing, partitioning schemes, metadata tagging, and lineage tracking to support compliance and facilitate troubleshooting.

Create a warehouse solution that segments data as follows:

- Bronze Layer: Ingests raw, untransformed data, preserving source fidelity.
- Silver Layer: Applies data cleansing, validation, and enrichment.
- Gold Layer: Produces analytics-ready data using optimized storage formats like Parquet or Delta Lake, with partitioning by date or region. Integrate metadata catalogs and RBAC controls for added governance.
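The three layers above can be sketched in plain Python to show what each step is responsible for. This is a minimal, assumption-laden illustration: the row fields, the validation rule (drop duplicates and unparseable amounts), and the aggregation by date and region are hypothetical choices, and a real implementation would run as SQL or Spark against warehouse tables rather than in-memory lists.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as they arrived from the source (all strings).
bronze = [
    {"order_id": "1", "region": "emea", "amount": "120.50", "order_date": "2025-05-01"},
    {"order_id": "2", "region": "AMER", "amount": "80.00", "order_date": "2025-05-01"},
    {"order_id": "2", "region": "AMER", "amount": "80.00", "order_date": "2025-05-01"},
    {"order_id": "3", "region": "emea", "amount": "not-a-number", "order_date": "2025-05-02"},
]


def to_silver(rows):
    """Silver: deduplicate, validate, and normalize types and casing."""
    seen = set()
    silver = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # drop duplicate order
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail validation
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].upper(),
            "amount": amount,
            "order_date": row["order_date"],
        })
    return silver


def to_gold(rows):
    """Gold: analytics-ready totals grouped by date and region."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["order_date"], row["region"])] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
for (order_date, region), total in sorted(gold.items()):
    print(order_date, region, total)
```

Keeping Bronze untouched means the Silver and Gold steps can be rerun whenever the cleansing rules change.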

> Here is a [reference of a medallion architecture using only Fabric](./Medallion_Architecture/). If you need to handle `complex data transformations and large-scale data processing`, you can use our combined solution of **Fabric + Databricks**. This powerful combination leverages the strengths of both platforms to provide a robust data processing pipeline. This workshop on [Fabric with Databricks for Data Analytics](https://microsoft.github.io/TechExcel-Fabric-with-Databricks-for-Data-Analytics/) offers a comprehensive step-by-step guide on developing Medallion Architecture using Fabric and Databricks.

| Medallion Architecture using only Fabric | Medallion Architecture Fabric + Databricks |
| --- | --- |
| <img width="550" alt="image" src="https://github.com/user-attachments/assets/b4394d54-9bb0-453b-abf8-cfaaa8e532d2" /> | <img width="550" alt="image" src="https://github.com/user-attachments/assets/c866098c-ffd1-4438-bc77-565786c91601"> |

## Interactive Notebooks for Data Warehousing

> Use interactive notebooks as exploratory and documentation tools for your warehouse operations. These notebooks serve as an effective interface for testing queries, performing data analysis, and capturing transformation logic. Rich markdown annotations, code segmentation, and version control increase collaboration while ensuring reproducibility across the team.

Create notebooks that are segmented into distinct sections:

- Data Loading: Scripts to pull data from the warehouse.
- Data Transformation: Blocks that illustrate cleaning and enrichment steps.
- Analysis & Visualization: SQL queries and charts generated from warehouse data, supplemented with detailed markdown explanations and inline comments to clarify business logic.
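A notebook laid out along these three sections might look like the sketch below. It uses an in-memory SQLite database purely as a stand-in for a warehouse connection so the example runs anywhere; the `sales` table, its columns, and the `# %%` cell markers are assumptions for illustration, not Fabric notebook requirements.

```python
import sqlite3

# %% Data Loading
# Stand-in for a warehouse connection; a real notebook would connect to the warehouse instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 120.5), ("AMER", 80.0), ("EMEA", None)],
)
rows = conn.execute("SELECT region, amount FROM sales").fetchall()

# %% Data Transformation
# Cleaning step: drop rows with a missing amount before any aggregation.
clean = [(region, amount) for region, amount in rows if amount is not None]

# %% Analysis & Visualization
# Business logic: total sales per region, printed as a simple text table.
totals = {}
for region, amount in clean:
    totals[region] = totals.get(region, 0.0) + amount

for region in sorted(totals):
    print(f"{region}: {totals[region]:.2f}")

conn.close()
```

Separating the cells this way lets a reviewer rerun only the transformation or analysis step without reloading the data.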

## Using Mirroring to Your Benefit

> Mirroring offers a modern, efficient way to continuously and seamlessly access and ingest data from operational databases or data warehouses. It works by replicating a snapshot of the source database into OneLake, and then keeping that replica in near real-time sync with the original. This ensures that your data is always up to date and readily available for analytics or downstream processing. `As part of the value offering, each Fabric compute SKU includes a built-in allowance of free Mirroring storage, proportional to the compute capacity you provision. For example, provisioning an F64 SKU grants you 64 terabytes of free Mirroring storage. You only begin incurring OneLake storage charges if your mirrored data exceeds this free limit or if the compute capacity is paused.` Click [here](https://azure.microsoft.com/en-us/pricing/details/microsoft-fabric/?msockid=38ec3806873362243e122ce086486339) to read more about it.

<div align="center">
<img src="https://github.com/user-attachments/assets/ed868665-1823-42ff-9cd7-d0ee3310c184" alt="Centered Image" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>
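The free Mirroring storage allowance described above comes down to simple arithmetic, sketched here. Two assumptions are baked in and should be checked against the pricing page: that the allowance scales as 1 TB per capacity unit (generalized from the F64 to 64 TB example), and that all mirrored data becomes billable while the capacity is paused. The function names are hypothetical helpers, not part of any Fabric SDK.

```python
def free_mirroring_storage_tb(sku: str) -> int:
    """Free allowance in TB, assuming 1 TB per capacity unit (e.g. F64 -> 64)."""
    if not sku.upper().startswith("F") or not sku[1:].isdigit():
        raise ValueError(f"Unexpected SKU format: {sku!r}")
    return int(sku[1:])


def billable_mirrored_tb(sku: str, mirrored_tb: float, capacity_paused: bool = False) -> float:
    """Mirrored storage (TB) that would incur OneLake storage charges."""
    if capacity_paused:
        # Assumption: the free allowance does not apply while the capacity is paused.
        return float(mirrored_tb)
    return max(0.0, mirrored_tb - free_mirroring_storage_tb(sku))


print(free_mirroring_storage_tb("F64"))                        # free allowance for F64
print(billable_mirrored_tb("F64", 70))                         # data above the allowance
print(billable_mirrored_tb("F64", 10, capacity_paused=True))   # paused capacity
```

A check like this can run alongside capacity monitoring to flag when mirrored data approaches the free limit.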

| **Mirroring Option** | Details |
|--------------------------------------------------|--------------------|
| **Mirrored Azure SQL Database** | Configure a mirrored Azure SQL Database with geo-redundancy and automatic failover. For example, use Azure’s built-in replication to maintain a secondary copy that seamlessly takes over during primary instance outages, ensuring continuous data availability. |
| **Mirrored Snowflake** | Deploy a Snowflake mirror by setting up data replication between your primary instance and a secondary environment. Regularly validate synchronization and monitor rollback capabilities to confirm that the mirror remains current and can support operations during failover or testing cycles. |
| **Mirrored Azure Cosmos DB** | Configure an Azure Cosmos DB mirroring setup in preview mode that replicates data across multiple regions. Test the environment by simulating high-load queries and failover events to ensure that global access is maintained with minimal latency. |
| **Mirrored Azure Database for PostgreSQL** | Set up a mirrored Azure Database for PostgreSQL in its preview configuration. Create read replicas with continuous synchronization, perform failover drills, and track replication latency to guarantee that the mirrored instance maintains data integrity and high availability during operational stress. |
| **Mirrored Azure SQL Managed Instance** | Configure an Azure SQL Managed Instance in a mirrored setup using strategies like log shipping or transactional replication. Monitor key performance metrics to ensure that replication latency is minimal, and the mirror is capable of supporting a swift transition during outages or maintenance windows. |
| **Mirrored Database** | Set up a mirrored database configuration that synchronizes periodically with a primary instance. Schedule automated tests and synchronization checks, and simulate failover events to validate that the data remains consistent, with built-in alerts and monitoring demonstrating the mirror’s readiness for production use. |

<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>