
Commit c3fc909

Merge 9127b78 into 1232088

2 files changed: 38 additions, 1 deletion

Workloads-Specific/DataScience/BestPractices.md

Lines changed: 37 additions & 0 deletions
@@ -13,8 +13,45 @@ Last updated: 2025-05-03
<details>
<summary><b>List of References</b> (Click to expand)</summary>

- [What is Data Science in Microsoft Fabric?](https://learn.microsoft.com/en-us/fabric/data-science/data-science-overview)
- [Data Science documentation in Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/data-science/)

</details>
<details>
<summary><b>Table of Contents</b> (Click to expand)</summary>

- [ML Model Management](#ml-model-management)
- [Experiment Tracking & Management](#experiment-tracking--management)
- [Reproducible Environments](#reproducible-environments)
- [Data Agent Preview Usage](#data-agent-preview-usage)

</details>
> Ensure that your data science workflows in Microsoft Fabric are built for rapid experimentation, efficient model management, and seamless deployment. Each element should be managed with clear versioning, detailed documentation, and reproducible environments, enabling a smooth transition from experimentation to production.
## ML Model Management
> Use model registries integrated within Fabric to store and version your models. Include a descriptive README, link relevant experiment IDs, and attach performance metrics such as accuracy, AUC, and confusion matrices. For example, link your production-ready model (v#.#) from a registered repository along with its associated validation metrics and deployment instructions.
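As a sketch of the metadata such a registry entry should carry, here is a minimal stand-in written in plain Python (the class, field names, and sample values are illustrative only, not Fabric's actual registry API):

```python
from dataclasses import dataclass, field

@dataclass
class RegisteredModel:
    """Illustrative model-registry entry: version, lineage, metrics, docs."""
    name: str
    version: str                                  # e.g. "v1.0"
    experiment_id: str                            # links back to the training run
    metrics: dict = field(default_factory=dict)   # accuracy, AUC, ...
    readme: str = ""                              # descriptive documentation

registry: dict = {}

def register(model: RegisteredModel) -> None:
    """Store the model under its (name, version) key."""
    registry[(model.name, model.version)] = model

register(RegisteredModel(
    name="churn-classifier",
    version="v1.0",
    experiment_id="exp-42",
    metrics={"accuracy": 0.91, "auc": 0.95},
    readme="Gradient-boosted churn model; see exp-42 for training details.",
))
```

Keying entries by `(name, version)` means a new version never overwrites a production model, which is the property a real registry gives you.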
## Experiment Tracking & Management
> Set up an experiment dashboard that automatically logs training runs. For instance, record runs with various hyperparameter combinations, tag them with unique identifiers, and visualize comparative metrics over multiple iterations. This dashboard can help you decide whether a model trained with early stopping or one with higher epochs best meets performance goals.
<https://github.com/user-attachments/assets/4c73eaaa-cf03-47cf-807b-69007c8df704>
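The logging pattern above can be sketched in plain Python (a toy in-memory tracker, not Fabric's experiment API; the metric values are fabricated to make the comparison concrete):

```python
import itertools
import uuid

runs = []

def log_run(params: dict, val_auc: float) -> str:
    """Record one training run with a unique identifier tag."""
    run_id = uuid.uuid4().hex[:8]
    runs.append({"id": run_id, "params": params, "val_auc": val_auc})
    return run_id

# Log a grid of hyperparameter combinations, as a dashboard would record them.
for lr, epochs in itertools.product([0.01, 0.1], [10, 50]):
    # Stand-in for a real training call; this synthetic score just rewards
    # higher learning rate and more epochs so the comparison has a winner.
    log_run({"lr": lr, "epochs": epochs}, val_auc=0.80 + lr + epochs / 1000)

# Comparative view: pick the run that best meets the performance goal.
best = max(runs, key=lambda r: r["val_auc"])
```

Tagging each run with a unique identifier is what lets you tie a registered model back to the exact training configuration that produced it.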
## Reproducible Environments
> Create an environment file (e.g., Conda `environment.yml`) that lists all required Python packages and their versions. For example, specify TensorFlow 2.9, scikit-learn 1.0, and other dependencies so that every data scientist and deployment pipeline uses the exact setup. Use Microsoft Fabric workspaces to segregate development and production environments, ensuring that models are trained and evaluated in a consistent setting.
<https://github.com/user-attachments/assets/fcce754d-afd3-4267-aa0f-bba87c0a3089>
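A minimal `environment.yml` along these lines might look as follows (the environment name, channel, and exact package pins are illustrative; adjust them to your workspace):

```yaml
# Illustrative Conda environment pinning the versions named above.
name: fabric-ds
channels:
  - conda-forge
dependencies:
  - python=3.10
  - tensorflow=2.9
  - scikit-learn=1.0
  - pandas
  - pip
```

Committing this file alongside the code means every data scientist and every deployment pipeline recreates the same environment with `conda env create -f environment.yml`.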
## Data Agent (Preview) Usage
> Integrate the Data Agent into your pipeline to automatically validate incoming datasets for completeness and consistency. For instance, set up rules that flag missing data or out-of-range values and trigger notifications when anomalies are detected. Track and document these incidents to help refine the agent’s calibration, ensuring that data passing to your experiments meets quality standards.
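The kind of rules described above can be sketched as a small validation pass (plain Python, illustrative only; the column names, ranges, and sample rows are hypothetical, not the Data Agent's actual interface):

```python
def validate(rows, required, ranges):
    """Flag rows with missing required fields or out-of-range values."""
    incidents = []
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                incidents.append((i, col, "missing"))
        for col, (lo, hi) in ranges.items():
            val = row.get(col)
            if val is not None and not lo <= val <= hi:
                incidents.append((i, col, "out-of-range"))
    return incidents

rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},   # missing value -> flagged
    {"age": 180, "income": 51000},    # impossible age -> flagged
]
incidents = validate(rows, required=["age", "income"], ranges={"age": (0, 120)})
```

Each flagged incident carries the row index, column, and failure type, which is exactly the record you would log and later use to refine the agent's calibration.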
Click to read [Demonstration: Data Agents in Microsoft Fabric](./Data_Agents.md).
<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>

Workloads-Specific/DataWarehouse/Medallion_Architecture/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
- # Demostration: Medallion Architecture Overview
+ # Demonstration: Medallion Architecture Overview

Costa Rica
