
Commit 0899c45

Merge pull request #47969 from TheJamesHerring/DP-700-Ignite-orchestrate-processes-in-fabric
DP700-Ignite-new-module-
2 parents 9049048 + 72ccb67 commit 0899c45

21 files changed: +344, -0 lines changed
Lines changed: 15 additions & 0 deletions

### YamlMime:ModuleUnit
uid: learn.wwl.orchestrate-processes-in-fabric.introduction
title: Introduction
metadata:
  title: Introduction
  description: "Introduction"
  ms.date: 11/14/2024
  author: jamesh
  ms.author: jamesh
  ms.topic: unit
  ms.custom:
  - DP-700
durationInMinutes: 3
content: |
  [!include[](includes/1-introduction.md)]
Lines changed: 15 additions & 0 deletions

### YamlMime:ModuleUnit
uid: learn.wwl.orchestrate-processes-in-fabric.choose-between-pipeline-notebook
title: Choose between a pipeline and a notebook
metadata:
  title: Choose between a pipeline and a notebook
  description: "Choose between a data pipeline and a notebook in Microsoft Fabric."
  ms.date: 11/14/2024
  author: jamesh
  ms.author: jamesh
  ms.topic: unit
  ms.custom:
  - DP-700
durationInMinutes: 10
content: |
  [!include[](includes/2-choose-between-pipeline-notebook.md)]
Lines changed: 15 additions & 0 deletions

### YamlMime:ModuleUnit
uid: learn.wwl.orchestrate-processes-in-fabric.design-schedules-and-event-based-triggers
title: Design schedules and event-based triggers
metadata:
  title: Design schedules and event-based triggers
  description: "Design pipeline schedules and event-based triggers."
  ms.date: 11/14/2024
  author: jamesh
  ms.author: jamesh
  ms.topic: unit
  ms.custom:
  - DP-700
durationInMinutes: 15
content: |
  [!include[](includes/3-design-schedules-and-event-based-triggers.md)]
Lines changed: 15 additions & 0 deletions

### YamlMime:ModuleUnit
uid: learn.wwl.orchestrate-processes-in-fabric.exercise-implement-dynamic-patterns-to-notebooks
title: Implement and schedule a dynamic notebook in a Fabric pipeline
metadata:
  title: Implement and schedule a dynamic notebook in a Fabric pipeline
  description: "Implement and schedule a dynamic notebook in a Fabric pipeline."
  ms.date: 11/14/2024
  author: jamesh
  ms.author: jamesh
  ms.topic: unit
  ms.custom:
  - DP-700
durationInMinutes: 15
content: |
  [!include[](includes/4-exercise-implement-dynamic-patterns-to-notebooks.md)]
Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
### YamlMime:ModuleUnit
2+
uid: learn.wwl.orchestrate-processes-in-fabric.knowledge-check
3+
title: Knowledge Check
4+
metadata:
5+
title: Knowledge Check
6+
description: "Knowledge check"
7+
ms.date: 11/14/2024
8+
author: jamesh
9+
ms.author: jamesh
10+
ms.topic: unit
11+
ms.custom:
12+
- DP-700
13+
durationInMinutes: 3
14+
quiz:
15+
title: ""
16+
questions:
17+
- content: "What is a key feature of Dataflows in Microsoft Fabric?"
18+
choices:
19+
- content: "Dataflows only support data ingestion from a limited number of sources."
20+
isCorrect: false
21+
explanation: "Incorrect. Dataflows can ingest data from hundreds of sources."
22+
- content: "Dataflows offer a low-code interface to ingest and transform data from numerous sources."
23+
isCorrect: true
24+
explanation: "Correct. Dataflows provide a low-code interface for data ingestion from various sources and transformation using over 300 data transformations."
25+
- content: "Dataflows can't be run manually or on a schedule."
26+
isCorrect: false
27+
explanation: "Incorrect. Dataflows can be run manually, on a schedule, or as part of a data pipeline."
28+
- content: "What is the purpose of a storage event trigger in Fabric Data Factory pipelines?"
29+
choices:
30+
- content: "To automate data pipelines based on events occurring in storage accounts."
31+
isCorrect: true
32+
explanation: "Correct. Storage event triggers in Fabric Data Factory pipelines are used to automate data pipelines when specific events occur in storage accounts."
33+
- content: "To manually start a data pipeline run."
34+
isCorrect: false
35+
explanation: "Incorrect. Storage event triggers are used to automate data pipelines based on events, not for manual initiation."
36+
- content: "To schedule regular data pipeline runs."
37+
isCorrect: false
38+
explanation: "Incorrect. While scheduling is a feature of Fabric Data Factory, it isn't the function of storage event triggers."
39+
- content: "What is required to build an Eventhouse that contains a KQL database?"
40+
choices:
41+
- content: "A Microsoft Fabric trial license with the Fabric preview enabled in your tenant."
42+
isCorrect: true
43+
explanation: "Correct. A Microsoft Fabric trial license with the Fabric preview enabled in your tenant is required to build an Eventhouse that contains a KQL database."
44+
- content: "A Microsoft Office 365 subscription."
45+
isCorrect: false
46+
explanation: "Incorrect. A Microsoft Office 365 subscription alone isn't sufficient to build an Eventhouse with a KQL database."
47+
- content: "A Google Cloud trial license with the Fabric previews enabled."
48+
isCorrect: false
49+
explanation: "Incorrect. Google Cloud trial license isn't required for building an Eventhouse with a KQL database."
Lines changed: 15 additions & 0 deletions

### YamlMime:ModuleUnit
uid: learn.wwl.orchestrate-processes-in-fabric.summary
title: Summary
metadata:
  title: Summary
  description: "Summary"
  ms.date: 11/14/2024
  author: jamesh
  ms.author: jamesh
  ms.topic: unit
  ms.custom:
  - DP-700
durationInMinutes: 1
content: |
  [!include[](includes/6-summary.md)]
Lines changed: 22 additions & 0 deletions

This module covers units that provide an overview of the data orchestration and data movement technologies within Microsoft Fabric. It introduces the scheduling and event-based triggering capabilities that help data engineers design proactive systems to support the rapidly changing data needs of the business.

## Pipelines in Microsoft Fabric

Data Factory provides a modern data integration experience to ingest, prepare, and transform data from various sources like databases, data warehouses, Lakehouse, and real-time data. It supports both citizen and professional developers with intelligent transformations and a rich set of activities. Users can create pipelines to execute multiple activities, access data sources through linked services, and add triggers to automate processes. Data Factory in Microsoft Fabric introduces Fast Copy capabilities for rapid data movement between data stores, enabling efficient data transfer to Lakehouse and Data Warehouse for analytics.

## Notebooks in Microsoft Fabric

Notebooks in Microsoft Fabric offer a versatile environment for data exploration, transformation, and analysis. They support various programming languages, including Python, KQL, and SQL, and provide an interactive interface for running code, visualizing data, and documenting workflows.

## Schedules and Triggers in Microsoft Fabric

Pipeline scheduling supports three standard types of scheduling: time-based, event-based, and custom. Scheduling in Microsoft Fabric also integrates seamlessly with other services, which allows for more dynamic, event-driven pipelines.

The articles covered in this module include:

- Understanding pipelines in Microsoft Fabric.
- Understanding notebooks in Microsoft Fabric.
- Event-based triggers and scheduling in Microsoft Fabric.
- Practicing some dynamic features of Microsoft Fabric notebooks.

Students learn how Data Factory in Microsoft Fabric is used for modern data integration, including ingesting, preparing, and transforming data from various sources. They understand how to create pipelines, automate processes with triggers, and use Fast Copy for rapid data movement to Lakehouse and Data Warehouse. Additionally, they explore notebooks in Microsoft Fabric for interactive data exploration, multi-language support, data visualization, and collaboration. The training provides an overview of integration with other Microsoft Fabric services, event-based triggers, scheduling, and dynamic features of notebooks, giving students an understanding of data workflows within Microsoft Fabric.
Lines changed: 63 additions & 0 deletions

## Data Factory Pipelines in Microsoft Fabric

Data Factory provides a modern way to integrate data, allowing you to collect, prepare, and transform data from various sources like databases, data warehouses, Lakehouse, real-time data, and more. Whether you're a beginner or an experienced developer, you can use intelligent transformations and a wide range of activities to process your data.

### Dataflows overview

Dataflows offer a low-code interface to ingest data from numerous sources and transform it using over 300 data transformations. The transformed data can be loaded into various destinations, such as Azure SQL databases. Dataflows can be run manually, on a schedule, or as part of a data pipeline.

### Important features of Dataflows

- **Low-Code Interface**: Ingest data from hundreds of sources.
- **Transformations**: Utilize 300+ data transformations.
- **Destinations**: Load data into multiple destinations like Azure SQL databases.
- **Execution**: Run dataflows manually, on a schedule, or within a data pipeline.

### Power Query integration

Dataflows are built using the Power Query experience, available across Microsoft products like Excel, Power BI, and Power Platform. Power Query enables users, from beginners to professionals, to perform data ingestion and transformations with ease. It supports joins, aggregations, data cleansing, custom transformations, and more, all through a user-friendly, visual, low-code interface.
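
The kinds of shaping a dataflow performs visually (joins, aggregations, and data cleansing) can also be expressed in code. The following minimal Python (pandas) sketch illustrates the same pattern; the file names and column names are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical source files; a real dataflow would connect to one of its many supported sources.
sales = pd.read_csv("sales_west.csv")      # assumed columns: OrderId, Region, Amount
customers = pd.read_csv("customers.csv")   # assumed columns: OrderId, Segment

# Data cleansing: drop duplicate orders and rows with a missing amount.
sales = sales.drop_duplicates(subset="OrderId").dropna(subset=["Amount"])

# Join: enrich sales with customer attributes.
enriched = sales.merge(customers, on="OrderId", how="left")

# Aggregation: total sales by region and customer segment.
summary = enriched.groupby(["Region", "Segment"], as_index=False)["Amount"].sum()

print(summary.head())
```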

### Real-World Use Cases for Dataflows

**Data Consolidation for Reporting**:
Organizations often have data spread across multiple sources such as databases, cloud storage, and on-premises systems. Dataflows can be used to consolidate this data into a single, unified dataset, which can then be used for reporting and analytics. For example, a company might use Dataflows to combine sales data from different regions into a single dataset for a comprehensive sales report. This single dataset can be further curated and promoted into a semantic model for use by a larger audience.

**Data Preparation for Machine Learning**:
Dataflows can be used to prepare and clean data for machine learning models. This method includes tasks such as data cleansing, transformation, and feature engineering. For instance, a data science team might use Dataflows to preprocess customer data, removing duplicates and normalizing values before feeding it into a machine learning model.

**Real-Time Data Processing**:
Dataflows can handle real-time data ingestion and transformation, making them ideal for scenarios where timely data processing is crucial. For example, an e-commerce platform might use Dataflows to process real-time transaction data, updating inventory levels and generating real-time sales reports.

**Data Migration**:
When migrating data from legacy systems to modern platforms, Dataflows can be used to extract, transform, and load (ETL) data into the new system. This process ensures that data is accurately and efficiently transferred, minimizing downtime and data loss. For instance, a company migrating from an on-premises database to Azure SQL Database might use Dataflows to handle the data migration process.

**Self-Service Data Preparation**:
Dataflows provide a low-code interface that allows business users to prepare their own data without needing extensive technical knowledge. This approach empowers users to create their own dataflows for tasks such as data cleansing, transformation, and enrichment, reducing the dependency on IT teams. For example, a marketing team might use Dataflows to prepare campaign data for analysis.

These use cases demonstrate the flexibility and power of Dataflows in handling various data integration and transformation tasks, and they show a powerful self-service feature. Self-service might be more appealing to your organization's business users while still providing a roadmap to a larger ELT project that utilizes pipelines and notebooks.

### Data Pipelines

Data pipelines offer powerful workflow capabilities at cloud scale, enabling you to build complex workflows that can refresh your dataflow, move petabyte-sized data, and define sophisticated control flow pipelines.

### Important features of data pipelines

- **Complex Workflows**: Build workflows that can refresh dataflows, move large volumes of data, and define control flow pipelines.
- **ETL and Data Factory Workflows**: Create complex ETL (extract, transform, load) and Data Factory workflows that perform various tasks at scale.
- **Control Flow Capabilities**: Use built-in control flow features to build workflow logic with loops and conditionals.

### End-to-End ETL Data Pipeline

Combine a configuration-driven copy activity with your low-code dataflow refresh in a single pipeline for a complete ETL data pipeline. You can also add code-first activities for Spark Notebooks, SQL scripts, stored procedures, and more.
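
To give a sense of what a code-first notebook activity in such a pipeline might contain, here is a minimal PySpark sketch that reads a raw lakehouse table, applies a transformation, and writes a curated Delta table. The table and column names are assumptions, and the snippet presumes it runs in a Fabric notebook where a `spark` session is already available.

```python
from pyspark.sql import functions as F

# Assumption: the notebook's default lakehouse contains a raw table named "sales_raw".
raw_df = spark.read.table("sales_raw")

# Transform: standardize a column name, filter out invalid rows, and stamp the load date.
curated_df = (
    raw_df
    .withColumnRenamed("order_amount", "Amount")
    .filter(F.col("Amount") > 0)
    .withColumn("LoadDate", F.current_date())
)

# Load: write the curated data back to the lakehouse as a Delta table
# that a downstream dataflow, warehouse, or report can consume.
curated_df.write.mode("overwrite").format("delta").saveAsTable("sales_curated")
```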

## Notebooks in Microsoft Fabric

- **Interactive Data Exploration:** Notebooks allow users to interactively explore and analyze data, making it easier to understand and manipulate datasets.
- **Multi-language Support:** Users can write and execute code in multiple languages within the same notebook, enhancing flexibility and collaboration.
- **Visualization:** Notebooks support rich data visualization, enabling users to create charts, graphs, and other visual representations of data.
- **Collaboration:** Notebooks facilitate collaboration by allowing multiple users to work on the same document simultaneously, share insights, and track changes.
- **Integration with Fabric Services:** Notebooks seamlessly integrate with other Microsoft Fabric services, such as Data Factory, Synapse Data Engineering, and Synapse Data Science. This approach provides a unified platform for end-to-end data workflows.

When comparing these technologies, it's important to note that while Data Factory focuses on data integration and pipeline automation, notebooks in Microsoft Fabric provide an interactive and *collaborative* environment for data exploration, documentation, transformation, and analysis. Both tools complement each other, offering a comprehensive solution for managing and analyzing data within the Microsoft Fabric ecosystem.
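
A detail that makes pipelines and notebooks work well together is that a notebook can expose parameters that a pipeline's notebook activity overrides at run time; this is the dynamic pattern practiced later in this module. The sketch below is illustrative only: the parameter and table names are assumptions, and the first cell would be designated as a parameter cell in the notebook so a pipeline can inject values.

```python
# --- Parameter cell (designated as such in the notebook) ---
# Defaults used for interactive runs; a pipeline notebook activity
# can override these values when it invokes the notebook.
source_table = "sales_raw"
run_date = "2024-11-14"

# --- Processing cell ---
from pyspark.sql import functions as F

df = spark.read.table(source_table)
daily = df.filter(F.col("OrderDate") == run_date)

print(f"{daily.count()} rows found in {source_table} for {run_date}")
```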
Lines changed: 76 additions & 0 deletions

## Scheduling and event-based triggers

Within Fabric, you have several options for scheduling jobs, some traditional and some more proactive in nature. We see this with items such as pipelines, which support both on-demand and scheduled *runs*. The more proactive runs are event-driven: they're triggered by an event, such as a file arriving in a location, and allow your data and information to be processed based on your business needs and events rather than waiting for the next scheduled run.

## Pipeline runs

A **data pipeline run** happens when you start a data pipeline. This action means all the tasks in the pipeline are carried out until they're finished.

For example, if you have a task to **copy data**, running the pipeline performs that task and copies your data. Each time you run a pipeline, it gets a special identifier called a **pipeline run ID**.

You can start a data pipeline in two ways:

1. **On-demand**: You manually trigger it whenever you need it.
2. **Scheduled**: You set it to run automatically at specific times and frequencies that you choose.

### On-demand pipeline runs

Just as you would expect, on-demand, or ad-hoc, runs happen by browsing to the specific pipeline and selecting the **Run** button. You're prompted to save your changes, the pipeline receives a *pipeline run ID*, and you can then view the status of the run by selecting the **Output** tab.

[![Screenshot of pipeline runs with on-demand action and their output.](../media/pipeline-run-output-jobs.png)](../media/pipeline-run-output-jobs-expanded.png#lightbox)
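
An on-demand run can also be started programmatically rather than from the portal. The following Python sketch assumes the Fabric REST API's on-demand item job endpoint and a valid Microsoft Entra access token; the IDs are placeholders and the endpoint shape is an assumption, so treat this as an illustration of the pattern rather than a verified call.

```python
import requests

# Placeholders; a real script would acquire a token (for example, with MSAL)
# and look up the workspace and pipeline item IDs.
workspace_id = "<workspace-guid>"
pipeline_id = "<pipeline-item-guid>"
access_token = "<entra-access-token>"

# Assumed endpoint shape for running an item job on demand (jobType=Pipeline).
url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
    f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline"
)

response = requests.post(url, headers={"Authorization": f"Bearer {access_token}"})

# The service typically acknowledges the request asynchronously (HTTP 202 Accepted).
print(response.status_code)
```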

### Scheduling pipeline runs

When you schedule a data pipeline run, you can choose the frequency at which your pipeline operates.

1. **Select Schedule**:
   - This option is found in the top banner of the **Home** tab to view your scheduling options.

   [![Screenshot of pipeline runs scheduling button to build schedule.](../media/pipeline-scheduling.png)](../media/pipeline-scheduling-expanded.png#lightbox)

2. **Default Setting**:
   - By default, your data pipeline has no schedule defined.
   - Select the **On** radio button under the **Scheduled Run** header.

   [![Screenshot of pane for setting pipeline schedule settings.](../media/pipeline-schedule-settings.png)](../media/pipeline-schedule-settings-expanded.png#lightbox)

3. **Schedule Configuration**:
   - On the **Schedule configuration** page, you can specify:
     - **Schedule frequency**
     - **Start and end dates and times**
     - **Time zone**
4. **Apply Your Schedule**:
   - Once you configure your settings, select **Apply** to set your schedule.
5. **Editing Your Schedule**:
   - You can view or edit the schedule at any time by selecting the **Schedule** button again.

## Storage Event Triggers in Fabric Data Factory pipelines

Storage event triggers are a powerful feature in Data Factory pipelines that allow you to automate your data pipelines based on events occurring in your storage accounts.

## What Are Storage Event Triggers?

One of the most common scenarios for using event triggers is to activate a data pipeline when:

- **A file arrives**: This event action means a new file is added to your storage.
- **A file is deleted**: This event occurrence indicates a file is removed from your storage.

[![Image of pane for setting storage event trigger.](../media/storage-event-trigger.png)](../media/storage-event-trigger-expanded.png#lightbox)

For users moving from Azure Data Factory (ADF) to Microsoft Fabric, it's common to work with events from Azure Data Lake Storage (ADLS) or Blob storage. If you're new to Fabric rather than ADF, you're more likely to work with file events from **OneLake**.

## How Triggers Work in Fabric

In Fabric Data Factory, triggers use advanced platform features, including:

- **Event Streams**: These allow you to listen for specific events in real time.
- **Reflex Triggers**: These are designed to respond quickly to events.

### Creating a Trigger

To create a trigger in Fabric Data Factory:

1. Open the **pipeline design canvas**.
2. Look for the **Trigger** button, which allows you to create a Reflex trigger for your pipeline.
3. Alternatively, you can create triggers directly from the [**Data Activator**](/fabric/data-activator/data-activator-get-started) experience.
Lines changed: 9 additions & 0 deletions

Now it's your chance to build and schedule a pipeline with a dynamic notebook in a pipeline activity.

<!-- This link should still be valid after, but may need revisited prior to BUILD -->

> [!NOTE]
> You need a Microsoft Fabric trial license with the Fabric preview enabled in your tenant. See [**Getting started with Fabric**](/fabric/get-started/fabric-trial) to enable your Fabric trial license.

Launch the exercise and follow the instructions.

[![Icon for Button to launch exercise.](../media/launch-exercise.png)](https://go.microsoft.com/fwlink/?linkid=2260721)
