
Commit 8f4e97a

Merge fc9f2eb into 66f1da1
2 parents 66f1da1 + fc9f2eb commit 8f4e97a

File tree: 4 files changed (+137, -5 lines)

GeneratesRandomData.ipynb

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@

The added notebook contains a single Synapse PySpark cell:

```python
# Generates dummy data under Files/ (saved in Delta format)

# Import necessary libraries
from pyspark.sql import SparkSession
from pyspark.sql.types import *
import random
from datetime import datetime, timedelta

# Initialize Spark session (if not already initialized)
spark = SparkSession.builder.appName("GenerateRandomData").getOrCreate()

# Function to generate random data
def generate_random_data(num_entries):
    data = []
    for i in range(1, num_entries + 1):
        name = f"User{i}"
        entry = {
            "id": i,
            "name": name,
            "age": random.randint(18, 65),
            "email": f"{name.lower()}@example.com",
            "created_at": (datetime.now() - timedelta(days=random.randint(0, 365))).strftime("%Y-%m-%d %H:%M:%S")
        }
        data.append(entry)
    return data

# Generate 10 random entries
random_data = generate_random_data(10)

# Define schema for the DataFrame
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True),
    StructField("email", StringType(), True),
    StructField("created_at", StringType(), True)
])

# Create a DataFrame from the random data
df_random_data = spark.createDataFrame(random_data, schema=schema)

# Write the DataFrame to the Lakehouse at the specified path
output_path = "abfss://{WORKSPACE-NAME}@onelake.dfs.fabric.microsoft.com/raw_Bronze.Lakehouse/Files/random_data"  # Replace {WORKSPACE-NAME}
df_random_data.write.format("delta").mode("overwrite").save(output_path)

print(f"Random data has been saved to the Lakehouse at '{output_path}'.")
```

Monitoring-Observability/FabricActivatorRulePipeline/README.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@

# Microsoft Fabric: Automating Pipeline Execution with Activator

Costa Rica

[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-04-15

----------

> This process shows how to set up Microsoft Fabric Activator to automate workflows by detecting file creation events in a storage system and triggering another pipeline to run. <br/>
> 1. **First Pipeline**: The process starts with a pipeline that ends with a `Copy Data` activity. This activity uploads data into the `Lakehouse`. <br/>
> 2. **Event Stream Setup**: An `Event Stream` is configured in Activator to monitor the Lakehouse for file creation or data upload events. <br/>
> 3. **Triggering the Second Pipeline**: Once the event is detected (e.g., a file is uploaded), the Event Stream triggers the second pipeline to continue the workflow.

<details>
<summary><b>List of References</b> (Click to expand)</summary>

- [Activate Fabric items](https://learn.microsoft.com/en-us/fabric/real-time-intelligence/data-activator/activator-trigger-fabric-items)
- [Create a rule in Fabric Activator](https://learn.microsoft.com/en-us/fabric/real-time-intelligence/data-activator/activator-create-activators)

</details>

<details>
<summary><b>List of Content</b> (Click to expand)</summary>

- [Set Up the First Pipeline](#set-up-the-first-pipeline)
- [Configure Activator to Detect the Event](#configure-activator-to-detect-the-event)
- [Set Up the Second Pipeline](#set-up-the-second-pipeline)
- [Define the Rule in Activator](#define-the-rule-in-activator)
- [Test the Entire Workflow](#test-the-entire-workflow)
- [Troubleshooting (If Needed)](#troubleshooting-if-needed)

</details>
> [!NOTE]
> This code generates random data with fields such as id, name, age, email, and created_at, organizes it into a PySpark DataFrame, and saves it to a specified Lakehouse path using the Delta format. Click here to see the [example script](./GeneratesRandomData.ipynb).

https://github.com/user-attachments/assets/95206bf3-83a7-42c1-b501-4879df22ef7d
## Set Up the First Pipeline

1. **Create the Pipeline**:
   - In [Microsoft Fabric](https://app.fabric.microsoft.com/), create the first pipeline that performs the required tasks.
   - Add a `Copy Data` activity as the final step in the pipeline.
2. **Generate the Trigger File**:
   - Configure the `Copy Data` activity to create a trigger file in a specific location, such as `Azure Data Lake Storage (ADLS)` or `OneLake`.
   - Ensure the file name and path are consistent and predictable (e.g., `trigger_file.json` in a specific folder); see the notebook sketch after this list.
3. **Publish and Test**: Publish the pipeline and test it to ensure the trigger file is created successfully.
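If you prefer to produce the trigger file from a notebook rather than the `Copy Data` activity, a minimal sketch looks like this (the `{WORKSPACE-NAME}` placeholder, the Lakehouse name, and the `triggers/` folder are assumptions to adapt to your environment):

```python
# Minimal sketch: write a small, predictably named trigger file into the Lakehouse Files area.
# {WORKSPACE-NAME}, the Lakehouse name, and the triggers/ folder are placeholders.
import json
from datetime import datetime

from notebookutils import mssparkutils  # available in Fabric notebooks

trigger_path = "abfss://{WORKSPACE-NAME}@onelake.dfs.fabric.microsoft.com/raw_Bronze.Lakehouse/Files/triggers/trigger_file.json"
payload = {"status": "ready", "created_at": datetime.now().isoformat()}

# Overwrite (third argument) so the path stays consistent and predictable across runs
mssparkutils.fs.put(trigger_path, json.dumps(payload), True)
print(f"Trigger file written to '{trigger_path}'.")
```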
https://github.com/user-attachments/assets/798a3b12-c944-459d-9e77-0112b5d82831

## Configure Activator to Detect the Event

> [!TIP]
> Event options:

https://github.com/user-attachments/assets/282fae9b-e1c6-490d-bd23-9ed9bdf6105d

1. **Set Up an Event**:
   - Create a new event to monitor the location where the trigger file is created (e.g., ADLS or OneLake). Click on `Real-Time`:

     <img width="550" alt="image" src="https://github.com/user-attachments/assets/e1ce1f83-a8f6-4a3c-94dc-749e370d8079" />

   - Choose the appropriate event type, such as `File Created`.

     <img width="550" alt="image" src="https://github.com/user-attachments/assets/3a21abd7-0ff4-428f-a3a1-5e387314c1f5" />

     <img width="550" alt="image" src="https://github.com/user-attachments/assets/94e5556b-5d56-4a42-9edd-83b514e7c953" />

   - Add a source:

     <img width="550" alt="image" src="https://github.com/user-attachments/assets/9709a690-f3b5-453b-b3d9-c67d4b1a9465" />

     <img width="550" alt="image" src="https://github.com/user-attachments/assets/8dcadd23-4abb-47ee-82ca-f3868cb818e1" />

https://github.com/user-attachments/assets/43a9654b-e8d0-44da-80b9-9f528483fa3b

2. **Test Event Detection**:
   - Save the event and test it by manually running the first pipeline to ensure Activator detects the file creation.
   - Check the `Event Details` screen in Activator to confirm the event is logged.

https://github.com/user-attachments/assets/6b21194c-54b4-49de-9294-1bf78b1e5acd

## Set Up the Second Pipeline

1. **Create the Pipeline**:
   - In Microsoft Fabric, create the second pipeline that performs the next set of tasks.
   - Ensure it is configured to accept external triggers.
2. **Publish the Pipeline**: Publish the second pipeline and ensure it is ready to be triggered.

https://github.com/user-attachments/assets/5b630579-a0ec-4d5b-b973-d9b4fdd8254c
## Define the Rule in Activator

1. **Set Up the Activator**:

https://github.com/user-attachments/assets/7c88e080-d5aa-4920-acd6-94c2e4ae0568

2. **Create a New Rule**:
   - In `Activator`, create a rule that responds to the event you just configured.
   - Set the condition to match the event details (e.g., file name, path, or metadata).
3. **Set the Action**:
   - Configure the rule to trigger the second pipeline.
   - Specify the pipeline name and pass any required parameters; a sketch of what this amounts to follows this list.
4. **Save and Activate**:
   - Save the rule and activate it.
   - Ensure the rule is enabled and ready to respond to the event.
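For context, "trigger the second pipeline and pass parameters" is what the Fabric job scheduler exposes as running an on-demand item job; the rule action handles this for you. The sketch below is purely illustrative, and the workspace ID, pipeline item ID, token acquisition, and parameter names are all assumptions:

```python
# Illustrative only: triggering a pipeline with parameters via the Fabric REST API.
# WORKSPACE_ID, PIPELINE_ITEM_ID, TOKEN, and the parameter names are hypothetical placeholders.
import requests

WORKSPACE_ID = "<workspace-guid>"
PIPELINE_ITEM_ID = "<pipeline-guid>"
TOKEN = "<bearer-token>"  # acquire via Microsoft Entra ID in practice

url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{PIPELINE_ITEM_ID}/jobs/instances?jobType=Pipeline"
)
body = {"executionData": {"parameters": {"triggerFile": "trigger_file.json"}}}

# A 202 response means the run was accepted; the Location header points at the job instance.
resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()
print("Pipeline run requested:", resp.headers.get("Location"))
```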
https://github.com/user-attachments/assets/5f139eeb-bab0-4d43-9f22-bbe44503ed75

## Test the Entire Workflow

1. **Run the First Pipeline**: Execute the first pipeline and verify that the trigger file is created.
2. **Monitor Activator**: Check the `Event Details` and `Rule Activation Details` in Activator to ensure the event is detected and the rule is activated.
3. **Verify the Second Pipeline**: Confirm that the second pipeline is triggered and runs successfully.

https://github.com/user-attachments/assets/0a1dab70-2317-4636-b0be-aa0cb301b496

## Troubleshooting (If Needed)

- If the second pipeline does not trigger (see also the listing sketch after this list):
  1. Double-check the rule configuration in Activator.
  2. Review the logs in Activator for any errors or warnings.
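A common failure mode is that the trigger file never lands where the event source is watching. A quick listing from a Fabric notebook can confirm it (the path is the same assumed placeholder used above):

```python
# Quick check that the trigger file actually landed where Activator is watching.
# The abfss path is a placeholder; replace {WORKSPACE-NAME} and the folder to match your event source.
from notebookutils import mssparkutils

watch_dir = "abfss://{WORKSPACE-NAME}@onelake.dfs.fabric.microsoft.com/raw_Bronze.Lakehouse/Files/triggers/"
for f in mssparkutils.fs.ls(watch_dir):
    print(f.name, f.size)
```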

<div align="center">
  <h3 style="color: #4CAF50;">Total Visitors</h3>
  <img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>

Monitoring-Observability.md renamed to Monitoring-Observability/README.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ Costa Rica
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

- Last updated: 2024-11-28
+ Last updated: 2025-04-15

----------

README.md

Lines changed: 4 additions & 4 deletions
@@ -136,10 +136,10 @@ Last updated: 2025-04-15

## Monitoring and Observability

- - **Microsoft [Fabric Capacity Metrics](https://github.com/MicrosoftCloudEssentials-LearningHub/Fabric-EnterpriseFramework/blob/main/Monitoring-Observability.md#microsoft-fabric-capacity-metrics-app) app**: Used for monitoring and managing capacity metrics.
- - **Admin Monitoring**: Configure and use the [Admin Monitoring Workspace](https://github.com/MicrosoftCloudEssentials-LearningHub/Fabric-EnterpriseFramework/blob/main/Monitoring-Observability.md#admin-monitoring) for custom reporting on system performance and usage.
- - **Monitor Hub**: Access and utilize the [Monitor Hub](https://github.com/MicrosoftCloudEssentials-LearningHub/Fabric-EnterpriseFramework/blob/main/Monitoring-Observability.md#monitor-hub) for centralized log and metric monitoring, and extend activity history of the data platform.
- - **Event Hub Integration**: Use Event Hub to capture and analyze events for real-time monitoring. For example, leverage it for [automating pipeline execution with Activator]()
+ - **Microsoft [Fabric Capacity Metrics](https://github.com/MicrosoftCloudEssentials-LearningHub/Fabric-EnterpriseFramework/blob/main/Monitoring-Observability/README.md#microsoft-fabric-capacity-metrics-app) app**: Used for monitoring and managing capacity metrics.
+ - **Admin Monitoring**: Configure and use the [Admin Monitoring Workspace](https://github.com/MicrosoftCloudEssentials-LearningHub/Fabric-EnterpriseFramework/blob/main/Monitoring-Observability/README.md#admin-monitoring) for custom reporting on system performance and usage.
+ - **Monitor Hub**: Access and utilize the [Monitor Hub](https://github.com/MicrosoftCloudEssentials-LearningHub/Fabric-EnterpriseFramework/blob/main/Monitoring-Observability/README.md#monitor-hub) for centralized log and metric monitoring, and extend activity history of the data platform.
+ - **Event Hub Integration**: Use Event Hub to capture and analyze events for real-time monitoring. For example, leverage it for [automating pipeline execution with Activator](./Monitoring-Observability/FabricActivatorRulePipeline/)
- **Alerting**: Configure alerts for critical events and thresholds to ensure timely responses to issues. For example, [Steps to Configure Capacity Alerts]()

<div align="center">
