articles/data-factory/ci-cd-pattern-with-airflow.md
Lines changed: 18 additions & 16 deletions
@@ -33,10 +33,10 @@ Continuous Deployment (CD) is an extension of CI that takes the automation one s
 **CI Pipeline with Dev IR:**
 
 When a pull request (PR) is made from a feature branch to the Dev branch, it triggers a PR pipeline. This pipeline is designed to efficiently perform quality checks on your feature branches, ensuring code integrity and reliability. The following types of checks can be included in the pipeline:
-1. **Python Dependencies Testing**: These tests install and verify the correctness of Python dependencies to ensure that the project's dependencies are properly configured.
-2. **Code Analysis and Linting:** Tools for static code analysis and linting are applied to evaluate code quality and adherence to coding standards.
-3. **Airflow DAG’s Tests:** These tests execute validation tests, including tests for the DAG definition and unit tests designed for Airflow DAGs.
-4. **Unit Tests for Airflow custom operators, hooks, sensors and triggers**
+- **Python Dependencies Testing**: These tests install and verify the correctness of Python dependencies to ensure that the project's dependencies are properly configured.
+- **Code Analysis and Linting:** Tools for static code analysis and linting are applied to evaluate code quality and adherence to coding standards.
+- **Airflow DAG’s Tests:** These tests execute validation tests, including tests for the DAG definition and unit tests designed for Airflow DAGs.
+- **Unit Tests for Airflow custom operators, hooks, sensors and triggers**
 If any of these checks fail, the pipeline terminates, signaling that the developer needs to address the issues identified.
 
 #### Git-sync with Prod IR: Map your Managed Airflow environment with your Git repository’s Production branch.
@@ -50,9 +50,9 @@ Once the Feature branch successfully merges with the Dev branch, you can create
 
 It allows you to continuously deploy the DAGs/ code into Managed Airflow environment.
 
-1. **Fail-fast approach:** Without the integration of CI/CD process, the first time you know DAG contains errors is likely when it's pushed to GitHub, synchronized with managed airflow and throws an Import Error. Meanwhile the other developer can unknowingly pull the faulty code from the repository, potentially leading to inefficiencies down the line.
+- **Fail-fast approach:** Without the integration of CI/CD process, the first time you know DAG contains errors is likely when it's pushed to GitHub, synchronized with managed airflow and throws an Import Error. Meanwhile the other developer can unknowingly pull the faulty code from the repository, potentially leading to inefficiencies down the line.
 
-2. **Code quality improvement:** Neglecting fundamental checks like syntax verification, necessary imports, and checks for other best coding practices, can increase the likelihood of delivering subpar code.
+- **Code quality improvement:** Neglecting fundamental checks like syntax verification, necessary imports, and checks for other best coding practices, can increase the likelihood of delivering subpar code.
 
 ## Deployment Patterns in Azure Managed Airflow:
 
@@ -66,13 +66,13 @@ It allows you to continuously deploy the DAGs/ code into Managed Airflow environ
 
 ### Advantages:
 
-1. **No Local Development Environment Required:** Managed Airflow handles the underlying infrastructure, updates, and maintenance, reducing the operational overhead of managing Airflow clusters. The service allows you to focus on building and managing workflows rather than managing infrastructure.
+- **No Local Development Environment Required:** Managed Airflow handles the underlying infrastructure, updates, and maintenance, reducing the operational overhead of managing Airflow clusters. The service allows you to focus on building and managing workflows rather than managing infrastructure.
 
-2. **Scalability:** Managed Airflow provides auto scaling capability to scale resources as needed, ensuring that your data pipelines can handle increasing workloads or bursts of activity without manual intervention.
+- **Scalability:** Managed Airflow provides auto scaling capability to scale resources as needed, ensuring that your data pipelines can handle increasing workloads or bursts of activity without manual intervention.
 
-3. **Monitoring and Logging:** Managed Airflow includes Diagnostic logs and monitoring, making it easier to track the execution of your workflows, diagnose issues, and optimize performance.
+- **Monitoring and Logging:** Managed Airflow includes Diagnostic logs and monitoring, making it easier to track the execution of your workflows, diagnose issues, and optimize performance.
 
-4. **Git Integration**: Managed Airflow supports Git-sync feature, allowing you to store your DAGs in Git repository, making it easier to manage changes and collaborate with the team.
+- **Git Integration**: Managed Airflow supports Git-sync feature, allowing you to store your DAGs in Git repository, making it easier to manage changes and collaborate with the team.
 
 ### Workflow:
 
@@ -114,7 +114,7 @@ Synchronize your GitHub repository’s branch with Azure Managed Airflow Service
 
 Learn more about how to use Azure Managed Airflow's [Git-sync feature](airflow-sync-github-repository.md).
 
-3. **Utilize Managed Airflow Service as Production environment:**
+- **Utilize Managed Airflow Service as Production environment:**
 
 You can raise a Pull Request (PR) to the branch that is sync with the Managed Airflow Service after successfully developing and testing data pipelines on local development setup. Once the branch is merged you can utilize the Managed Airflow service's features like auto-scaling and monitoring and logging at production level.
 
@@ -197,14 +197,15 @@ jobs:
 
 **Step 5:** In the tests folder, create the tests for Airflow DAGs. Following are the few examples:
 
-1. At the least, it's crucial to conduct initial testing using `import_errors` to ensure the DAG's integrity and correctness.
+* At the least, it's crucial to conduct initial testing using `import_errors` to ensure the DAG's integrity and correctness.
 This test ensures:
 
 - **Your DAG does not contain cyclicity:** Cyclicity, where a task forms a loop or circular dependency within the workflow, can lead to unexpected and infinite execution loops.
 
 - **There are no import errors:** Import errors can arise due to issues like missing dependencies, incorrect module paths, or coding errors.
 
 - **Tasks are defined correctly:** Confirm that the tasks within your DAG are correctly defined.
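
The two checks named in this hunk (no import errors, no cycles) can be illustrated with a minimal, standard-library-only sketch. It is not the test from the article and does not use Airflow; in a real project the import check would normally be written against Airflow's `DagBag.import_errors`. The function names `collect_import_errors` and `has_cycle`, and the shape of the `dependencies` mapping, are hypothetical and exist only for this illustration.

```python
import importlib.util
from graphlib import CycleError, TopologicalSorter
from pathlib import Path


def collect_import_errors(dag_folder):
    """Try to import every .py file in dag_folder.

    Returns {file path: error message} for each file that fails to load,
    loosely mirroring the idea behind an import-errors check.
    (Hypothetical helper; Airflow is deliberately not used here.)
    """
    errors = {}
    for path in sorted(Path(dag_folder).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception as exc:  # missing dependency, syntax error, bad path, ...
            errors[str(path)] = f"{type(exc).__name__}: {exc}"
    return errors


def has_cycle(dependencies):
    """Return True if {task: set of upstream tasks} contains a circular dependency.

    (Hypothetical helper; the mapping shape is an assumption for this sketch.)
    """
    try:
        # prepare() raises CycleError when the graph is not acyclic.
        TopologicalSorter(dependencies).prepare()
    except CycleError:
        return True
    return False
```

In a CI job, a test along these lines would assert that `collect_import_errors("dags")` returns an empty dict, so the pipeline fails before faulty code reaches the synced branch. The folder name `dags` is likewise an assumption.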
articles/data-factory/concept-managed-airflow.md
Lines changed: 2 additions & 6 deletions
@@ -66,9 +66,7 @@ Managed Airflow in Azure Data Factory offers a range of powerful features, inclu
 
 ## Supported Apache Airflow versions
 
-* 1.10.14
-* 2.2.2
-* 2.4.3
+* 2.6.3
 
 > [!NOTE]
 > Changing the Airflow version within an existing IR is not supported. Instead, the recommended solution is to create a new Airflow IR with the desired version
@@ -84,11 +82,9 @@ You can install any provider package by editing the airflow environment from the
 ## Limitations
 
 * Managed Airflow in other regions is available by GA.
-* Data Sources connecting through airflow should be publicly accessible.
-* Blob Storage behind VNet is not supported during the public preview.
+* Data Sources connecting through airflow should be accessible through public endpoint (network).
 * DAGs that are inside a Blob Storage in VNet/behind Firewall is currently not supported.
 * Azure Key Vault isn't supported in LinkedServices to import dags.
-* Airflow supports officially Blob Storage and ADLS with some limitations.