
Commit 5d376f2

Merge pull request #255874 from ambika-garg/update-concepts
Update limitations of Managed Airflow | Update supported version of Apache Airflow by Managed Airflow
2 parents 9792287 + 5f5b22f commit 5d376f2

2 files changed

+20
-22
lines changed

articles/data-factory/ci-cd-pattern-with-airflow.md

Lines changed: 18 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -33,10 +33,10 @@ Continuous Deployment (CD) is an extension of CI that takes the automation one s
3333
**CI Pipeline with Dev IR:**
3434

3535
When a pull request (PR) is made from a feature branch to the Dev branch, it triggers a PR pipeline. This pipeline is designed to efficiently perform quality checks on your feature branches, ensuring code integrity and reliability. The following types of checks can be included in the pipeline: 
36-
1. **Python Dependencies Testing**: These tests install and verify the correctness of Python dependencies to ensure that the project's dependencies are properly configured. 
37-
2. **Code Analysis and Linting:** Tools for static code analysis and linting are applied to evaluate code quality and adherence to coding standards. 
38-
3. **Airflow DAG’s Tests:** These tests execute validation tests, including tests for the DAG definition and unit tests designed for Airflow DAGs. 
39-
4. **Unit Tests for Airflow custom operators, hooks, sensors and triggers**  
36+
- **Python Dependencies Testing**: These tests install and verify the correctness of Python dependencies to ensure that the project's dependencies are properly configured. 
37+
- **Code Analysis and Linting:** Tools for static code analysis and linting are applied to evaluate code quality and adherence to coding standards. 
38+
- **Airflow DAG’s Tests:** These tests execute validation tests, including tests for the DAG definition and unit tests designed for Airflow DAGs. 
39+
- **Unit Tests for Airflow custom operators, hooks, sensors and triggers**  
4040
If any of these checks fail, the pipeline terminates, signaling that the developer needs to address the issues identified. 
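
The dependency check listed above could be approximated with a small pytest-style test. This is a minimal sketch using only the standard library, not the pipeline from this article: `find_missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the sketch assumes top-level module names only.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the names in module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Hypothetical list: replace with the top-level modules your DAGs import.
REQUIRED_MODULES = ["json", "csv"]


def test_required_modules_are_installed():
    # An empty list means every required module was found.
    assert find_missing_modules(REQUIRED_MODULES) == []
```

A failing assertion here stops the PR pipeline early, before any DAG-level tests run.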
4141

4242
#### Git-sync with Prod IR: Map your Managed Airflow environment with your Git repository’s Production branch. 
@@ -50,9 +50,9 @@ Once the Feature branch successfully merges with the Dev branch, you can create
5050

5151
It allows you to continuously deploy DAGs and code into the Managed Airflow environment.
5252

53-
1. **Fail-fast approach:** Without a CI/CD process, the first time you learn that a DAG contains errors is likely when it's pushed to GitHub, synchronized with Managed Airflow, and throws an Import Error. Meanwhile, another developer can unknowingly pull the faulty code from the repository, potentially leading to inefficiencies down the line.
53+
- **Fail-fast approach:** Without a CI/CD process, the first time you learn that a DAG contains errors is likely when it's pushed to GitHub, synchronized with Managed Airflow, and throws an Import Error. Meanwhile, another developer can unknowingly pull the faulty code from the repository, potentially leading to inefficiencies down the line.
5454

55-
2. **Code quality improvement:** Neglecting fundamental checks like syntax verification, necessary imports, and checks for other best coding practices, can increase the likelihood of delivering subpar code. 
55+
- **Code quality improvement:** Neglecting fundamental checks like syntax verification, necessary imports, and checks for other best coding practices, can increase the likelihood of delivering subpar code. 
5656

5757
## Deployment Patterns in Azure Managed Airflow: 
5858

@@ -66,13 +66,13 @@ It allows you to continuously deploy the DAGs/ code into Managed Airflow environ
6666

6767
### Advantages: 
6868

69-
1. **No Local Development Environment Required:** Managed Airflow handles the underlying infrastructure, updates, and maintenance, reducing the operational overhead of managing Airflow clusters. The service allows you to focus on building and managing workflows rather than managing infrastructure. 
69+
- **No Local Development Environment Required:** Managed Airflow handles the underlying infrastructure, updates, and maintenance, reducing the operational overhead of managing Airflow clusters. The service allows you to focus on building and managing workflows rather than managing infrastructure. 
7070

71-
2. **Scalability:** Managed Airflow provides auto scaling capability to scale resources as needed, ensuring that your data pipelines can handle increasing workloads or bursts of activity without manual intervention. 
71+
- **Scalability:** Managed Airflow provides auto scaling capability to scale resources as needed, ensuring that your data pipelines can handle increasing workloads or bursts of activity without manual intervention. 
7272

73-
3. **Monitoring and Logging:** Managed Airflow includes Diagnostic logs and monitoring, making it easier to track the execution of your workflows, diagnose issues, and optimize performance. 
73+
- **Monitoring and Logging:** Managed Airflow includes Diagnostic logs and monitoring, making it easier to track the execution of your workflows, diagnose issues, and optimize performance. 
7474

75-
4. **Git Integration**: Managed Airflow supports the Git-sync feature, allowing you to store your DAGs in a Git repository, which makes it easier to manage changes and collaborate with the team.
75+
- **Git Integration**: Managed Airflow supports the Git-sync feature, allowing you to store your DAGs in a Git repository, which makes it easier to manage changes and collaborate with the team.
7676

7777
### Workflow: 
7878

@@ -114,7 +114,7 @@ Synchronize your GitHub repository’s branch with Azure Managed Airflow Service
114114

115115
Learn more about how to use Azure Managed Airflow's [Git-sync feature](airflow-sync-github-repository.md).
116116

117-
3. **Utilize Managed Airflow Service as Production environment:** 
117+
- **Utilize Managed Airflow Service as Production environment:** 
118118

119119
You can raise a Pull Request (PR) to the branch that is synced with the Managed Airflow Service after successfully developing and testing data pipelines on a local development setup. Once the branch is merged, you can use the Managed Airflow service's features, such as auto-scaling, monitoring, and logging, at production level.
120120

@@ -197,14 +197,15 @@ jobs:
197197

198198
**Step 5:** In the tests folder, create the tests for Airflow DAGs. The following are a few examples:
199199

200-
1. At the least, it's crucial to conduct initial testing using `import_errors` to ensure the DAG's integrity and correctness. 
200+
* At the least, it's crucial to conduct initial testing using `import_errors` to ensure the DAG's integrity and correctness. 
201201
This test ensures: 
202202

203203
- **Your DAG does not contain cyclicity:** Cyclicity, where a task forms a loop or circular dependency within  the workflow, can lead to unexpected and infinite execution loops. 
204204

205205
- **There are no import errors:** Import errors can arise due to issues like missing dependencies, incorrect module paths, or coding errors.  
206206

207207
- **Tasks are defined correctly:** Confirm that the tasks within your DAG are correctly defined.
208+
208209
```python
209210
@pytest.fixture()
210211

@@ -217,9 +218,9 @@ def test_no_import_errors(dagbag):
217218
    """
218219
    assert not dagbag.import_errors
219220
```
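
To illustrate what the cyclicity check guards against, the sketch below detects a circular dependency in a plain task graph. It is not Airflow's own implementation: it assumes Python 3.9 or later for the standard-library `graphlib` module, and `find_cycle` and the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(dependencies):
    """Return the tasks forming a cycle, or None if the graph is acyclic.

    dependencies maps each task name to the set of tasks it depends on.
    """
    try:
        # static_order() is lazy, so consume it to force the cycle check.
        list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # The second element of args holds the nodes of the detected cycle.
        return err.args[1]
    return None


# Hypothetical task graphs for illustration.
acyclic = {"load": {"transform"}, "transform": {"extract"}, "extract": set()}
cyclic = {"load": {"transform"}, "transform": {"load"}}
```

Calling `find_cycle(acyclic)` returns `None`, while `find_cycle(cyclic)` returns a list of task names whose first and last entries are the same task.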
220-
 
221221

222-
1. Test to ensure that specific DAG IDs are present in your feature branch before merging it into the development (dev) branch.
222+
* Test to ensure that specific DAG IDs are present in your feature branch before merging it into the development (dev) branch.
223+
223224
```python
224225
def test_expected_dags(dagbag):
225226
    """
@@ -234,7 +235,8 @@ def test_expected_dags(dagbag):
234235
        assert dag_id == dag.dag_id
235236
```
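
The core of the expected-DAG check is a set difference between the IDs you expect and the IDs that were loaded. The sketch below shows that logic without Airflow installed. `missing_dag_ids` is a hypothetical helper, and the plain dictionary stands in for a mapping of DAG ID to DAG object such as the one a `DagBag` holds.

```python
def missing_dag_ids(loaded_dags, expected_ids):
    """Return the expected DAG IDs that are absent from loaded_dags, sorted.

    loaded_dags is any mapping keyed by DAG ID; only the keys are inspected.
    """
    return sorted(set(expected_ids) - set(loaded_dags))


# Hypothetical stand-in for the loaded DAGs: only "etl_daily" is present.
loaded = {"etl_daily": object()}
result = missing_dag_ids(loaded, ["etl_daily", "etl_hourly"])
```

Here `result` is `["etl_hourly"]`; a test would assert that the returned list is empty.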
236237

237-
2. Test to ensure only approved tags are associated with your DAGs. This test helps to enforce the approved tag usage. 
238+
* Test to ensure only approved tags are associated with your DAGs. This test helps to enforce the approved tag usage. 
239+
238240
```python
239241
def test_requires_approved_tag(dagbag):
240242
    """
@@ -252,7 +254,7 @@ def test_requires_approved_tag(dagbag):
252254

253255
**Step 6:** Now, when you raise a pull request to the dev branch, GitHub Actions triggers the CI pipeline to run all the tests.
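
The approved-tag test from Step 5 reduces to comparing each DAG's tags against an allow list. The following is a minimal sketch of that comparison using plain Python data rather than Airflow objects. `APPROVED_TAGS`, `unapproved_tags`, and the sample DAG IDs are hypothetical.

```python
# Hypothetical allow list: replace with the tags your team has approved.
APPROVED_TAGS = {"finance", "ingestion"}


def unapproved_tags(dag_tags, approved=APPROVED_TAGS):
    """Map each DAG ID to its tags that are not in the approved set.

    dag_tags maps DAG ID to an iterable of tag strings.
    DAGs whose tags are all approved are omitted from the result.
    """
    violations = {}
    for dag_id, tags in dag_tags.items():
        extra = set(tags) - set(approved)
        if extra:
            violations[dag_id] = sorted(extra)
    return violations


# Hypothetical sample: "adhoc_export" carries a tag outside the allow list.
sample = {"etl_daily": ["finance"], "adhoc_export": ["finance", "scratch"]}
violations = unapproved_tags(sample)
```

In a CI test, asserting that the returned dictionary is empty fails the pipeline and names the offending DAGs and tags.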
254256

255-
#### For information
257+
#### For More Information
256258

257259
- [https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/models/dagbag.html](https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/models/dagbag.html) 
258260

articles/data-factory/concept-managed-airflow.md

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -66,9 +66,7 @@ Managed Airflow in Azure Data Factory offers a range of powerful features, inclu
6666
6767
## Supported Apache Airflow versions
6868

69-
* 1.10.14
70-
* 2.2.2
71-
* 2.4.3
69+
* 2.6.3
7270

7371
> [!NOTE]
7472
> Changing the Airflow version within an existing IR is not supported. Instead, the recommended solution is to create a new Airflow IR with the desired version.
@@ -84,11 +82,9 @@ You can install any provider package by editing the airflow environment from the
8482
## Limitations
8583

8684
* Managed Airflow in other regions will be available by general availability (GA).
87-
* Data Sources connecting through airflow should be publicly accessible.
88-
* Blob Storage behind VNet is not supported during the public preview.
85+
* Data sources connecting through Airflow should be accessible through a public endpoint (network).
8986
* DAGs that are inside a Blob Storage in a VNet or behind a firewall are currently not supported.
9087
* Azure Key Vault isn't supported in LinkedServices to import DAGs.
91-
* Airflow supports officially Blob Storage and ADLS with some limitations.
9288

9389
## Next steps
9490
