| 1 | +--- |
| 2 | +title: How does Managed Airflow work? |
| 3 | +titleSuffix: Azure Data Factory |
| 4 | +description: This article explains how to create a Managed Airflow environment and how to import and monitor DAGs in it. |
| 5 | +ms.service: data-factory |
| 6 | +ms.topic: conceptual |
| 7 | +author: nabhishek |
| 8 | +ms.author: abnarain |
| 9 | +ms.date: 01/20/2023 |
| 10 | +--- |
| 11 | + |
| 12 | +# How does Azure Data Factory Managed Airflow work? |
| 13 | + |
| 14 | +[!INCLUDE[appliesto-adf-xxx-md](includes/appliesto-adf-xxx-md.md)] |
| 15 | + |
| 16 | +> [!NOTE] |
| 17 | +> Managed Airflow for Azure Data Factory relies on the open source Apache Airflow application. Documentation and more tutorials for Airflow can be found on the Apache Airflow [Documentation](https://airflow.apache.org/docs/) or [Community](https://airflow.apache.org/community/) pages. |
| 18 | + |
| 19 | +Azure Data Factory Managed Airflow orchestrates your workflows using Directed Acyclic Graphs (DAGs) written in Python. You must provide your DAGs and plugins in Azure Blob Storage. You can install Airflow requirements or library dependencies when you create a new Managed Airflow environment or by editing an existing one. You can then run and monitor your DAGs by launching the Airflow UI from ADF, or by using a command-line interface (CLI) or a software development kit (SDK). |
| 20 | + |
| 21 | +## Create a Managed Airflow environment |
| 22 | +The following steps set up and configure your Managed Airflow environment. |
| 23 | + |
| 24 | +### Prerequisites |
| 25 | +**Azure subscription**: If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/) before you begin. |
| 26 | + Create a Data Factory, or select an existing one, in a region where the Managed Airflow preview is supported. |
| 27 | + |
| 28 | +### Steps to create the environment |
| 29 | +1. Create a new Managed Airflow environment. |
| 30 | + Go to the **Manage** hub -> **Airflow (Preview)** -> **+New** to create a new Airflow environment. |
| 31 | + |
| 32 | + :::image type="content" source="media/how-does-managed-airflow-work/create-new-airflow.png" alt-text="Screenshot that shows how to create a new Managed Apache Airflow environment."::: |
| 33 | + |
| 34 | +1. Provide the details (Airflow configuration). |
| 35 | + |
| 36 | + :::image type="content" source="media/how-does-managed-airflow-work/airflow-environment-details.png" alt-text="Screenshot that shows some Managed Airflow environment details."::: |
| 37 | + |
| 38 | + > [!IMPORTANT] |
| 39 | + > When using **Basic** authentication, remember the username and password specified on this screen. You will need them to log in to the Managed Airflow UI later. The default option is **AAD**, which does not require creating a username and password for your Airflow environment. Instead, it uses the logged-in user's Azure Data Factory credentials to log in and monitor DAGs. |
| 40 | +1. **Environment variables** are a simple key-value store within Airflow for storing and retrieving arbitrary content or settings. |
| 41 | +1. **Requirements** can be used to pre-install Python libraries. You can update these later as well. |
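
As a rough illustration of how a DAG file might read a value from the **Environment variables** setting, here is a minimal sketch. It assumes that each key is exposed to DAG code as an operating-system environment variable, which this article does not confirm. The key name `SAMPLE_BATCH_SIZE` and the helper `get_setting` are hypothetical.

```python
import os


def get_setting(key: str, default: str = "") -> str:
    """Return the value stored under `key`, or `default` when the key is not set."""
    # Assumption: keys from the environment configuration are visible
    # to DAG code as ordinary OS environment variables.
    return os.environ.get(key, default)


# Hypothetical key; replace it with one defined in your own environment.
batch_size = int(get_setting("SAMPLE_BATCH_SIZE", "100"))
```

Supplying a default keeps the DAG file parseable even when the key has not been defined yet.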
| 42 | + |
| 43 | +## Import DAGs |
| 44 | + |
| 45 | +The following steps describe how to import DAGs into Managed Airflow. |
| 46 | + |
| 47 | +### Prerequisite |
| 48 | + |
| 49 | +You will need to upload a sample DAG to an accessible storage account. |
| 50 | + |
| 51 | +> [!NOTE] |
| 52 | +> Blob Storage accounts behind a VNet are not supported during the preview. Support will be added shortly. |
| 53 | + |
| 54 | +[Sample Apache Airflow v2.x DAG](https://airflow.apache.org/docs/apache-airflow/stable/tutorial/fundamentals.html). |
| 55 | +[Sample Apache Airflow v1.10 DAG](https://airflow.apache.org/docs/apache-airflow/1.10.11/_modules/airflow/example_dags/tutorial.html). |
| 56 | + |
| 57 | +1. Copy the content (either v2.x or v1.10, depending on the Airflow environment that you have set up) into a new file called **tutorial.py**. |
| 58 | + |
| 59 | + Upload **tutorial.py** to Blob Storage. ([How to upload a file into a blob](/storage/blobs/storage-quickstart-blobs-portal.md)) |
| 60 | + |
| 61 | + > [!NOTE] |
| 62 | + > You will need to select a directory path from a blob storage account that contains folders named **dags** and **plugins** to import those into the Airflow environment. **Plugins** are not mandatory. You can also have a container named **dags** and upload all Airflow files within it. |
| 63 | + |
| 64 | +1. Click **Airflow (Preview)** under the **Manage** hub. Then hover over the **Airflow** environment that you created earlier and click **Import files** to import all DAGs and dependencies into the Airflow environment. |
| 65 | + |
| 66 | + :::image type="content" source="media/how-does-managed-airflow-work/import-files.png" alt-text="Screenshot shows import files in manage hub."::: |
| 67 | + |
| 68 | +1. Create a new Linked Service to the accessible storage account mentioned in the prerequisite (or use an existing one if you already have your own DAGs). |
| 69 | + |
| 70 | + :::image type="content" source="media/how-does-managed-airflow-work/create-new-linked-service.png" alt-text="Screenshot that shows how to create a new linked service."::: |
| 71 | + |
| 72 | +1. Use the storage account where you uploaded the DAG (see the prerequisite). Test the connection, and then click **Create**. |
| 73 | + |
| 74 | + :::image type="content" source="media/how-does-managed-airflow-work/linked-service-details.png" alt-text="Screenshot shows some linked service details."::: |
| 75 | + |
| 76 | +1. Browse and select **airflow** if you are using the sample SAS URL, or select the folder that contains the **dags** folder with your DAG files. |
| 77 | + |
| 78 | + > [!NOTE] |
| 79 | + > You can import DAGs and their dependencies through this interface. You will need to select a directory path from a blob storage account that contains folders named **dags** and **plugins** to import those into the Airflow environment. **Plugins** are not mandatory. |
| 80 | + |
| 81 | + :::image type="content" source="media/how-does-managed-airflow-work/browse-storage.png" alt-text="Screenshot shows browse storage in import files."::: |
| 82 | + |
| 83 | + :::image type="content" source="media/how-does-managed-airflow-work/browse.png" alt-text="Screenshot that shows browse in airflow."::: |
| 84 | + |
| 85 | + :::image type="content" source="media/how-does-managed-airflow-work/import-in-import-files.png" alt-text="Screenshot shows import in import files."::: |
| 86 | + |
| 87 | + :::image type="content" source="media/how-does-managed-airflow-work/import-dags.png" alt-text="Screenshot shows import dags."::: |
| 88 | + |
| 89 | +> [!NOTE] |
| 90 | +> Importing DAGs could take a couple of minutes during **Preview**. The notification center (bell icon in ADF UI) can be used to track the import status updates. |
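
Before you import, it can help to confirm that the directory path you plan to select has the folder layout described in the notes above. The following is a minimal sketch that works on a plain list of blob names. It does not call Azure Storage, and the helper `check_import_layout` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def check_import_layout(blob_names, root):
    """Report whether `root` contains the folders that the import step looks for.

    blob_names: blob paths inside one container, such as "airflow/dags/tutorial.py".
    root: the directory path you plan to select in the import dialog.
    """
    root_parts = PurePosixPath(root).parts
    folders = set()
    for name in blob_names:
        parts = PurePosixPath(name).parts
        # Keep only blobs that sit below root/<folder>/...
        is_below_root = parts[:len(root_parts)] == root_parts
        if is_below_root and len(parts) > len(root_parts) + 1:
            folders.add(parts[len(root_parts)])
    # **plugins** is optional, so only a missing **dags** folder is a problem.
    return {"dags": "dags" in folders, "plugins": "plugins" in folders}


sample_blobs = ["airflow/dags/tutorial.py", "airflow/plugins/my_plugin.py"]
layout = check_import_layout(sample_blobs, "airflow")
```

Here `layout["dags"]` tells you whether the selected path has a **dags** folder with at least one file in it.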
| 91 | + |
| 92 | +## Troubleshooting import DAG issues |
| 93 | + |
| 94 | +* Problem: DAG import takes over 5 minutes. |
| 95 | +Mitigation: Reduce the size of the DAGs imported in a single import. One way to achieve this is to create multiple DAG folders, each with fewer DAGs, across multiple containers. |
| 96 | + |
| 97 | +* Problem: Imported DAGs do not show up when you log in to the Airflow UI. |
| 98 | +Mitigation: Log in to the Airflow UI and check for DAG parsing errors. These can occur if the DAG files contain incompatible code. The Airflow UI shows the files that have the issue and the exact line numbers. |
| 99 | + |
| 100 | + :::image type="content" source="media/how-does-managed-airflow-work/import-dag-issues.png" alt-text="Screenshot shows import dag issues."::: |
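
Some parsing errors can be caught locally before you upload. The sketch below uses only the Python standard library to look for syntax errors in a local folder of DAG files. It is not part of Managed Airflow, the helper `find_syntax_errors` is hypothetical, and it does not detect missing libraries or Airflow-version incompatibilities, so the Airflow UI remains the authoritative check.

```python
import ast
from pathlib import Path


def find_syntax_errors(dags_dir):
    """Return (file name, line number, message) for each .py file that fails to parse."""
    problems = []
    for path in sorted(Path(dags_dir).rglob("*.py")):
        try:
            # ast.parse only checks syntax; it does not import or run the file.
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as err:
            problems.append((path.name, err.lineno, err.msg))
    return problems
```

An empty list means that every file parsed, not that every DAG will load in your Airflow environment.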
| 101 | + |
| 102 | + |
| 103 | +## Monitor DAG runs |
| 104 | + |
| 105 | +To monitor the Airflow DAGs, log in to the Airflow UI with the username and password that you created earlier. |
| 106 | + |
| 107 | +1. Click the Airflow environment that you created. |
| 108 | + |
| 109 | + :::image type="content" source="media/how-does-managed-airflow-work/airflow-environment-monitor-dag.png" alt-text="Screenshot that shows the Airflow environment created."::: |
| 110 | + |
| 111 | +1. Log in using the username and password provided when you created the Airflow integration runtime. ([You can reset the username or password by editing the Airflow integration runtime]() if needed.) |
| 112 | + |
| 113 | + :::image type="content" source="media/how-does-managed-airflow-work/login-in-dags.png" alt-text="Screenshot that shows login using the username-password provided during the Airflow Integration Runtime creation."::: |
| 114 | + |
| 115 | +## Remove DAGs from the Airflow environment |
| 116 | + |
| 117 | +If you are using Airflow version 1.x, to delete DAGs that are deployed on an Airflow environment (IR), you need to delete them in two different places: |
| 118 | + |
| 119 | +1. Delete the DAG from the Airflow UI. |
| 120 | +1. Delete the DAG in the ADF UI. |
| 121 | + |
| 122 | +> [!NOTE] |
| 123 | +> This is the current experience during the Public Preview, and we will be improving this experience. |
| 124 | + |
| 125 | +## Next steps |
| 126 | + |
| 127 | +* [Run an existing pipeline with Managed Airflow](tutorial-run-existing-pipeline-with-airflow.md) |
| 128 | +* [Refresh a Power BI dataset with Managed Airflow](tutorial-refresh-power-bi-dataset-with-airflow.md) |
| 129 | +* [Managed Airflow pricing](airflow-pricing.md) |
| 130 | +* [How to change the password for Managed Airflow environments](password-change-airflow.md) |