|
| 1 | +--- |
| 2 | +title: Enable Azure Key Vault for Managed Airflow |
| 3 | +titleSuffix: Azure Data Factory |
| 4 | +description: This article explains how to enable Azure Key Vault as the secret backend for a Managed Airflow instance. |
| 5 | +ms.service: data-factory |
| 6 | +ms.topic: how-to |
| 7 | +author: nabhishek |
| 8 | +ms.author: abnarain |
| 9 | +ms.date: 08/29/2023 |
| 10 | +--- |
| 11 | + |
| 12 | +# Enable Azure Key Vault for Managed Airflow |
| 13 | + |
| 14 | +[!INCLUDE[appliesto-adf-xxx-md](includes/appliesto-adf-xxx-md.md)] |
| 15 | + |
| 16 | +> [!NOTE] |
| 17 | +> Managed Airflow for Azure Data Factory relies on the open source Apache Airflow application. Documentation and more tutorials for Airflow can be found on the Apache Airflow [Documentation](https://airflow.apache.org/docs/) or [Community](https://airflow.apache.org/community/) pages. |
| 18 | + |
| 19 | +Apache Airflow provides a range of backends for storing sensitive information like variables and connections, including Azure Key Vault. This guide shows you how to configure Azure Key Vault as the secret backend for Apache Airflow, enabling you to store and manage your sensitive information in a secure and centralized manner. |
| 20 | + |
| 21 | +## Prerequisites |
| 22 | + |
| 23 | +- **Azure subscription** - If you don't have an Azure subscription, create a [free Azure account](https://azure.microsoft.com/free/) before you begin. |
| 24 | +- **Azure storage account** - If you don't have a storage account, see [Create an Azure storage account](/azure/storage/common/storage-account-create?tabs=azure-portal) for steps to create one. Ensure the storage account allows access only from selected networks. |
| 25 | +- **Azure Data Factory pipeline** - If you don't already have one, follow any of the tutorials to create a new data factory pipeline, or create one with a single selection in [Get started and try out your first data factory pipeline](quickstart-get-started.md). |
| 26 | +- **Azure Key Vault** - You can follow [this tutorial to create a new Azure Key Vault](/azure/key-vault/general/quick-create-portal) if you don’t have one. |
| 27 | +- **Service principal** - [Create a new service principal](/azure/active-directory/develop/howto-create-service-principal-portal) or use an existing one, and grant it permission to access Azure Key Vault. For example, grant the **Key Vault Contributor** role to the service principal name (SPN) for the key vault so that the SPN can manage it. You also need the service principal's **Client ID** and **Client Secret** (API Key) to add them as environment variables, as described later in this article. |
| 28 | + |
| 29 | +## Permissions |
| 30 | + |
| 31 | +Assign your SPN the following [built-in roles](/azure/role-based-access-control/built-in-roles) on your key vault: |
| 32 | + |
| 33 | +- Key Vault Contributor |
| 34 | +- Key Vault Secrets User |
| 35 | + |
| 36 | +## Enable the Azure Key Vault backend for a Managed Airflow instance |
| 37 | + |
| 38 | +Follow these steps to enable Azure Key Vault as the secret backend for your Managed Airflow instance. |
| 39 | + |
| 40 | +1. Navigate to the [Managed Airflow instance's integrated runtime (IR) environment](how-does-managed-airflow-work.md). |
| 41 | +1. Add the [**apache-airflow-providers-microsoft-azure**](https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/index.html) package to the **Airflow requirements** during your initial Airflow environment setup. |
| 42 | + |
| 43 | + :::image type="content" source="media/enable-azure-key-vault-for-managed-airflow/airflow-environment-setup.png" alt-text="Screenshot showing the Airflow Environment Setup window highlighting the Airflow requirements." lightbox="media/enable-azure-key-vault-for-managed-airflow/airflow-environment-setup.png"::: |
| 44 | + |
| 45 | +1. Add the following settings for the **Airflow configuration overrides** in integrated runtime properties: |
| 46 | + |
| 47 | + - **AIRFLOW__SECRETS__BACKEND**: "airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend" |
| 48 | + - **AIRFLOW__SECRETS__BACKEND_KWARGS**: "{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "vault_url": "**\<your keyvault uri\>**"}" |
| 49 | + |
| 50 | + :::image type="content" source="media/enable-azure-key-vault-for-managed-airflow/airflow-configuration-overrides.png" alt-text="Screenshot showing the configuration of the Airflow configuration overrides setting in the Airflow environment setup." lightbox="media/enable-azure-key-vault-for-managed-airflow/airflow-configuration-overrides.png"::: |
| 51 | + |
| 52 | +1. Add the following for the **Environment variables** configuration in the Airflow integrated runtime properties: |
| 53 | + |
| 54 | + - **AZURE_CLIENT_ID** = \<Client ID of SPN\> |
| 55 | + - **AZURE_TENANT_ID** = \<Tenant Id\> |
| 56 | + - **AZURE_CLIENT_SECRET** = \<Client Secret of SPN\> |
| 57 | + |
| 58 | + :::image type="content" source="media/enable-azure-key-vault-for-managed-airflow/environment-variables.png" alt-text="Screenshot showing the Environment variables section of the Airflow integrated runtime properties." lightbox="media/enable-azure-key-vault-for-managed-airflow/environment-variables.png"::: |
| 59 | + |
| 60 | +1. You can now use variables and connections stored in Azure Key Vault; Airflow retrieves them automatically. The names of connections and variables must follow the prefixes defined in AIRFLOW__SECRETS__BACKEND_KWARGS. For example, with the prefixes set previously, the variable `sample-variable` is stored as the secret `airflow-variables-sample-variable`. For more information, refer to [Azure Key Vault as a secret backend](https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/secrets-backends/azure-key-vault.html). |
| 61 | + |
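The two **Airflow configuration overrides** from the steps above can be generated instead of typed by hand, which helps avoid quoting mistakes in the JSON value. The following is a minimal sketch, not part of the Managed Airflow tooling: the helper name `build_overrides` and the vault URL `https://example-vault.vault.azure.net/` are hypothetical, and only the setting names, the backend class path, and the prefixes come from this article.

```python
import json

# Backend class path as given in the configuration override step.
BACKEND_CLASS = "airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend"


def build_overrides(vault_url):
    """Return the two configuration overrides as a dict of name -> string value.

    `build_overrides` is a hypothetical helper used for illustration only.
    """
    if not vault_url.startswith("https://"):
        raise ValueError("vault_url is expected to be an https:// URI")
    kwargs = {
        "connections_prefix": "airflow-connections",
        "variables_prefix": "airflow-variables",
        "vault_url": vault_url,
    }
    return {
        "AIRFLOW__SECRETS__BACKEND": BACKEND_CLASS,
        # json.dumps guarantees a valid JSON string with every value quoted.
        "AIRFLOW__SECRETS__BACKEND_KWARGS": json.dumps(kwargs),
    }


if __name__ == "__main__":
    # The vault URL below is a placeholder; substitute your own key vault URI.
    for name, value in build_overrides("https://example-vault.vault.azure.net/").items():
        print(f"{name}: {value}")
```

Copy the printed values into the corresponding fields of the integrated runtime properties.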
| 62 | +## Sample DAG using Azure Key Vault as the backend |
| 63 | + |
| 64 | +1. Create a new Python file **adf.py** with the following contents: |
| 65 | + |
| 66 | + ```python |
| 67 | + from datetime import datetime, timedelta |
| 68 | + from airflow.operators.python import PythonOperator |
| 69 | + from textwrap import dedent |
| 70 | + from airflow.models import Variable |
| 71 | + from airflow import DAG |
| 72 | + import logging |
| 73 | + |
| 74 | + def retrieve_variable_from_akv(): |
| 75 | + variable_value = Variable.get("sample-variable") |
| 76 | + logger = logging.getLogger(__name__) |
| 77 | + logger.info(variable_value) |
| 78 | + |
| 79 | + with DAG( |
| 80 | + "tutorial", |
| 81 | + default_args={ |
| 82 | + "depends_on_past": False, |
| 83 | + |
| 84 | + "email_on_failure": False, |
| 85 | + "email_on_retry": False, |
| 86 | + "retries": 1, |
| 87 | + "retry_delay": timedelta(minutes=5), |
| 88 | + }, |
| 89 | + description="A simple tutorial DAG", |
| 90 | + schedule_interval=timedelta(days=1), |
| 91 | + start_date=datetime(2021, 1, 1), |
| 92 | + catchup=False, |
| 93 | + tags=["example"], |
| 94 | + ) as dag: |
| 95 | + |
| 96 | + get_variable_task = PythonOperator( |
| 97 | + task_id="get_variable", |
| 98 | + python_callable=retrieve_variable_from_akv, |
| 99 | + ) |
| 100 | + |
| 101 | + get_variable_task |
| 102 | + ``` |
| 103 | + |
| 104 | +1. Store the variables and connections in Azure Key Vault. For more information, see [Store credentials in Azure Key Vault](store-credentials-in-key-vault.md). |
| 105 | + |
| 106 | + :::image type="content" source="media/enable-azure-key-vault-for-managed-airflow/secrets-configuration.png" alt-text="Screenshot showing the configuration of secrets in Azure Key Vault." lightbox="media/enable-azure-key-vault-for-managed-airflow/secrets-configuration.png"::: |
| 107 | + |
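To work out which secret name the sample DAG expects in the key vault, the naming rule can be sketched in a few lines. This sketch assumes the provider joins the configured prefix and the Airflow key with a hyphen and replaces underscores with hyphens; the function name `key_vault_secret_name` and the connection ID `my_conn` are hypothetical, so verify the rule against the provider documentation for your installed version.

```python
def key_vault_secret_name(prefix, key, sep="-"):
    """Build the Key Vault secret name for an Airflow variable or connection.

    Assumed behavior: prefix and key are joined with `sep`, and underscores
    are replaced with `sep`. An empty prefix adds no separator.
    """
    name = f"{prefix}{sep}{key}" if prefix else key
    return name.replace("_", sep)


if __name__ == "__main__":
    # Variable read by the sample DAG through Variable.get("sample-variable").
    print(key_vault_secret_name("airflow-variables", "sample-variable"))
    # Hypothetical connection ID, shown only to illustrate underscore handling.
    print(key_vault_secret_name("airflow-connections", "my_conn"))
```

Under these assumptions, the sample DAG reads its value from a secret named `airflow-variables-sample-variable`.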
| 108 | +## Next steps |
| 109 | + |
| 110 | +- [Run an existing pipeline with Managed Airflow](tutorial-run-existing-pipeline-with-airflow.md) |
| 111 | +- [Managed Airflow pricing](airflow-pricing.md) |
| 112 | +- [How to change the password for Managed Airflow environments](password-change-airflow.md) |