---
title: Install a private package
description: This article provides step-by-step instructions on how to install a private package in a Managed Airflow environment.
author: nabhishek
ms.author: abnarain
ms.reviewer: jburchel
ms.service: data-factory
ms.topic: how-to
ms.date: 09/23/2023
---

# Install a private package

[!INCLUDE[appliesto-adf-xxx-md](includes/appliesto-adf-xxx-md.md)]

A Python package is a way to organize related Python modules into a single directory hierarchy. A package is typically represented as a directory that contains a special file called `__init__.py`. Inside a package directory, you can have multiple Python module files (.py files) that define functions, classes, and variables. In the context of Managed Airflow, you can create packages to add your custom code.
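
For example, a minimal package might look like this (the directory and module names are illustrative):

```python
# Layout of a minimal Python package:
#
#   airflow_operator/
#       __init__.py          # marks the directory as a Python package
#       sample_operator.py   # a module defining functions and classes
#
# With this layout on the Python path, the module imports as:
from airflow_operator import sample_operator
```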

This guide provides step-by-step instructions on installing a `.whl` (wheel) file, which serves as a binary distribution format for a Python package, in your Managed Airflow runtime.

For illustration purposes, this guide creates a simple custom operator as a Python package that can be imported as a module inside a DAG file.

### Step 1: Develop a custom operator and a file to test it.
- Create a file named `sample_operator.py`.
```python
from airflow.models.baseoperator import BaseOperator


class SampleOperator(BaseOperator):
    """A minimal custom operator that greets the name passed to it."""

    def __init__(self, name: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # The return value is pushed to XCom by default.
        message = f"Hello {self.name}"
        return message
```

- To create a Python package for this file, refer to the guide [Creating a package in Python](https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/modules_management.html#creating-a-package-in-python). A minimal packaging sketch also appears after this list.

- Create a DAG file, `sample_dag.py`, to test the operator defined in Step 1.
```python
from datetime import datetime
from airflow import DAG

from airflow_operator.sample_operator import SampleOperator


with DAG(
    "test-custom-package",
    tags=["example"],
    description="A simple tutorial DAG",
    schedule_interval=None,
    start_date=datetime(2021, 1, 1),
) as dag:
    task = SampleOperator(task_id="sample-task", name="foo_bar")

    task
```
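
The packaging guide linked above covers the details; as a minimal sketch (an illustration, not the only possible layout), a `setup.py` placed next to the package directory is enough. Here the `airflow_operator` directory is assumed to contain `__init__.py` and `sample_operator.py`, matching the import in `sample_dag.py`:

```python
# setup.py -- minimal packaging sketch (layout is illustrative):
#
#   setup.py
#   airflow_operator/
#       __init__.py
#       sample_operator.py
from setuptools import setup, find_packages

setup(
    name="airflow-operator",
    version="0.0.1",
    packages=find_packages(),  # discovers the airflow_operator package
)
```

Running `pip wheel . -w dist` (or `python -m build --wheel`, if the `build` package is installed) in that directory produces the `.whl` file under `dist/` that you upload in the following steps.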

### Step 2: Create a storage container.

Use the steps described in [Manage blob containers using the Azure portal](/azure/storage/blobs/blob-containers-portal) to create a storage container where you upload your DAG and package files.

### Step 3: Upload the private package into your storage account.

1. Navigate to the designated container where you intend to store your Airflow DAGs and plugins files.
1. Upload your private package file to the container. Common file formats include `.zip`, `.whl`, and `.tar.gz`. Place the file within either the **Dags** or **Plugins** folder, as appropriate.

### Step 4: Add your private package as a requirement.

Add your private package as a requirement in the requirements.txt file; add this file if it doesn't already exist. If you use Git sync, you need to add all the requirements in the UI itself. Prepend the path prefix that matches where the package is stored, as shown in the example after this list.

- **Blob Storage:**
Prepend the prefix **/opt/airflow/** to the package path. For instance, if your private package resides at **/dags/test/private.whl**, your requirements.txt file should include the requirement **/opt/airflow/dags/test/private.whl**.

- **Git sync:**
For all Git services, prepend **/opt/airflow/git/`<repoName>`.git/** to the package path. For example, if your private package is at **/dags/test/private.whl** in a GitHub repository, add the requirement **/opt/airflow/git/`<repoName>`.git/dags/test/private.whl** to the Airflow environment.

- **Azure DevOps (ADO):**
For ADO, prepend **/opt/airflow/git/`<repoName>`/** to the package path.
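
For example, using the Blob Storage path above, the requirements.txt file might contain an entry like this (the path is illustrative; list any other Python dependencies on their own lines):

```
# Private package uploaded to the storage container
/opt/airflow/dags/test/private.whl
```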

### Step 5: Import your folder to an Airflow integrated runtime (IR) environment.

When importing your folder into an Airflow IR environment, select the **Import requirements** checkbox to load your requirements into your Airflow environment.

:::image type="content" source="media/airflow-create-private-requirement-package/import-requirements-checkbox.png" alt-text="Screenshot showing the import dialog for an Airflow integrated runtime environment, with the Import requirements checkbox checked.":::

:::image type="content" source="media/airflow-create-private-requirement-package/import-requirements-airflow-environment.png" alt-text="Screenshot showing the imported requirements dialog in an Airflow integrated runtime environment." lightbox="media/airflow-create-private-requirement-package/import-requirements-airflow-environment.png":::

### Step 6: Run your DAG to verify that the import succeeded.

In the Airflow UI, run the DAG file created in Step 1. If the import succeeded, the task completes and the operator's return value, `Hello foo_bar`, appears in the task's XCom output.

## Next steps

- [What is Azure Data Factory Managed Airflow?](concept-managed-airflow.md)
- [Run an existing pipeline with Airflow](tutorial-run-existing-pipeline-with-airflow.md)