This project demonstrates how to create an Amazon MWAA environment that uses AWS CodeArtifact for Python dependencies. This lets you avoid giving MWAA internet access through a NAT Gateway, which reduces the cost of your infrastructure.
An AWS Lambda function runs every 10 hours to obtain an authorization token for AWS CodeArtifact, which is then used to build the index-url for the pip remote repository (the CodeArtifact repository). The generated index-url is saved to a codeartifact.txt file, which is then uploaded to an Amazon S3 bucket. MWAA fetches the DAGs and codeartifact.txt at runtime and installs Python dependencies from the CodeArtifact repository.
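The token refresh described above could look roughly like the following. This is a minimal sketch, not the handler shipped in lambda/: the environment variable names (`CA_DOMAIN`, `CA_DOMAIN_OWNER`, `CA_REPOSITORY`, `BUCKET_NAME`), the 12-hour token lifetime, and the `--index-url=` line format written to codeartifact.txt are assumptions made for illustration.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def build_index_url(repository_endpoint: str, token: str) -> str:
    """Embed the CodeArtifact token into the pip index URL (user is always 'aws')."""
    parts = urlsplit(repository_endpoint)
    netloc = f"aws:{token}@{parts.netloc}"
    path = parts.path.rstrip("/") + "/simple/"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


def render_codeartifact_txt(index_url: str) -> str:
    """Assumed file format: a single pip option line."""
    return f"--index-url={index_url}\n"


def handler(event, context):
    # boto3 is imported lazily so the helpers above work without it installed.
    import boto3

    domain = os.environ["CA_DOMAIN"]            # hypothetical variable name
    owner = os.environ["CA_DOMAIN_OWNER"]       # hypothetical variable name
    repository = os.environ["CA_REPOSITORY"]    # hypothetical variable name
    bucket = os.environ["BUCKET_NAME"]

    codeartifact = boto3.client("codeartifact")
    token = codeartifact.get_authorization_token(
        domain=domain,
        domainOwner=owner,
        durationSeconds=43200,  # assumed: 12 h, longer than the 10 h schedule
    )["authorizationToken"]
    endpoint = codeartifact.get_repository_endpoint(
        domain=domain,
        domainOwner=owner,
        repository=repository,
        format="pypi",
    )["repositoryEndpoint"]

    body = render_codeartifact_txt(build_index_url(endpoint, token))
    boto3.client("s3").put_object(
        Bucket=bucket, Key="codeartifact.txt", Body=body.encode("utf-8")
    )
```

Keeping the token lifetime longer than the refresh interval means the URL in S3 should still be valid when the next run replaces it.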
.
├── infra/ // AWS CDK infrastructure
├── mwaa-ca-bucket-content/ // DAGs and requirements.txt
├── lambda/ // Lambda handler
├── .env // Environment variables
├── Makefile // Make rules for automation
Before moving on with the project deployment, complete the following checks:
- Install npm on your machine
- Install Python on your machine (Python 3.8 or higher)
- Ensure that AWS CLI is installed and configured on your machine
- Ensure that AWS CDK is installed and configured on your machine
NOTE: ℹ️ This project uses CDK v2, which requires Node.js 18.x or later.
To create a virtual environment run the following make rule:
# from the root directory
$ make venv
This rule will create a virtual environment in infra/venv and install all the necessary dependencies.
Set the following environment variables in the .env file:
- AWS_REGION: AWS region to which you wish to deploy this project
- BUCKET_NAME: choose a unique name for an Amazon S3 bucket that will contain Airflow DAGs
- AIRFLOW_VERSION: Apache Airflow version (recommended: 2.10.3 or latest supported version)
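Before deploying, it can help to confirm that all three variables are set. The sketch below is a hypothetical helper that is not part of this repository, and it assumes the .env file uses plain `KEY=VALUE` lines.

```python
from typing import Dict, List

# The three variables this project expects in .env.
REQUIRED_KEYS = ("AWS_REGION", "BUCKET_NAME", "AIRFLOW_VERSION")


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when the file is complete.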
Execute the deploy rule to deploy the infrastructure:
# from the root directory
$ make deploy
NOTE: When prompted to confirm the deployment, type y and press Enter.
To destroy all resources created for this project, execute the destroy rule:
# from the root directory
$ make destroy
NOTE: When prompted to confirm the deletion, type y and press Enter.
To install your preferred Python dependencies in your MWAA environment, update the requirements.txt file and upload it to the S3 bucket. For these changes to take effect, update your MWAA environment to use the new version of requirements.txt. You can do so in the AWS Console or via the AWS CLI.
Upload requirements.txt with new Python dependencies:
aws s3 cp mwaa-ca-bucket-content/requirements.txt s3://YOUR-BUCKET-NAME/
To get requirements.txt versions run:
aws s3api list-object-versions --bucket YOUR-BUCKET-NAME --prefix requirements.txt
Finally, update your MWAA environment with a new version of requirements.txt:
aws mwaa update-environment --name mwaa_codeartifact_env --requirements-s3-object-version OBJECT_VERSION
If you build your own Python packages, you could also add this process to update requirements.txt and MWAA environment as part of your release pipeline.
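For a release pipeline, the version lookup and environment update could be scripted with boto3 instead of the CLI. The following is a rough sketch under a few assumptions: requirements.txt has already been uploaded, bucket versioning is enabled, and the function names are illustrative rather than part of this project.

```python
from typing import Optional


def latest_version_id(listing: dict, key: str) -> Optional[str]:
    """Pick the current VersionId for `key` from a list_object_versions response."""
    for version in listing.get("Versions", []):
        if version.get("Key") == key and version.get("IsLatest"):
            return version.get("VersionId")
    return None


def update_requirements(bucket: str, environment: str,
                        key: str = "requirements.txt") -> str:
    """Point the MWAA environment at the newest requirements.txt version."""
    # boto3 is imported lazily so latest_version_id can be used without it.
    import boto3

    listing = boto3.client("s3").list_object_versions(Bucket=bucket, Prefix=key)
    version_id = latest_version_id(listing, key)
    if version_id is None:
        raise RuntimeError(f"No current version of {key} found in {bucket}")

    boto3.client("mwaa").update_environment(
        Name=environment,
        RequirementsS3ObjectVersion=version_id,
    )
    return version_id
```

Calling `update_requirements("YOUR-BUCKET-NAME", "mwaa_codeartifact_env")` would correspond to the last two CLI commands above.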
This project has been updated to use current AWS tools and runtimes:
- CDK v2: Migrated from deprecated CDK v1 to CDK v2
- Python 3.12: Updated Lambda runtime from deprecated Python 3.7 to Python 3.12
- Airflow 2.10.3: Updated to latest supported Airflow version
- Current Operators: Updated Airflow DAG to use current operators instead of deprecated ones
- Modern CDK Patterns: Updated VPC configuration to use current subnet types and IP address configuration
This library is licensed under the MIT-0 License. See the LICENSE file.
