Commit 450ed92

mobuchowski, claude, and cswatt authored
add MWAA upgrade doc to Data Observability Airflow docs (#34578)
* add MWAA upgrade doc to Data Observability Airflow docs

Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>

* change wording

Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>

* add 2.7.2 fallback

Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>

* code review updates

Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>

* reformat numbers

* some table formatting

* table tweaks

---------

Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: cswatt <cecilia.watt@datadoghq.com>
1 parent 5a7777c commit 450ed92

File tree

4 files changed

+252
-7
lines changed

content/en/data_observability/jobs_monitoring/airflow.md

Lines changed: 11 additions & 7 deletions
@@ -149,6 +149,8 @@ Set `OPENLINEAGE_CLIENT_LOGGING` to `DEBUG` along with the other environment var
149149
150150
### Setup
151151
152+
<div class="alert alert-info"><strong>If you are using Airflow 2.7.2, 2.8.1, or 2.9.2</strong>: MWAA default constraints pin older <code>apache-airflow-providers-openlineage</code> versions. These versions have known issues that can degrade the Data Observability experience. To upgrade to provider versions with fixes, see <a href="/data_observability/jobs_monitoring/airflow_mwaa_upgrade/">Upgrade OpenLineage Provider on Amazon MWAA for Airflow 2.7.2, 2.8.1, or 2.9.2</a>.</div>
153+
152154
To get started, follow the instructions below.
153155
154156
1. Install the `openlineage` provider by adding the following to your `requirements.txt` file:
@@ -157,7 +159,7 @@ To get started, follow the instructions below.
157159
apache-airflow-providers-openlineage
158160
```
159161
160-
2. Configure the `openlineage` provider. The simplest option is to set the following environment variables in your [Amazon MWAA start script][3]:
162+
1. Configure the `openlineage` provider. The simplest option is to set the following environment variables in your [Amazon MWAA start script][3]:
161163
162164
```shell
163165
#!/bin/sh
@@ -179,11 +181,11 @@ To get started, follow the instructions below.
179181

180182
See the official [configuration-openlineage][4] documentation for other supported configurations of the `openlineage` provider.
181183

182-
3. Deploy your updated `requirements.txt` and [Amazon MWAA startup script][3] to your Amazon S3 folder configured for your Amazon MWAA Environment.
184+
1. Deploy your updated `requirements.txt` and [Amazon MWAA startup script][3] to your Amazon S3 folder configured for your Amazon MWAA Environment.
183185

184-
4. Optionally, set up Log Collection for correlating task logs to DAG run executions in DJM:
185-
1. Configure Amazon MWAA to [send logs to CloudWatch][8].
186-
2. [Send the logs to Datadog][9].
186+
1. Optionally, set up Log Collection for correlating task logs to DAG run executions in DJM:
187+
1. Configure Amazon MWAA to [send logs to CloudWatch][9].
188+
2. [Send the logs to Datadog][10].
187189

188190
[1]: https://github.com/apache/airflow/releases/tag/2.7.0
189191
[2]: https://airflow.apache.org/docs/apache-airflow-providers-openlineage/stable/index.html
@@ -192,8 +194,10 @@ To get started, follow the instructions below.
192194
[5]: https://docs.datadoghq.com/account_management/api-app-keys/#api-keys
193195
[6]: https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-create-role.html
194196
[7]: https://app.datadoghq.com/data-jobs/
195-
[8]: https://docs.aws.amazon.com/mwaa/latest/userguide/monitoring-airflow.html#monitoring-airflow-enable
196-
[9]: /integrations/amazon_web_services/?tab=roledelegation#log-collection
197+
[8]: https://openlineage.io/docs/integrations/airflow/
198+
[9]: https://docs.aws.amazon.com/mwaa/latest/userguide/monitoring-airflow.html#monitoring-airflow-enable
199+
[10]: /integrations/amazon_web_services/?tab=roledelegation#log-collection
200+
[11]: /data_observability/jobs_monitoring/airflow_mwaa_upgrade/
197201

198202
### Validation
199203

Lines changed: 241 additions & 0 deletions
@@ -0,0 +1,241 @@
1+
---
2+
title: "Upgrade OpenLineage Provider on Amazon MWAA for Airflow 2.7.2, 2.8.1, or 2.9.2"
3+
description: "Resolve dependency conflicts when installing apache-airflow-providers-openlineage on Amazon MWAA Airflow 2.7.2, 2.8.1, or 2.9.2."
4+
further_reading:
5+
- link: '/data_observability/jobs_monitoring/airflow/?tab=amazonmwaa'
6+
tag: 'Documentation'
7+
text: 'Enable Data Observability: Jobs Monitoring for Apache Airflow'
8+
---
9+
10+
## Overview
11+
12+
Use this guide if your Amazon MWAA environment runs Airflow `2.7.2`, `2.8.1`, or `2.9.2`.
13+
14+
For these Airflow versions, MWAA default constraints pin older versions of `apache-airflow-providers-openlineage` and the OpenLineage packages. These versions have known reliability and compatibility issues that can degrade the Data Observability experience.
15+
16+
To use provider versions that fix these issues, update constraints and requirements together. Amazon MWAA enforces package constraints for each Airflow and Python version, so upgrading OpenLineage packages can conflict with MWAA defaults unless you provide custom constraints.
17+
18+
For details about MWAA dependency and constraints behavior, see AWS documentation on [Python dependencies][3]. To review provider compatibility and requirements, see [OpenLineage provider documentation][4].
19+
20+
For base setup steps, see [Enable Data Observability: Jobs Monitoring for Apache Airflow][1].
21+
22+
## Requirements
23+
24+
- Access to the Amazon S3 bucket configured for your MWAA environment
25+
- Permission to update `requirements.txt` and MWAA environment configuration
26+
27+
## Recommended versions
28+
29+
The following table shows the default versions pinned by MWAA constraints and the recommended upgrade versions for each Airflow version:
30+
31+
| Package | Airflow 2.7.2 | Airflow 2.8.1 | Airflow 2.9.2 |
32+
|---|---|---|---|
33+
| `apache-airflow-providers-openlineage` | Default: 1.1.0 <br/> Upgrade: **1.14.0** - [Datadog-patched wheel][6] | Default: 1.4.0 <br/> Upgrade: **1.14.0** | Default: 1.8.0 <br/> Upgrade: **2.2.0** |
34+
| `apache-airflow-providers-common-sql` | Default: 1.7.2 <br/> Upgrade: no change | Default: 1.10.0 <br/> Upgrade: **1.20.0** | Default: 1.14.0 <br/> Upgrade: **1.21.0** |
35+
| `apache-airflow-providers-common-compat` | Default: n/a <br/> Upgrade: **1.2.2** - [Datadog-patched wheel][7] | Default: n/a <br/> Upgrade: **1.2.1** | Default: n/a <br/> Upgrade: **1.4.0** |
36+
| `openlineage-integration-common` | Default: 1.3.1 <br/> Upgrade: **1.24.2** | Default: 1.7.0 <br/> Upgrade: **1.24.2** | Default: 1.16.0 <br/> Upgrade: **1.31.0** |
37+
| `openlineage-python` | Default: 1.3.1 <br/> Upgrade: **1.24.2** | Default: 1.7.0 <br/> Upgrade: **1.24.2** | Default: 1.16.0 <br/> Upgrade: **1.31.0** |
38+
| `openlineage-sql` | Default: 1.3.1 <br/> Upgrade: **1.24.2** | Default: 1.7.0 <br/> Upgrade: **1.24.2** | Default: 1.16.0 <br/> Upgrade: **1.31.0** |
39+
40+
## Update constraints and requirements
41+
42+
{{< tabs >}}
43+
{{% tab "Airflow 2.7.2" %}}
44+
45+
Airflow 2.7.2 is not compatible with the upstream `apache-airflow-providers-openlineage` 1.14.0 package. Datadog provides patched versions of the provider and `common-compat` packages with relaxed Airflow version constraints for use on MWAA 2.7.2.
46+
47+
1. Download the patched wheel files and upload them to your MWAA S3 bucket:
48+
49+
```shell
50+
MWAA_BUCKET=<MWAA_BUCKET_NAME>
51+
curl -Lo apache_airflow_providers_openlineage-1.14.0-py3-none-any.whl \
52+
"https://docs.datadoghq.com/resources/whl/apache_airflow_providers_openlineage-1.14.0-py3-none-any.whl"
53+
curl -Lo apache_airflow_providers_common_compat-1.2.2-py3-none-any.whl \
54+
"https://docs.datadoghq.com/resources/whl/apache_airflow_providers_common_compat-1.2.2-py3-none-any.whl"
55+
aws s3 cp apache_airflow_providers_openlineage-1.14.0-py3-none-any.whl "s3://${MWAA_BUCKET}/dags/"
56+
aws s3 cp apache_airflow_providers_common_compat-1.2.2-py3-none-any.whl "s3://${MWAA_BUCKET}/dags/"
57+
```
58+
59+
2. Download the Airflow constraints file:
60+
61+
```shell
62+
curl -o constraints.txt \
63+
"https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.11.txt"
64+
```
65+
66+
**Note**: Check the Python version for your MWAA environment. If your environment uses a different Python version, replace `3.11` in the URL.
67+
68+
3. Edit `constraints.txt` and update these package pins:
69+
70+
```text
71+
apache-airflow-providers-openlineage==1.14.0
72+
openlineage-integration-common==1.24.2
73+
openlineage-python==1.24.2
74+
openlineage-sql==1.24.2
75+
```
76+
77+
Also add the following line, since it is not present in the default constraints file:
78+
79+
```text
80+
apache-airflow-providers-common-compat==1.2.2
81+
```
82+
83+
No change is needed for `apache-airflow-providers-common-sql` (the default `1.7.2` is compatible).
84+
85+
4. Upload the updated constraints file to your MWAA S3 bucket:
86+
87+
```shell
88+
aws s3 cp constraints.txt "s3://${MWAA_BUCKET}/dags/constraints.txt"
89+
```
90+
91+
5. Update `requirements.txt` to reference the patched wheels and constraints:
92+
93+
```text
94+
--constraint /usr/local/airflow/dags/constraints.txt
95+
/usr/local/airflow/dags/apache_airflow_providers_openlineage-1.14.0-py3-none-any.whl
96+
/usr/local/airflow/dags/apache_airflow_providers_common_compat-1.2.2-py3-none-any.whl
97+
openlineage-integration-common==1.24.2
98+
openlineage-python==1.24.2
99+
openlineage-sql==1.24.2
100+
```
101+
102+
6. Upload the updated `requirements.txt` file:
103+
104+
```shell
105+
aws s3 cp requirements.txt "s3://${MWAA_BUCKET}/requirements.txt"
106+
```
107+
108+
{{% /tab %}}
109+
{{% tab "Airflow 2.8.1" %}}
110+
111+
1. Download the Airflow constraints file:
112+
113+
```shell
114+
curl -o constraints.txt \
115+
"https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.11.txt"
116+
```
117+
118+
**Note**: Check the Python version for your MWAA environment. If your environment uses a different Python version, replace `3.11` in the URL.
119+
120+
2. Edit `constraints.txt` and update these package pins:
121+
122+
```text
123+
apache-airflow-providers-openlineage==1.14.0
124+
apache-airflow-providers-common-sql==1.20.0
125+
openlineage-integration-common==1.24.2
126+
openlineage-python==1.24.2
127+
openlineage-sql==1.24.2
128+
```
129+
130+
Also add the following line, since it is not present in the default constraints file:
131+
132+
```text
133+
apache-airflow-providers-common-compat==1.2.1
134+
```
135+
136+
3. Upload the updated constraints file to your MWAA S3 bucket:
137+
138+
```shell
139+
MWAA_BUCKET=<MWAA_BUCKET_NAME>
140+
aws s3 cp constraints.txt "s3://${MWAA_BUCKET}/dags/constraints.txt"
141+
```
142+
143+
4. Update `requirements.txt` to reference the constraints file and pin the upgraded package versions:
144+
145+
```text
146+
--constraint /usr/local/airflow/dags/constraints.txt
147+
apache-airflow-providers-openlineage==1.14.0
148+
apache-airflow-providers-common-sql==1.20.0
149+
apache-airflow-providers-common-compat==1.2.1
150+
openlineage-integration-common==1.24.2
151+
openlineage-python==1.24.2
152+
openlineage-sql==1.24.2
153+
```
154+
155+
5. Upload the updated `requirements.txt` file:
156+
157+
```shell
158+
aws s3 cp requirements.txt "s3://${MWAA_BUCKET}/requirements.txt"
159+
```
160+
161+
{{% /tab %}}
162+
{{% tab "Airflow 2.9.2" %}}
163+
164+
1. Download the Airflow constraints file:
165+
166+
```shell
167+
curl -o constraints.txt \
168+
"https://raw.githubusercontent.com/apache/airflow/constraints-2.9.2/constraints-3.12.txt"
169+
```
170+
171+
**Note**: Check the Python version for your MWAA environment. If your environment uses a different Python version, replace `3.12` in the URL.
172+
173+
2. Edit `constraints.txt` and update these package pins:
174+
175+
```text
176+
apache-airflow-providers-openlineage==2.2.0
177+
apache-airflow-providers-common-sql==1.21.0
178+
openlineage-integration-common==1.31.0
179+
openlineage-python==1.31.0
180+
openlineage-sql==1.31.0
181+
```
182+
183+
Also add the following line, since it is not present in the default constraints file:
184+
185+
```text
186+
apache-airflow-providers-common-compat==1.4.0
187+
```
188+
189+
3. Upload the updated constraints file to your MWAA S3 bucket:
190+
191+
```shell
192+
MWAA_BUCKET=<MWAA_BUCKET_NAME>
193+
aws s3 cp constraints.txt "s3://${MWAA_BUCKET}/dags/constraints.txt"
194+
```
195+
196+
4. Update `requirements.txt` to reference the constraints file and pin the upgraded package versions:
197+
198+
```text
199+
--constraint /usr/local/airflow/dags/constraints.txt
200+
apache-airflow-providers-openlineage==2.2.0
201+
apache-airflow-providers-common-sql==1.21.0
202+
apache-airflow-providers-common-compat==1.4.0
203+
openlineage-integration-common==1.31.0
204+
openlineage-python==1.31.0
205+
openlineage-sql==1.31.0
206+
```
207+
208+
5. Upload the updated `requirements.txt` file:
209+
210+
```shell
211+
aws s3 cp requirements.txt "s3://${MWAA_BUCKET}/requirements.txt"
212+
```
213+
214+
{{% /tab %}}
215+
{{< /tabs >}}
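
The download and edit steps in the tabs above can also be scripted. The following is a minimal sketch, not part of the documented procedure: `constraints_url` and `set_pin` are hypothetical helper names, and the sketch assumes a POSIX shell with `awk` and `mktemp` available.

```shell
#!/bin/sh
# Hypothetical helpers; they are not part of MWAA or Airflow tooling.

# constraints_url AIRFLOW_VERSION PYTHON_VERSION
# Prints the upstream Airflow constraints URL for the given versions.
constraints_url() {
  printf 'https://raw.githubusercontent.com/apache/airflow/constraints-%s/constraints-%s.txt\n' "$1" "$2"
}

# set_pin FILE PACKAGE VERSION
# Replaces an existing "PACKAGE==..." line in FILE, or appends the pin
# when FILE does not list the package yet.
set_pin() {
  file=$1
  pkg=$2
  version=$3
  tmp=$(mktemp) || return 1
  awk -v prefix="${pkg}==" -v ver="$version" '
    index($0, prefix) == 1 { print prefix ver; found = 1; next }
    { print }
    END { if (!found) print prefix ver }
  ' "$file" > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, on Airflow 2.8.1 with Python 3.11, `curl -o constraints.txt "$(constraints_url 2.8.1 3.11)"` downloads the file, and `set_pin constraints.txt apache-airflow-providers-openlineage 1.14.0` updates one pin. Because `set_pin` appends missing packages, the same call also covers `apache-airflow-providers-common-compat`, which the default constraints file does not list.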
216+
217+
## Deploy and validate
218+
219+
1. In the MWAA console, update the environment so it picks up the new `requirements.txt`.
220+
2. Wait for the environment update to complete.
221+
3. Review MWAA install and startup logs. Confirm there are no dependency resolver errors.
222+
4. Trigger a DAG run that emits OpenLineage events.
223+
5. In Datadog, verify that runs appear on the [Data Observability: Jobs Monitoring][2] page.
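
Waiting for the environment update (step 2 above) can be done from a terminal instead of the console. This is a sketch under assumptions: it relies on the AWS CLI being installed and configured, on `aws mwaa get-environment` reporting the state in `Environment.Status`, and on the status names `AVAILABLE`, `UPDATE_FAILED`, `CREATE_FAILED`, and `UNAVAILABLE`. The function name `wait_for_environment` and the `POLL_SECONDS` variable are hypothetical.

```shell
#!/bin/sh
# wait_for_environment ENV_NAME
# Hypothetical helper: polls the MWAA environment status until the update
# finishes. Returns 0 on AVAILABLE, 1 on a failed state or a CLI error.
wait_for_environment() {
  env_name=$1
  while :; do
    status=$(aws mwaa get-environment --name "$env_name" \
      --query 'Environment.Status' --output text) || return 1
    case $status in
      AVAILABLE) return 0 ;;
      UPDATE_FAILED|CREATE_FAILED|UNAVAILABLE)
        echo "Environment $env_name is in state $status" >&2
        return 1 ;;
    esac
    # POLL_SECONDS is an assumed, optional override for the polling interval.
    sleep "${POLL_SECONDS:-60}"
  done
}
```

A call such as `wait_for_environment <MWAA_ENVIRONMENT_NAME>` blocks until the environment leaves its transitional state. Reviewing the install and startup logs (step 3) is still needed, because a successful update does not by itself confirm that the expected package versions were installed.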
224+
225+
## Troubleshooting
226+
227+
- If dependency resolution still fails, verify that the same package versions are pinned in both `constraints.txt` and `requirements.txt`.
228+
- If MWAA installs older versions, verify that `requirements.txt` contains the line `--constraint /usr/local/airflow/dags/constraints.txt` and that `constraints.txt` exists in the `dags/` path of your S3 bucket.
229+
- If uploads or reads fail, verify that the MWAA execution role has access to the required S3 objects. For details, see [Amazon MWAA execution role][5].
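
The first check in the list above (matching pins in both files) can be run locally before uploading. The following is a minimal sketch; `check_pins` is a hypothetical helper name, and the sketch assumes both files are in the current directory and that pins use the exact `package==version` form shown in this guide.

```shell
#!/bin/sh
# check_pins REQUIREMENTS CONSTRAINTS
# Hypothetical helper: prints every "package==version" line from REQUIREMENTS
# that has no identical line in CONSTRAINTS, and returns non-zero if any exist.
# Lines such as "--constraint ..." and wheel file paths are ignored.
check_pins() {
  reqs=$1
  cons=$2
  missing=$(grep -E '^[A-Za-z0-9._-]+==' "$reqs" | grep -vxFf "$cons" || true)
  if [ -n "$missing" ]; then
    printf 'Not pinned identically in %s:\n%s\n' "$cons" "$missing"
    return 1
  fi
}
```

Running `check_pins requirements.txt constraints.txt` prints nothing and returns 0 when every pin matches. For Airflow 2.7.2, the two patched wheels are referenced by path in `requirements.txt`, so this sketch does not compare them against the constraints file.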
230+
231+
## If you manage a custom image
232+
233+
If your team manages package versions through a custom Airflow image, update package versions in the image definition first, then regenerate `requirements.txt` from the built image so constraints and requirements stay aligned.
234+
235+
[1]: /data_observability/jobs_monitoring/airflow/?tab=amazonmwaa
236+
[2]: https://app.datadoghq.com/data-jobs/
237+
[3]: https://docs.aws.amazon.com/mwaa/latest/userguide/connections-packages.html
238+
[4]: https://airflow.apache.org/docs/apache-airflow-providers-openlineage/stable/index.html
239+
[5]: https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-create-role.html
240+
[6]: /resources/whl/apache_airflow_providers_openlineage-1.14.0-py3-none-any.whl
241+
[7]: /resources/whl/apache_airflow_providers_common_compat-1.2.2-py3-none-any.whl
Binary file not shown.
Binary file not shown.
