Description
Apache Airflow version
3.0.3
If "Other Airflow 2 version" selected, which one?
No response
What happened?
When OpenTelemetry tracing is enabled in Airflow 3.0.2, we see memory leaks in the schedulers and triggerers.
We tested this by deploying three schedulers:
- scheduler-1 with otel-metrics setup (AIRFLOW__METRICS__OTEL_ON=true, etc.)
- scheduler-2 with otel-traces setup (AIRFLOW__TRACES__OTEL_ON=true, etc.)
- scheduler-3 without otel setup (no metrics, no traces)
These schedulers have identical DAG files and run in the same k8s namespace.
We found that only scheduler-2, the one with the otel-traces setup, shows the typical memory leak trend in the Grafana dashboard (I can't post the screenshot here, sorry :( ).
After some digging, we found that the issue has already been reported and fixed in the opentelemetry-python repo.
- Upstream issues:
  - Unable to release memory: open-telemetry/opentelemetry-python#4220 (otel 1.27.0)
  - Fix memory leak in exporter / BatchLogRecordProcessor objects are not able to be garbage collected: open-telemetry/opentelemetry-python#4422 (otel 1.30.0)
- Fix released: opentelemetry-api==1.35.0 (latest) https://github.com/open-telemetry/opentelemetry-python/releases/tag/v1.35.0
- Current OTEL version in the Airflow constraints: 1.27.0
  - https://raw.githubusercontent.com/apache/airflow/constraints-3.0.2/constraints-3.12.txt (we are using Airflow 3.0.2)
  - https://raw.githubusercontent.com/apache/airflow/constraints-3.0.3/constraints-3.12.txt
...
opentelemetry-api==1.27.0
opentelemetry-exporter-otlp-proto-common==1.27.0
opentelemetry-exporter-otlp-proto-grpc==1.27.0
opentelemetry-exporter-otlp-proto-http==1.27.0
opentelemetry-exporter-otlp==1.27.0
opentelemetry-exporter-prometheus==0.48b0
opentelemetry-proto==1.27.0
opentelemetry-sdk==1.27.0
opentelemetry-semantic-conventions==0.48b0
...
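To make the pin check above repeatable, here is a minimal sketch that scans the text of a constraints file and lists the `opentelemetry-*` pins older than a chosen minimum. The function names (`parse_pins`, `release_tuple`, `pins_below`) are hypothetical helpers, not part of Airflow, and the `1.35.0` threshold is only an assumption taken from the "Fix released" line above. Packages versioned like `0.48b0` follow a separate numbering scheme, so the sketch skips them rather than comparing them against a `1.x` minimum.

```python
import re


def parse_pins(text, prefix="opentelemetry-"):
    """Return {package: version} for each `name==version` line whose name starts with prefix."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, version = line.partition("==")
        name, version = name.strip(), version.strip()
        if name.startswith(prefix):
            pins[name] = version
    return pins


def release_tuple(version):
    """Convert a purely numeric dotted version such as '1.27.0' to (1, 27, 0).

    Returns None for anything else (for example '0.48b0').
    """
    if not re.fullmatch(r"\d+(?:\.\d+)*", version):
        return None
    return tuple(int(part) for part in version.split("."))


def pins_below(pins, minimum):
    """Return the pins whose numeric version is strictly lower than `minimum`."""
    floor = release_tuple(minimum)
    outdated = {}
    for name, version in pins.items():
        parsed = release_tuple(version)
        # Beta-style versions (0.48b0) use a different scheme; leave them out.
        if parsed is not None and parsed < floor:
            outdated[name] = version
    return outdated


if __name__ == "__main__":
    # Excerpt copied from the constraints listing in this issue.
    excerpt = """
    opentelemetry-api==1.27.0
    opentelemetry-exporter-prometheus==0.48b0
    opentelemetry-sdk==1.27.0
    """
    for name, version in sorted(pins_below(parse_pins(excerpt), "1.35.0").items()):
        print(f"{name} is pinned to {version}")
```

The same functions can be pointed at the full contents of either constraints file linked above once it has been downloaded.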
Could you please consider upgrading the OpenTelemetry packages to the latest version in a future Airflow release to prevent this memory leak?
What you think should happen instead?
No response
How to reproduce
Airflow OTEL trace setup
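When reproducing, it may help to confirm which OpenTelemetry versions are actually installed in the scheduler image, since the installed packages can differ from the constraints file. The following is a minimal sketch using only the standard library; the `PACKAGES` list and the `installed_versions` helper are illustrative choices, not something Airflow provides.

```python
from importlib.metadata import PackageNotFoundError, version

# Illustrative subset of the packages listed in the constraints excerpt.
PACKAGES = [
    "opentelemetry-api",
    "opentelemetry-sdk",
    "opentelemetry-exporter-otlp",
]


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this inside the scheduler container shows whether the environment under test is still on the 1.27.0 pins.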
Operating System
debian 12
Versions of Apache Airflow Providers
No response
Deployment
Other Docker-based deployment
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct