@@ -9,8 +9,10 @@ Requirements
 ------------
 
 * `Apache Airflow <https://airflow.apache.org/>`_ 2.x or 3.x
-* OpenLineage 1.19.0 or higher, recommended 1.34.0+
+* OpenLineage 1.19.0 or higher, recommended 1.37.0+
 * OpenLineage integration for Airflow (see below)
+* Running :ref:`message-broker`
+* (Optional) :ref:`http2kafka`
 
 Entity mapping
 --------------
@@ -25,15 +27,27 @@ Install
 
 * For Airflow 2.7 or higher, use `apache-airflow-providers-openlineage <https://airflow.apache.org/docs/apache-airflow-providers-openlineage/stable/index.html>`_ 1.9.0 or higher:
 
-  .. code:: console
+  .. tabs::
 
-    $ pip install "apache-airflow-providers-openlineage>=2.3.0" "openlineage-python[kafka]>=1.34.0" zstd
+    .. code-tab:: console KafkaTransport
+
+      $ pip install "apache-airflow-providers-openlineage>=2.6.1" "openlineage-python[kafka]>=1.37.0" zstd
+
+    .. code-tab:: console HttpTransport (requires HTTP2Kafka)
+
+      $ pip install "apache-airflow-providers-openlineage>=2.6.1"
 
 * For Airflow 2.1.x-2.6.x, use `OpenLineage integration for Airflow <https://openlineage.io/docs/integrations/airflow/>`_ 1.19.0 or higher
 
-  .. code:: console
+  .. tabs::
+
+    .. code-tab:: console KafkaTransport
+
+      $ pip install "openlineage-airflow>=1.37.0" "openlineage-python[kafka]>=1.37.0" zstd
 
-    $ pip install "openlineage-airflow>=1.34.0" "openlineage-python[kafka]>=1.34.0" zstd
+    .. code-tab:: console HttpTransport (requires HTTP2Kafka)
+
+      $ pip install "openlineage-airflow>=1.37.0"
 
 Setup
 -----
@@ -43,19 +57,36 @@ Via OpenLineage config file
 
 * Create ``openlineage.yml`` file with content like:
 
-  .. code:: yaml
-
-    transport:
-      type: kafka
-      topic: input.runs
-      config:
-        bootstrap.servers: localhost:9093
-        security.protocol: SASL_PLAINTEXT
-        sasl.mechanism: SCRAM-SHA-256
-        sasl.username: data_rentgen
-        sasl.password: changeme
-        compression.type: zstd
-        acks: all
+  .. tabs::
+
+    .. code-tab:: yaml KafkaTransport
+
+      transport:
+        type: kafka
+        topic: input.runs
+        config:
+          # should be accessible from Airflow scheduler
+          bootstrap.servers: localhost:9093
+          security.protocol: SASL_PLAINTEXT
+          sasl.mechanism: SCRAM-SHA-256
+          # Kafka auth credentials
+          sasl.username: data_rentgen
+          sasl.password: changeme
+          compression.type: zstd
+          acks: all
+
+    .. code-tab:: yaml HttpTransport (requires HTTP2Kafka)
+
+      transport:
+        type: http
+        # http2kafka URL, should be accessible from Airflow scheduler
+        url: http://localhost:8002
+        endpoint: /v1/openlineage
+        compression: gzip
+        auth:
+          type: api_key
+          # create a PersonalToken, and pass it here
+          apiKey: personal_token_AAAAAAAAAAAA.BBBBBBBBBBBBBBBBBBBBBBB.CCCCCCCCCCCCCCCCCCCCC
 
 * Pass path to config file via ``AIRFLOW__OPENLINEAGE__CONFIG_PATH`` environment variable:
 
@@ -69,24 +100,45 @@ Via Airflow config file
 
 Setup OpenLineage integration using ``airflow.cfg`` config file:
 
-.. code:: ini
+.. tabs::
+
+  .. code-tab:: ini KafkaTransport
 
     [openlineage]
     # set here address of Airflow Web UI
     namespace = http://airflow.hostname.fqdn:8080
-    # set here Kafka connection address & credentials
+    # set here Kafka broker address & auth credentials
     transport = {"type": "kafka", "config": {"bootstrap.servers": "localhost:9093", "security.protocol": "SASL_PLAINTEXT", "sasl.mechanism": "SCRAM-SHA-256", "sasl.username": "data_rentgen", "sasl.password": "changeme", "compression.type": "zstd", "acks": "all"}, "topic": "input.runs", "flush": true}
 
+  .. code-tab:: ini HttpTransport (requires HTTP2Kafka)
+
+    [openlineage]
+    # set here address of Airflow Web UI
+    namespace = http://airflow.hostname.fqdn:8080
+    # set here HTTP2Kafka url & create PersonalToken
+    transport = {"type": "http", "url": "http://localhost:8002", "endpoint": "/v1/openlineage", "compression": "gzip", "auth": {"type": "api_key", "apiKey": "personal_token_AAAAAAAAAAAA.BBBBBBBBBBBBBBBBBBBBBBB.CCCCCCCCCCCCCCCCCCCCC"}}
 
 Via Airflow environment variables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Set environment variables for all Airflow components (e.g. via ``docker-compose.yml``)
+Set environment variables for all Airflow components (e.g. via ``docker-compose.yml``). Depending on your shell, you may remove single quotes
 
-.. code:: ini
+.. tabs::
+
+  .. code-tab:: bash KafkaTransport
+
+    # set here address of Airflow Web UI
+    AIRFLOW__OPENLINEAGE__NAMESPACE='http://airflow.hostname.fqdn:8080'
+    # set here Kafka broker address & auth credentials
+    AIRFLOW__OPENLINEAGE__TRANSPORT='{"type": "kafka", "config": {"bootstrap.servers": "localhost:9093", "security.protocol": "SASL_PLAINTEXT", "sasl.mechanism": "SCRAM-SHA-256", "sasl.username": "data_rentgen", "sasl.password": "changeme", "compression.type": "zstd", "acks": "all"}, "topic": "input.runs", "flush": true}'
+
+  .. code-tab:: bash HttpTransport (requires HTTP2Kafka)
+
+    # set here address of Airflow Web UI
+    AIRFLOW__OPENLINEAGE__NAMESPACE='http://airflow.hostname.fqdn:8080'
+    # set here HTTP2Kafka url & create PersonalToken
+    AIRFLOW__OPENLINEAGE__TRANSPORT='{"type": "http", "url": "http://localhost:8002", "endpoint": "/v1/openlineage", "compression": "gzip", "auth": {"type": "api_key", "apiKey": "personal_token_AAAAAAAAAAAA.BBBBBBBBBBBBBBBBBBBBBBB.CCCCCCCCCCCCCCCCCCCCC"}}'
 
-    AIRFLOW__OPENLINEAGE__NAMESPACE=http://airflow.hostname.fqdn:8080
-    AIRFLOW__OPENLINEAGE__TRANSPORT={"type": "kafka", "config": {"bootstrap.servers": "localhost:9093", "security.protocol": "SASL_PLAINTEXT", "sasl.mechanism": "SCRAM-SHA-256", "sasl.username": "data_rentgen", "sasl.password": "changeme", "compression.type": "zstd", "acks": "all"}, "topic": "input.runs", "flush": true}
 
 Airflow 2.1.x and 2.2.x
 ~~~~~~~~~~~~~~~~~~~~~~~
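
The one-line JSON values for ``transport`` in this diff are easy to mistype when edited by hand. As a minimal sketch, assuming the KafkaTransport options shown in the diff, the same value could be generated with Python's standard library instead. The variable names (`transport`, `transport_json`, `env_line`) are illustrative and not part of the project, and the credentials are the placeholder values from the documentation.

```python
import json
import shlex

# Same options as the KafkaTransport example in the diff.
# Placeholder values: replace the address and credentials with real ones.
transport = {
    "type": "kafka",
    "topic": "input.runs",
    "config": {
        "bootstrap.servers": "localhost:9093",
        "security.protocol": "SASL_PLAINTEXT",
        "sasl.mechanism": "SCRAM-SHA-256",
        "sasl.username": "data_rentgen",
        "sasl.password": "changeme",
        "compression.type": "zstd",
        "acks": "all",
    },
    "flush": True,
}

# json.dumps produces a single line by default and renders True as JSON `true`.
transport_json = json.dumps(transport)

# shlex.quote wraps the value in single quotes for a POSIX shell.
env_line = f"AIRFLOW__OPENLINEAGE__TRANSPORT={shlex.quote(transport_json)}"
print(env_line)
```

The printed line can be pasted into a shell script. For files that do not go through a shell, the unquoted `transport_json` value is probably the safer thing to copy, which matches the note in the diff about removing single quotes.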