README.rst (5 additions & 4 deletions)
@@ -41,16 +41,17 @@ Goals
 * Collect lineage events produced by OpenLineage clients & integrations (Spark, Airflow).
 * Support consuming large amounts of lineage events by using Kafka as an event buffer and storing data in tables partitioned by event timestamp.
 * Store operation-grained events (instead of job-grained ones, as in `Marquez <https://marquezproject.ai/>`_) for more detailed lineage.
-* Provide API for building run ↔ dataset lineage, as well as parent run → children run lineage.
-* Ability to build lineage graph with specific time boundaries (unlike Marquez, where lineage is built only for the last job run).
-* Ability to build lineage graph with different granularity, e.g. merge all individual Spark operations into Spark applicationId or Spark applicationName.
+* Provide API for fetching run ↔ dataset lineage.
+* Allow building lineage graph with specific time boundaries (unlike Marquez, where lineage is built only for the last job run).
+* Allow building lineage graph with different granularity, e.g. merge all individual Spark operations into Spark applicationId or Spark applicationName.
+* Include column-level lineage in the lineage graph.
 
 Non-goals
 ---------
 
 * This is **not** a Data Catalog. Use `Datahub <https://datahubproject.io/>`_ or `OpenMetadata <https://open-metadata.org/>`_ instead.
 * Static Data Lineage like view → table is not supported.
-* Currently column-level lineage is collected by OpenLineage, but not yet consumed by Data.Rentgen.
+* Job/run/operation are always a part of the lineage graph. Hiding them to produce dataset → dataset lineage is not supported for now.