feat(glue): add JDBC upstream lineage for Glue jobs #16505
alokr-dhub merged 24 commits into master
Linear: ING-1866 Thanks for your contribution! We have created an internal ticket to track this PR. A member of the core DataHub team will be assigned to review it within the next few business days - you will get a follow-up comment once a reviewer is assigned.
Linear: ING-1869
Your PR has been assigned to @treff7es (tamas) for review (ING-1866).
Marking this as draft for now to cover any upcoming edge cases.
Linear: ING-1932
…eam-jdbc-connectors
Summary
Glue jobs that read from or write to JDBC sources (Postgres, MySQL, MariaDB, Redshift, Oracle, SQL Server) now produce lineage edges in DataHub. Previously these nodes fell through to the "unsupported connector" path and were silently skipped.
Added JDBC_PLATFORM_MAP to map JDBC protocol names to DataHub platform names, and JDBC_DEFAULT_SCHEMA to inject the correct default schema (public for Postgres/Redshift, dbo for SQL Server) when dbtable has no schema prefix.
The dataset name inside the URN is constructed as database.schema.table to match what the native source connectors (e.g. postgres, mysql) produce, enabling lineage stitching without additional configuration.
No new dataset MCEs are emitted for JDBC nodes — the datasets are expected to already exist from a separate source connector ingestion run.
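A minimal sketch of the mapping and URN construction described in this summary, for readers who want to see the shape of the logic. It is not the code from this PR: the dictionary contents, the helper names `jdbc_protocol` and `jdbc_dataset_urn`, the `mssql` platform name for SQL Server, and the choice to leave `dbtable` unchanged for platforms without a default schema are all assumptions made for illustration. The URN string is assembled by hand so the example has no DataHub dependency.

```python
from typing import Optional

# Hypothetical contents: the actual maps in the PR may differ.
# Keys are JDBC protocol names, values are DataHub platform names.
JDBC_PLATFORM_MAP = {
    "postgresql": "postgres",
    "mysql": "mysql",
    "mariadb": "mariadb",
    "redshift": "redshift",
    "oracle": "oracle",
    "sqlserver": "mssql",  # assumed DataHub platform name for SQL Server
}

# Default schema injected when dbtable has no schema prefix.
JDBC_DEFAULT_SCHEMA = {
    "postgres": "public",
    "redshift": "public",
    "mssql": "dbo",
}


def jdbc_protocol(url: str) -> Optional[str]:
    """Return the protocol segment of a 'jdbc:<protocol>:...' URL, or None."""
    parts = url.split(":")
    if len(parts) < 3 or parts[0] != "jdbc":
        return None
    return parts[1].lower()


def jdbc_dataset_urn(
    url: str, database: str, dbtable: str, env: str = "PROD"
) -> Optional[str]:
    """Build a dataset URN for a JDBC node, or None for an unmapped protocol."""
    protocol = jdbc_protocol(url)
    platform = JDBC_PLATFORM_MAP.get(protocol) if protocol else None
    if platform is None:
        return None  # caller falls back to the unsupported-connector path

    # Inject the default schema only when dbtable has no schema prefix.
    default_schema = JDBC_DEFAULT_SCHEMA.get(platform)
    if default_schema and "." not in dbtable:
        dbtable = f"{default_schema}.{dbtable}"

    name = f"{database}.{dbtable}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"
```

Under these assumptions, `jdbc_dataset_urn("jdbc:postgresql://host:5432/sales", "sales", "orders")` yields a URN whose name part is `sales.public.orders`, while a `dbtable` that already carries a schema, such as `analytics.orders`, is left untouched.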