feat(glue): add JDBC upstream lineage for Glue jobs #16505

Merged
alokr-dhub merged 24 commits into master from feature/support-glue-job-lineage-for-upstream-jdbc-connectors
Apr 1, 2026

Conversation

@alokr-dhub
Contributor

@alokr-dhub alokr-dhub commented Mar 10, 2026

Summary

Glue jobs that read from or write to JDBC sources (Postgres, MySQL, MariaDB, Redshift, Oracle, SQL Server) now produce lineage edges in DataHub. Previously these nodes fell through to the "unsupported connector" path and were silently skipped.
Added JDBC_PLATFORM_MAP to map JDBC protocol names to DataHub platform names, and JDBC_DEFAULT_SCHEMA to inject the correct default schema (public for Postgres/Redshift, dbo for SQL Server) when dbtable has no schema prefix.
The dataset URN is constructed as database.schema.table to match what the native source connectors (e.g. postgres, mysql) produce, enabling lineage stitching without additional configuration.
No new dataset MCEs are emitted for JDBC nodes — the datasets are expected to already exist from a separate source connector ingestion run.
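
The naming rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the names JDBC_PLATFORM_MAP and JDBC_DEFAULT_SCHEMA come from the summary, while the map contents, the helper `jdbc_dataset_urn`, and its parameters are assumptions made for the example.

```python
from typing import Optional

# Assumed contents: the PR names these maps but this thread does not show them.
JDBC_PLATFORM_MAP = {
    "postgresql": "postgres",
    "mysql": "mysql",
    "mariadb": "mariadb",
    "redshift": "redshift",
    "oracle": "oracle",
    "sqlserver": "mssql",
}

# Default schema to inject when dbtable has no schema prefix.
JDBC_DEFAULT_SCHEMA = {
    "postgres": "public",
    "redshift": "public",
    "mssql": "dbo",
}


def jdbc_dataset_urn(
    protocol: str, database: str, dbtable: str, env: str = "PROD"
) -> Optional[str]:
    """Hypothetical helper: build a dataset URN for a JDBC node.

    Returns None for protocols that are not in JDBC_PLATFORM_MAP.
    """
    platform = JDBC_PLATFORM_MAP.get(protocol.lower())
    if platform is None:
        return None

    # Inject the platform's default schema only when dbtable is unqualified.
    default_schema = JDBC_DEFAULT_SCHEMA.get(platform)
    if default_schema and "." not in dbtable:
        dbtable = f"{default_schema}.{dbtable}"

    # database.schema.table (or database.table where no default schema applies)
    name = f"{database}.{dbtable}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


print(jdbc_dataset_urn("postgresql", "sales", "orders"))
print(jdbc_dataset_urn("sqlserver", "crm", "dbo.accounts"))
```

Because the resulting name is meant to match what the native source connector emits, lineage stitches only if the Glue-side URN and the connector-side URN agree exactly, including the environment.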

  • The PR conforms to DataHub's Contributing Guideline (particularly PR Title Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the ingestion label (PR or Issue related to the ingestion of metadata) Mar 10, 2026
@codecov

codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 85.81560% with 20 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ingestion/src/datahub/ingestion/source/aws/glue.py | 85.81% | 20 Missing ⚠️ |

@rajatoss
Member

rajatoss commented Mar 10, 2026

Connector Tests Results

All connector tests passed for commit b215298

To skip connector tests, add the skip-connector-tests label (org members only).

Autogenerated by the connector-tests CI pipeline.

@alokr-dhub alokr-dhub marked this pull request as ready for review March 10, 2026 07:11
@maggiehays maggiehays added the pending-submitter-response label (Issue/request has been reviewed but requires a response from the submitter) Mar 10, 2026
@gabe-lyons
Contributor

Linear: ING-1866

Thanks for your contribution! We have created an internal ticket to track this PR. A member of the core DataHub team will be assigned to review it within the next few business days - you will get a follow-up comment once a reviewer is assigned.

@alokr-dhub alokr-dhub marked this pull request as draft March 10, 2026 17:14
@alokr-dhub alokr-dhub marked this pull request as ready for review March 10, 2026 17:14
@github-actions
Contributor

Linear: ING-1869

Thanks for your contribution! We have created an internal ticket to track this PR. A member of the core DataHub team will be assigned to review it within the next few business days - you will get a follow-up comment once a reviewer is assigned.

@github-actions github-actions bot requested a review from treff7es March 10, 2026 20:11
@github-actions
Contributor

Your PR has been assigned to @treff7es (tamas) for review (ING-1866).

@alokr-dhub
Contributor Author

Marking this as draft for now in case any edge cases come up.

@github-actions
Contributor

Linear: ING-1932

@alwaysmeticulous

alwaysmeticulous bot commented Mar 24, 2026

🔴 Meticulous spotted visual differences in 35 of 1809 screens tested: view and approve differences detected.

Meticulous evaluated ~8 hours of user flows against your PR.

Last updated for commit bc78c5a fix: review comments. This comment will update as new commits are pushed.

@codecov

codecov bot commented Mar 24, 2026

Bundle Report

Changes will increase total bundle size by 17.67kB (0.08%) ⬆️. This is within the configured threshold ✅

Detailed changes
| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 22.7MB | 17.67kB (0.08%) ⬆️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | 4.54kB | 12.45MB | 0.04% |
| assets/fabriclogo-*.svg (New) | 8.86kB | 8.86kB | 100.0% 🚀 |
| assets/fabricdatafactorylogo-*.svg (New) | 4.27kB | 4.27kB | 100.0% 🚀 |

@maggiehays maggiehays added the pending-submitter-response label (Issue/request has been reviewed but requires a response from the submitter) and removed the needs-review label (Label for PRs that need review from a maintainer) Mar 27, 2026
@maggiehays maggiehays added the pending-submitter-merge label and removed the pending-submitter-response label (Issue/request has been reviewed but requires a response from the submitter) Mar 30, 2026
@alokr-dhub alokr-dhub merged commit b8bba2f into master Apr 1, 2026
72 checks passed
@alokr-dhub alokr-dhub deleted the feature/support-glue-job-lineage-for-upstream-jdbc-connectors branch April 1, 2026 08:28
Labels

ingestion (PR or Issue related to the ingestion of metadata), pending-submitter-merge

6 participants