DataHub v1.3.0
Release Highlights
DataHub v1.3.0 is packed with exciting updates, including:
- UI Enhancements: Introducing a customizable home page that allows flexible landing page configuration, along with customizable entity summary tabs that give users direct control over which information appears in the summary view.
- Ingestion Connectors: Added support for Google BigQuery in Tableau ingestion and introduced Excel and Snaplogic ingestion sources. Also enhanced column lineage extraction for Looker and added support for dynamic tables in Snowflake ingestion.
- SDK Features: Exposed view definitions in the dataset SDK for better usability and added support for fine-grained lineage patching.
- Platform Improvements: Enhanced lineage processing for performance and scalability, added support for Metadata Change Logs in DataHub Cloud Events, and introduced an OIDC OAuth authenticator for service account access.
Changelog
User Interface & Experience
This release includes significant improvements to the user interface and user experience:
Custom Home Page
The customizable home page lets organization admins configure the landing page experience for their users by pinning custom lists of assets, documentation, and links that guide users to the right data from the start. End users can refine these defaults by personalizing modules to fit their own workflows and needs.
This feature is disabled by default. You can enable it by setting the following environment variable in the datahub-gms container:
SHOW_HOME_PAGE_REDESIGN=true
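As a minimal sketch, assuming a Docker Compose deployment in which the GMS service is named datahub-gms (the service key and file layout here are assumptions; Helm charts and other deployment methods set container environment variables differently), the flag could be added like this:

```yaml
# Hypothetical docker-compose override fragment; merge it into your own compose file.
services:
  datahub-gms:                           # assumed service name for the datahub-gms container
    environment:
      - SHOW_HOME_PAGE_REDESIGN=true     # enables the customizable home page (off by default)
```

The container has to be recreated for the new environment variable to take effect.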
- Introduces a framework for customizable page templates and modules in the UI. [#14029]
- Introduces a new hierarchy module for the custom home page. [#14143]
- Introduces a new Asset Collection module on the home page. [#14050]
- Added support for Quick Link modules on the new home page. [#14141]
Customizable Entity Summary Tabs
Customizable Summary Tabs give asset owners direct control over what appears on summary tabs for Domains, Glossary Terms, and Data Products. Asset owners can now pin specific properties, documentation, and related assets to create tailored summary views for these key organizational assets.
- Added a new summary tab to the glossary node entity. [#14541]
- Introduced inline documentation editor for the summary tab. [#14514]
- Added new summary tab for related entities in the UI. [#14463]
- Added new link functionalities in the summary tab. [#14540]
Performance and Navigation Improvements
- Show context paths for Data Products. [#14802]
- Improved domain state management for better scalability with many domains. [#14478]
Misc
- Replaced Home Page product tour with a new Welcome modal. [#14299]
- Added deprecation banner for the V1 UI on the home page. [#14131]
- Fixed bug in merging owners with multiple ownership types. [#14543]
- Introduced access management tab for containers in the UI. [#14122]
- Introduced stats tab v2 for OSS with new UI features. [#13431]
- Added filter for status on run history in the UI. [#14851]
- Included dataProcessInstance in policies UI. [#14880]
Metadata Ingestion
We're continuously improving our integrations to add new capabilities and squash bugs.
New Sources
- Added new Excel ingestion source for transactional and analytical use. [#13261]
- Added Snaplogic as a new source for metadata ingestion. [#14231]
Existing Sources
- BigQuery: Added support for Google BigQuery connection type in Tableau. [#14080]
- Databricks: Fixed quoting issues for catalog/schema/table names in Databricks ingestion. [#14203]
- Iceberg:
- Looker: Enhanced column lineage extraction for Looker. [#14826]
- MySQL: Added support for stored procedures in MySQL ingestion. [#14274]
- Postgres: Added support for stored procedures in Postgres ingestion. [#14102]
- PowerBI: Added ODBC SQL query parsing support for PowerBI ingestion. [#13752]
- S3: Extended file sink to support writing files to S3. [#14160]
- Snowflake: Added support for dynamic tables in Snowflake ingestion.
- Unity: Added support for MLModel and MLModel version in Unity Catalog connector. [#14594]
Misc. Ingestion Improvements
- Removed support for Pydantic v1, migrated to Pydantic v2. [#14014]
- Added support for incremental lineage in SQL queries. [#14548]
- Added support for slice on OpenAPI scroll APIs for OpenSearch. [#14510]
- Improved metadata handling in DataHub ingestion source. [#14643]
- Introduced a new browsePathsV2 transformer for multiple entity types. [#14825]
DataHub Python SDK
Improvements and new features for the DataHub SDK:
- Refactored dataset lineage patch with reusable code and added smoke tests. [#14377]
- Exposed view definitions in the dataset SDK for better usability. [#14197]
- Added container support for charts and dashboards in the SDK. [#14641]
- Fixed handling of null platform in SDK v2 lineage. [#14784]
Platform & Backend
Platform improvements and backend enhancements:
- Added PDL annotations for upstreams in ingestion completion coverage. [#14241]
- Made usage fields searchable for ingestion validation. [#14400]
- Improved API error handling for better response clarity. [#14795]
- Refactored Kafka topic management and setup process. [#14564]
- Added support for MCL generation from a Change Data Capture source (Debezium). [#14824]
- Enhanced lineage processing system for performance and scalability. [#14609]
- Added base path support for improved routing. [#14866]
- Added support for Metadata Change Log in DataHub Cloud Events Source. [#14497]
- Added OIDC OAuth authenticator for service account access to DataHub. [#14707]
- Added filter support in the /scroll API for OpenAPI v3. [#14524]
Documentation
Documentation updates and improvements:
- Added search examples for lineage and usage filters. [#14487]
- Updated Okta SCIM integration documentation. [#14301]
- Added README for schema field documentation propagation action. [#14180]
Thank You to Our Contributors!
First-Time Contributors
@EmmetAVS @MaciekRakowski @SalimAbdul-snaplogic @SaravanaKumarG-1365462 @abdullahtariqq @andrewsrajasekar @hector-stratebi @kammillam @pezik1 @rolands-kundzins-sw @serragnoli @simoncjl @tulikabhatt @vinayakhulawale @vpipkt @Anshul759
Repeat Contributors
@bmaquet @kartikey-visa @kevin1chun @kyungryun @ligfx @mihai103 @mminichino @purnimagarg1 @relaxedboi @sleeperdeep @tkdrahn @zhixuanjia
Project Maintainers
@RyanHolstien @abedatahub @acrylJonny @alexsku @annadoesdesign @anshbansal @asikowitz @askumar27 @benjiaming @brock-acryl @chakru-r @chriscollins3456 @david-leifker @deepgarg760 @esteban @gabe-lyons @hsheth2 @jayacryl @jjoyce0510 @kevinkarchacryl @maggiehays @mayurinehate @pedro93 @sakethvarma397 @sgomezvillamor @shirshanka @skrydal @treff7es @yoonhyejin @AdrianMachado @NehaGslab @grayayer @jatherley @JohnRTurner @v-tarasevich-blitz-brain
View the full changelog:
👉 https://github.com/datahub-project/datahub/releases/tag/v1.3.0