v1.3.0

@RyanHolstien RyanHolstien released this 07 Oct 20:44

DataHub v1.3.0

Release Highlights

DataHub v1.3.0 is packed with exciting updates, including:

  • UI Enhancements: Introducing a customizable home page that allows flexible landing page configuration, along with customizable entity summary tabs that give users direct control over which information appears in the summary view.
  • Ingestion Connectors: Added support for Google BigQuery in Tableau ingestion and introduced Excel and SnapLogic ingestion sources. Enhanced column lineage extraction for Looker and added support for dynamic tables in Snowflake ingestion.
  • SDK Features: Exposed view definitions in the dataset SDK for better usability and added support for fine-grained lineage patching.
  • Platform Improvements: Enhanced lineage processing for performance and scalability, added support for Metadata Change Logs in DataHub Cloud Events, and introduced an OIDC OAuth authenticator for service account access.

Changelog

User Interface & Experience

This release includes significant improvements to the user interface and user experience:

Custom Home Page

The customizable home page lets organization admins configure the landing page experience for their users by pinning custom lists of assets, documentation, and links that guide users to the right data from the start. End users can refine these defaults by personalizing modules to fit their own workflows and needs.

This feature is disabled by default. You can enable it by setting the following environment variable in the datahub-gms container:
SHOW_HOME_PAGE_REDESIGN=true
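
Before rolling this out, it can help to confirm that the flag is actually present in the environment handed to the container. The following is a minimal Python sketch of such a check, not part of DataHub itself: the helper names `parse_env_text` and `home_page_redesign_enabled` are hypothetical, the sample env text is illustrative, and treating only the value `true` (case-insensitive) as "enabled" is an assumption rather than documented DataHub behavior.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines (dotenv style) into a dict.

    Hypothetical helper for illustration; blank lines, comment lines,
    and lines without '=' are skipped.
    """
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def home_page_redesign_enabled(env: Dict[str, str]) -> bool:
    """Return True when SHOW_HOME_PAGE_REDESIGN is set to 'true'.

    Assumption: only 'true' (any letter case) counts as enabled; a missing
    variable is treated as disabled, matching the documented default.
    """
    return env.get("SHOW_HOME_PAGE_REDESIGN", "false").lower() == "true"


# Illustrative env text for the datahub-gms container.
sample = """
# datahub-gms environment (illustrative)
SHOW_HOME_PAGE_REDESIGN=true
"""

if __name__ == "__main__":
    print(home_page_redesign_enabled(parse_env_text(sample)))
```

The same check could be pointed at the contents of whatever env file or deployment values are used for datahub-gms; how the variable is injected (Docker Compose, Helm, or otherwise) depends on the deployment and is not specified here.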

  • Introduced a framework for customizable page templates and modules in the UI. [#14029]
  • Introduced a new hierarchy module for the custom home page. [#14143]
  • Introduced a new Asset Collection module on the home page. [#14050]
  • Added support for Quick Link modules on the new home page. [#14141]

Customizable Entity Summary Tabs

Customizable Summary Tabs give asset owners direct control over what appears on summary tabs for Domains, Glossary Terms, and Data Products. Asset owners can now pin specific properties, documentation, and related assets to create tailored summary views for these key organizational assets.

  • Added a new summary tab to the glossary node entity. [#14541]
  • Introduced inline documentation editor for the summary tab. [#14514]
  • Added new summary tab for related entities in the UI. [#14463]
  • Added new link functionalities in the summary tab. [#14540]

Performance and Navigation Improvements

  • Data Products now show context paths. [#14802]
  • Improved domain state management for better scalability with many domains. [#14478]

Misc

  • Replaced Home Page product tour with a new Welcome modal. [#14299]
  • Added deprecation banner for the V1 UI on the home page. [#14131]
  • Fixed a bug in merging owners with multiple ownership types. [#14543]
  • Introduced access management tab for containers in the UI. [#14122]
  • Introduced stats tab v2 for OSS with new UI features. [#13431]
  • Added filter for status on run history in the UI. [#14851]
  • Included dataProcessInstance in policies UI. [#14880]

Metadata Ingestion

We're continuously improving our integrations to add new capabilities and squash bugs.

New Sources

  • Added new Excel ingestion source for transactional and analytical use. [#13261]
  • Added SnapLogic as a new source for metadata ingestion. [#14231]

Existing Sources

  • BigQuery: Added support for Google BigQuery connection type in Tableau. [#14080]
  • Databricks: Fixed quoting issues for catalog/schema/table names in Databricks ingestion. [#14203]
  • Iceberg:
    • Improved error handling for Iceberg source processing. [#14731]
    • Included explicit extras in dependencies for AWS Glue support. [#14766]
    • Added reporting for partition key field info in Iceberg Rest Catalog. [#14332]
  • Looker: Enhanced column lineage extraction for Looker. [#14826]
  • MySQL: Added support for stored procedures in MySQL ingestion. [#14274]
  • Postgres: Added support for stored procedures in Postgres integration. [#14102]
  • PowerBI: Added ODBC SQL query parsing support for PowerBI ingestion. [#13752]
  • S3: Extended file sink to support writing files to S3. [#14160]
  • Snowflake:
    • Added formal support for dynamic tables in Snowflake ingestion. [#13542]
    • Added new pushdown_allow_usernames config for Snowflake connector. [#14428]
    • Added support for Snowflake China region (cn-northwest-1). [#14434]
    • Fixed schema extraction for views in Snowflake under specific conditions. [#14601]
  • Unity: Added support for MLModel and MLModel version in Unity Catalog connector. [#14594]

Misc. Ingestion Improvements

  • Removed support for Pydantic v1 and migrated to Pydantic v2. [#14014]
  • Added support for incremental lineage in SQL queries. [#14548]
  • Added support for slice on OpenAPI scroll APIs for OpenSearch. [#14510]
  • Improved metadata handling in DataHub ingestion source. [#14643]
  • Introduced new browsePathsV2 transformer for multiple entity types. [#14825]

DataHub Python SDK

Improvements and new features for the DataHub SDK:

  • Refactored dataset lineage patch with reusable code and added smoke tests. [#14377]
  • Exposed view definitions in the dataset SDK for better usability. [#14197]
  • Added container support for charts and dashboards in the SDK. [#14641]
  • Fixed handling of null platform in SDK v2 lineage. [#14784]

Platform & Backend

Platform improvements and backend enhancements:

  • Added PDL annotations for upstreams in ingestion completion coverage. [#14241]
  • Made usage fields searchable for ingestion validation. [#14400]
  • Improved API error handling for better response clarity. [#14795]
  • Refactored Kafka topic management and setup process. [#14564]
  • Added support for MCL generation from a Change Data Capture source (Debezium). [#14824]
  • Enhanced lineage processing system for performance and scalability. [#14609]
  • Added base path support for improved routing. [#14866]
  • Added support for Metadata Change Log in DataHub Cloud Events Source. [#14497]
  • Added OIDC OAuth authenticator for service account access to DataHub. [#14707]
  • Added filter support in /scroll API for OpenAPI v3. [#14524]

Documentation

Documentation updates and improvements:

  • Added search examples for lineage and usage filters. [#14487]
  • Updated Okta SCIM integration documentation. [#14301]
  • Added README for schema field documentation propagation action. [#14180]

Thank You to Our Contributors!

First-Time Contributors

@EmmetAVS @MaciekRakowski @SalimAbdul-snaplogic @SaravanaKumarG-1365462 @abdullahtariqq @andrewsrajasekar @hector-stratebi @kammillam @pezik1 @rolands-kundzins-sw @serragnoli @simoncjl @tulikabhatt @vinayakhulawale @vpipkt @Anshul759

Repeat Contributors

@bmaquet @kartikey-visa @kevin1chun @kyungryun @ligfx @mihai103 @mminichino @purnimagarg1 @relaxedboi @sleeperdeep @tkdrahn @zhixuanjia

Project Maintainers

@RyanHolstien @abedatahub @acrylJonny @alexsku @annadoesdesign @anshbansal @asikowitz @askumar27 @benjiaming @brock-acryl @chakru-r @chriscollins3456 @david-leifker @deepgarg760 @esteban @gabe-lyons @hsheth2 @jayacryl @jjoyce0510 @kevinkarchacryl @maggiehays @mayurinehate @pedro93 @sakethvarma397 @sgomezvillamor @shirshanka @skrydal @treff7es @yoonhyejin @AdrianMachado @NehaGslab @grayayer @jatherley @JohnRTurner @v-tarasevich-blitz-brain


View the full changelog:
👉 https://github.com/datahub-project/datahub/releases/tag/v1.3.0