[Feature] Support Data Profiling in dbt #543

@syou6162

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-bigquery functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Dataplex data profiling lets you identify common statistical characteristics of the columns in your BigQuery tables. This information helps you to understand and analyze your data more effectively.

Data profiling can be configured from the GUI or the API, but for a model with materialized='table' the profiling settings are deleted on every run, because dbt recreates the table. If dbt could configure data profiling after the table is created, the feature would be much easier for dbt users to adopt.
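To make the request concrete, here is a minimal sketch of the API call that dbt would need to issue after (re)creating a table. It only builds the request and sends nothing. The endpoint and the field names (`dataScans`, `dataScanId`, `dataProfileSpec`, `samplingPercent`, `executionSpec`) follow my understanding of the Dataplex data scan REST API and should be checked against the official reference. The helper names and the scan ID naming scheme are hypothetical, and how dbt would trigger this (adapter code, a post-hook, or something else) is left open.

```python
import json


def bigquery_resource(project: str, dataset: str, table: str) -> str:
    """Full resource name that Dataplex uses to point at a BigQuery table."""
    return (
        f"//bigquery.googleapis.com/projects/{project}"
        f"/datasets/{dataset}/tables/{table}"
    )


def profile_scan_request(
    project: str,
    location: str,
    dataset: str,
    table: str,
    sampling_percent: float = 100.0,
) -> dict:
    """Build (but do not send) a 'create data profile scan' request.

    Hypothetical helper: the scan ID scheme below is an assumption made
    for illustration, not something dbt or Dataplex prescribes.
    """
    # Assumption: scan IDs allow lowercase letters, digits and hyphens,
    # so underscores in dataset/table names are replaced.
    scan_id = f"{dataset}-{table}-profile".replace("_", "-").lower()
    return {
        "url": (
            "https://dataplex.googleapis.com/v1/"
            f"projects/{project}/locations/{location}/dataScans"
        ),
        "params": {"dataScanId": scan_id},
        "body": {
            "data": {"resource": bigquery_resource(project, dataset, table)},
            "dataProfileSpec": {"samplingPercent": sampling_percent},
            # Run only when explicitly triggered, e.g. after a dbt run.
            "executionSpec": {"trigger": {"onDemand": {}}},
        },
    }


if __name__ == "__main__":
    request = profile_scan_request(
        "my-project", "us-central1", "analytics", "daily_orders"
    )
    print(json.dumps(request, indent=2))
```

Because the scan is attached to the table, a `materialized='table'` model would need this request re-issued after each rebuild, which is the step this feature request asks dbt to handle.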

Describe alternatives you've considered

No response

Who will this benefit?

People who use BigQuery tables built with dbt. I think this will be especially useful for data users such as analysts and business developers, who could see the statistics for each column without having to write a query.

Are you interested in contributing this feature?

Yes, very much! I'm interested in contributing to dbt, so I plan to send a pull request soon. I think I can implement this by following the existing implementation of BigQuery policy tag support.

Anything else?

No response
