Description
Is this your first time submitting a feature request?
- I have read the expectations for open source contributors
- I have searched the existing issues, and I could not find an existing issue for this feature
- I am requesting a straightforward extension of existing dbt-bigquery functionality, rather than a Big Idea better suited to a discussion
Describe the feature
make_temp_relation will always use the schema of the target table to create the __dbt_tmp table. Currently there's no way to change this.
To keep production datasets clean, it would be ideal to have the option to specify a different dataset for temporary tables.
Additionally, in BigQuery, storage can be billed on either physical or logical bytes, but the billing model can only be set at the dataset level. Depending on the nature of the table, one model or the other may be cheaper. In some cases it would be beneficial to keep the __dbt_tmp table in a dataset billed on logical bytes while the target dataset stays on physical billing. Achieving this requires two different datasets.
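For context, a minimal sketch of what the two-dataset setup could look like on the BigQuery side. The dataset names `analytics` and `analytics_tmp` are hypothetical and only serve as an illustration; `storage_billing_model` is the dataset-level option that selects physical or logical billing.

```sql
-- Hypothetical dataset names; replace with your own.
-- Target dataset billed on physical (compressed) bytes.
ALTER SCHEMA analytics
SET OPTIONS (storage_billing_model = 'PHYSICAL');

-- Separate dataset for __dbt_tmp tables, billed on logical bytes.
CREATE SCHEMA IF NOT EXISTS analytics_tmp
OPTIONS (storage_billing_model = 'LOGICAL');
```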
Without touching make_temp_relation in dbt-adapters (see the alternative below), one option is to extend incremental.sql with an extra config variable, something like the following:
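{%- set temp_schema = config.get('temp_schema') -%}
{#- Default to the target relation so the temp table stays in the target schema when temp_schema is unset -#}
{%- set temp_base_relation = this -%}
{%- if temp_schema is not none -%}
  {%- set temp_base_relation = this.incorporate(path={"schema": temp_schema}) -%}
  {%- do create_schema(temp_base_relation) -%}
{%- endif -%}
{%- set tmp_relation = make_temp_relation(temp_base_relation) -%}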
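On the model side, usage might look roughly like the sketch below. Note that `temp_schema` is the config key proposed in this issue, not an existing dbt-bigquery option, and the dataset name `dbt_tmp`, the `id` key, and the `stg_events` model are made up for illustration.

```sql
{{
  config(
    materialized='incremental',
    unique_key='id',
    temp_schema='dbt_tmp'
  )
}}

-- With the proposed change, the <model>__dbt_tmp table would be built
-- in the dbt_tmp dataset instead of the target dataset.
select id, updated_at
from {{ ref('stg_events') }}
{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

If `temp_schema` is left unset, `config.get('temp_schema')` returns none and the temp table is created next to the target table, as it is today.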
Describe alternatives you've considered
We could make this change in dbt-adapters instead, extending make_temp_relation with an extra temp_schema parameter. Something like:
{% macro make_temp_relation(base_relation, suffix='__dbt_tmp', temp_schema=none, temp_schema_suffix=none) %}
  {%- set temp_identifier = base_relation.identifier ~ suffix -%}
  {%- set temp_relation = base_relation.incorporate(
      path={"identifier": temp_identifier}) -%}
  {%- if temp_schema is not none -%}
    {#- An explicit temp schema takes precedence -#}
    {%- set temp_relation = temp_relation.incorporate(path={
        "schema": temp_schema
    }) -%}
    {%- do create_schema(temp_relation) -%}
  {%- elif temp_schema_suffix is not none -%}
    {#- Otherwise derive the temp schema from the target schema plus temp_schema_suffix -#}
    {%- set temp_schema = base_relation.schema ~ temp_schema_suffix -%}
    {%- set temp_relation = temp_relation.incorporate(path={
        "schema": temp_schema
    }) -%}
    {%- do create_schema(temp_relation) -%}
  {%- endif -%}
  {{ return(temp_relation) }}
{% endmacro %}
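With a signature like the one above, the call site in the incremental materialization could stay close to what it is today. This is only a sketch: the `temp_schema` keyword argument and config key are the ones proposed here and do not exist in dbt-adapters at the moment.

```sql
{#- Hypothetical call site inside the incremental materialization -#}
{#- config.get returns none when temp_schema is not set, which keeps the current behaviour -#}
{%- set tmp_relation = make_temp_relation(this, temp_schema=config.get('temp_schema')) -%}
```

Keeping `temp_schema` optional with a default of none means existing callers of make_temp_relation would not need to change.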
Who will this benefit?
BigQuery users who want to optimise storage costs and keep their datasets clean.
Are you interested in contributing this feature?
Yes
Anything else?
No response