Is this your first time submitting a feature request?
Describe the feature
So far there are a few places where dbt-bigquery accesses partition information:
- In the incremental materialization, `_dbt_max_partition` is declared (when used in a model) and stores the max partition
- In the `get_partitions_metadata` macro, which returns the partition information for an input model
- In the incremental `insert_overwrite` materialization, before the MERGE statement with dynamic partitions, to store the partitions to replace in `dbt_partitions_for_replacement` and use them in the predicate
- In the incremental `insert_overwrite` materialization, before the copy-partitions statement with dynamic partitions, to iterate over the partitions to replace
Allowing the use of INFORMATION_SCHEMA data in these places would reduce the amount of data scanned.
More details are available on https://cloud.google.com/bigquery/docs/information-schema-partitions.
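As a sketch of the kind of query the new approach could rely on (column names come from the BigQuery documentation linked above; the project, dataset, and table names are placeholders):

```sql
-- Fetch partition metadata from INFORMATION_SCHEMA.PARTITIONS instead of
-- scanning the model's table itself. `my_project`, `my_dataset`, and
-- 'my_table' are placeholders for illustration.
SELECT
  partition_id,
  total_rows,
  last_modified_time
FROM `my_project.my_dataset.INFORMATION_SCHEMA.PARTITIONS`
WHERE table_name = 'my_table'
  AND partition_id IS NOT NULL
```

Because this reads metadata rather than table data, the scan cost is negligible compared to a `SELECT MAX(partition_column)` over a large table.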
As this table is still in Preview, it's likely safer to provide it as an alternative to the current method.
To opt into the new method, a `partition_information` setting would be added to the `partition_by` config block: with no value, the default is the current behavior (equivalent to an explicit value of `"model"`), while the value `"information_schema"` enables the new INFORMATION_SCHEMA-based approach.
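A sketch of what the proposed config could look like in a model file (the `partition_information` key is the proposal here, not an existing dbt-bigquery option; the other keys are standard `partition_by` settings):

```sql
-- `partition_information` is the proposed new key: "model" would keep
-- today's behavior, "information_schema" would use the metadata approach.
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = {
      "field": "created_at",
      "data_type": "timestamp",
      "granularity": "day",
      "partition_information": "information_schema"
    }
) }}

select * from {{ ref('events') }}
```

Keeping the setting inside `partition_by` means existing macro signatures are untouched and the behavior stays a per-model choice.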
This issue is similar to #286 but intends to rely on existing macros and solutions, replacing the under-the-hood accesses to improve overall performance.
Describe alternatives you've considered
We could try to set the config at the connection level (profiles.yml), but doing so would change the API for existing macros and wouldn't be as convenient as accessing the model's config block.
Who will this benefit?
Anyone using the incremental strategy with fairly large partitions, as the current approach can be costly and slightly slow compared to the metadata approach.
Are you interested in contributing this feature?
yes
Anything else?
No response