-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Description
Is this a new bug in dbt-core?
- I believe this is a new bug in dbt-core
- I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
I can see some other bugs around snapshots and hard_deletes:new_record but I can't find this specific issue:
I'm adding a new column to an ephemeral model that is feeding a yml defined snpashot. (where the snapshot has already been run).
On running the snapshot, it fails with a type mismatch error in the UNION ALL
Database Error in snapshot snapshot_monzo_change_log_change_id
Column 4 in UNION ALL has incompatible types: STRING, STRING, STRING, INT64 at [238:5]
It looks like this occurs du to the deletion_records CTE:
deletion_records as (
select 'insert' as dbt_change_type,
snapshotted_data.airtable_id,
snapshotted_data.change_id,
NULL as proposal_name, -- New column doesn't exist in existing table
snapshotted_data.dbt_unique_key as dbt_unique_key,
...
)
Expected Behavior
Expected that the new column will be added to the snapshot.
I suspect that the type of the column needs to be inferred from the newly added column and the CTE look something like
deletion_records as (
select 'insert' as dbt_change_type,
snapshotted_data.airtable_id,
snapshotted_data.change_id,
CAST(NULL AS STRING) as proposal_name, -- New column doesn't exist in existing table
snapshotted_data.dbt_unique_key as dbt_unique_key,
...
)
Possibly this is a bigquery specific issue, but I'm not totally sure.
Steps To Reproduce
Create ephemeral model:
{{
config(materialized="ephemeral")
}}
SELECT
id AS airtable_id,
change_id
FROM {{ source('some_source', 'some_table') }}
Create yml snapshot based on model
version: 2
snapshots:
- name: snapshot_test
relation: ref('eph_model')
config:
strategy: check
unique_key: airtable_id
check_cols: all
hard_deletes: new_record
Run the snapshot
dbt snapshot --select snapshot_test
Add new column to the ephemeral model:
SELECT
id AS airtable_id,
change_id,
proposal_name -- New column added
FROM {{ source('some_source', 'some_table') }}
Run the snapshot again, and the above error is seen.
Relevant log output
Database Error in snapshot snapshot_my_table (snapshots/snapshot_my_table.yml)
Column 4 in UNION ALL has incompatible types: STRING, STRING, STRING, INT64 at [238:5]
1 of 1 ERROR snapshotting dev_db.dbt_dev.snapshot_my_table [ERROR in 5.77s]
create or replace table `prod`.`snapshot_my_table__dbt_tmp`
OPTIONS(
expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 12 hour)
)
as (
with snapshot_query as (
with __dbt__cte__eph_my_table as (
SELECT
id AS airtable_id,
change_id,
proposal_name -- NEW COLUMN
FROM `prod`.`my_source_table`
)
select * from __dbt__cte__eph_my_table
),
snapshotted_data as (
select *,
airtable_id as dbt_unique_key
from `prod`.`snapshot_my_table`
where valid_to_timestamp is null
-- NOTE: This table does NOT have the 'proposal_name' column yet
),
insertions_source_data as (
select *,
airtable_id as dbt_unique_key,
current_timestamp() as _meta_insertion_timestamp,
current_timestamp() as valid_from_timestamp,
coalesce(nullif(current_timestamp(), current_timestamp()), null) as valid_to_timestamp,
to_hex(md5(concat(coalesce(cast(airtable_id as string), ''), '|',coalesce(cast(current_timestamp() as string), '')))) as episode_id
from snapshot_query
),
updates_source_data as (
select *,
airtable_id as dbt_unique_key,
current_timestamp() as _meta_insertion_timestamp,
current_timestamp() as valid_from_timestamp,
current_timestamp() as valid_to_timestamp
from snapshot_query
),
deletes_source_data as (
select *,
airtable_id as dbt_unique_key
from snapshot_query
),
insertions as (
select
'insert' as dbt_change_type,
source_data.*,
'False' as is_deleted
from insertions_source_data as source_data
left outer join snapshotted_data
on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where snapshotted_data.dbt_unique_key is null
or (snapshotted_data.dbt_unique_key is not null and (TRUE or snapshotted_data.is_deleted = 'True'))
),
updates as (
select
'update' as dbt_change_type,
source_data.*,
snapshotted_data.episode_id,
snapshotted_data.is_deleted
from updates_source_data as source_data
join snapshotted_data
on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where (TRUE or snapshotted_data.is_deleted = 'True')
),
deletes as (
select
'delete' as dbt_change_type,
source_data.*,
current_timestamp() as valid_from_timestamp,
current_timestamp() as _meta_insertion_timestamp,
current_timestamp() as valid_to_timestamp,
snapshotted_data.episode_id,
snapshotted_data.is_deleted
from snapshotted_data
left join deletes_source_data as source_data
on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where source_data.dbt_unique_key is null
and not (snapshotted_data.is_deleted = 'True' and snapshotted_data.valid_to_timestamp is null)
),
deletion_records as (
select
'insert' as dbt_change_type,
snapshotted_data.`airtable_id`,
snapshotted_data.`change_id`,
NULL as `proposal_name`, -- PROBLEM: Tries to add this column but snapshotted_data doesn't have it
snapshotted_data.dbt_unique_key as dbt_unique_key,
current_timestamp() as valid_from_timestamp,
current_timestamp() as _meta_insertion_timestamp,
snapshotted_data.valid_to_timestamp as valid_to_timestamp,
to_hex(md5(concat(coalesce(cast(episode_id as string), ''), '|',coalesce(cast(current_timestamp() as string), '')))) as episode_id,
'True' as is_deleted
from snapshotted_data
left join deletes_source_data as source_data
on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
where source_data.dbt_unique_key is null
and not (snapshotted_data.is_deleted = 'True' and snapshotted_data.valid_to_timestamp is null)
)
-- LINE 238: UNION ALL fails here due to column type mismatch
select * from insertions
union all
select * from updates
union all
select * from deletes
union all
select * from deletion_records
);Environment
OS: Debian GNU/Linux 12
Python: 3.12.12
dbt-core: 1.10.13
Database adapter: dbt-bigquery 1.10.2Which database adapter are you using with dbt?
bigquery
Additional Context
No response