Fix dbt clone for Snowflake Iceberg tables#1768
Fix dbt clone for Snowflake Iceberg tables#1768jecolvin wants to merge 4 commits intodbt-labs:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the Snowflake adapter’s clone materialization to correctly clone Iceberg source tables by generating Iceberg-specific DDL, and adds a functional test to validate the behavior.
Changes:
- Detect Iceberg source relations during
dbt cloneand emitCREATE OR REPLACE ICEBERG TABLE ... CLONE ...when appropriate. - Ensure
transientis not emitted for Iceberg clones (since Iceberg tables cannot be transient). - Add a functional test that clones an Iceberg table and asserts the cloned relation is also Iceberg.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| dbt-snowflake/src/dbt/include/snowflake/macros/materializations/clone.sql | Conditionally renders Iceberg vs transient table clone DDL based on cached source relation metadata. |
| dbt-snowflake/tests/functional/adapter/dbt_clone/test_dbt_clone.py | Adds a functional test case covering cloning an Iceberg table and validating the cloned relation type. |
| dbt-snowflake/.changes/unreleased/Fixes-20260317-100403.yaml | Adds a changelog entry documenting the Iceberg clone fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
dbt-snowflake/tests/functional/adapter/dbt_clone/test_dbt_clone.py
Outdated
Show resolved
Hide resolved
a98d41d to
c96af09
Compare
|
@colin-rogers-dbt Pushed a fix for the failing integration test! |
There was a problem hiding this comment.
Pull request overview
Updates Snowflake’s dbt clone behavior to support cloning from Iceberg source tables by generating Iceberg-specific DDL and adding a functional regression test.
Changes:
- Detect Iceberg source tables during clone and emit
CREATE OR REPLACE ICEBERG TABLE ... CLONE ...when appropriate. - Omit
transientfor Iceberg clones (since Iceberg tables cannot be transient). - Add a functional test validating that an Iceberg table can be cloned and remains Iceberg-format.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
dbt-snowflake/src/dbt/include/snowflake/macros/materializations/clone.sql |
Adds Iceberg detection and conditional DDL generation for clone operations. |
dbt-snowflake/tests/functional/adapter/dbt_clone/test_dbt_clone.py |
Adds a functional test for cloning an Iceberg table and verifying the cloned relation format. |
dbt-snowflake/.changes/unreleased/Fixes-20260317-100403.yaml |
Adds an unreleased changelog entry documenting the fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
dbt-snowflake/src/dbt/include/snowflake/macros/materializations/clone.sql
Outdated
Show resolved
Hide resolved
dbt-snowflake/tests/functional/adapter/dbt_clone/test_dbt_clone.py
Outdated
Show resolved
Hide resolved
65c84ad to
d048a5c
Compare
|
@colin-rogers-dbt I've addressed the latest round of feedback! |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| {#- Cache miss: query Snowflake directly -#} | ||
| {%- set show_sql -%} | ||
| show tables like '{{ relation.identifier }}' in schema {{ relation.database }}.{{ relation.schema }} | ||
| {%- endset -%} | ||
| {%- set results = run_query(show_sql) -%} | ||
| {%- for row in results -%} | ||
| {%- if row['name'] == relation.identifier -%} | ||
| {{ return(row['is_iceberg'] in ('Y', 'YES')) }} | ||
| {%- endif -%} | ||
| {%- endfor -%} |
There was a problem hiding this comment.
The cache-miss fallback builds SHOW TABLES SQL using raw relation.database/relation.schema and compares row['name'] to relation.identifier case-sensitively. This will mis-detect Iceberg tables when quoting policies or case-sensitive identifiers are used (Snowflake often returns name uppercased), which can cause the clone macro to emit the wrong DDL and fail. Prefer using adapter.get_relation(database=..., schema=..., identifier=...) (or relation.include(identifier=False) / IN SCHEMA IDENTIFIER(...)) and compare names case-insensitively; also escape quotes in the LIKE literal.
| {#- Cache miss: query Snowflake directly -#} | |
| {%- set show_sql -%} | |
| show tables like '{{ relation.identifier }}' in schema {{ relation.database }}.{{ relation.schema }} | |
| {%- endset -%} | |
| {%- set results = run_query(show_sql) -%} | |
| {%- for row in results -%} | |
| {%- if row['name'] == relation.identifier -%} | |
| {{ return(row['is_iceberg'] in ('Y', 'YES')) }} | |
| {%- endif -%} | |
| {%- endfor -%} | |
| {#- Cache miss: query Snowflake directly via adapter.get_relation -#} | |
| {%- set source_relation = adapter.get_relation( | |
| database=relation.database, | |
| schema=relation.schema, | |
| identifier=relation.identifier | |
| ) -%} | |
| {%- if source_relation is not none -%} | |
| {{ return(source_relation.is_iceberg_format) }} | |
| {%- endif -%} |
| state_path = os.path.join(project_root, "state") | ||
| if not os.path.exists(state_path): | ||
| os.makedirs(state_path) | ||
| shutil.copyfile(f"{project_root}/target/manifest.json", f"{project_root}/state/manifest.json") |
There was a problem hiding this comment.
copy_state builds paths with hardcoded "{project_root}/..." instead of os.path.join(...). This can break on Windows path separators and is inconsistent with the surrounding os.path.join usage; consider using os.path.join(project_root, "target", "manifest.json") / os.path.join(project_root, "state", "manifest.json").
| shutil.copyfile(f"{project_root}/target/manifest.json", f"{project_root}/state/manifest.json") | |
| src_manifest = os.path.join(project_root, "target", "manifest.json") | |
| dst_manifest = os.path.join(project_root, "state", "manifest.json") | |
| shutil.copyfile(src_manifest, dst_manifest) |
| Author: jcolvin | ||
| Issue: "1767" |
There was a problem hiding this comment.
The custom: block indentation is inconsistent with other .changes/unreleased/*.yaml entries in this repo (e.g., .changes/unreleased/Fixes-20260306-163035.yaml:4-6 uses two-space indentation). Align the indentation to match the established format to keep changelog tooling and diffs consistent.
| Author: jcolvin | |
| Issue: "1767" | |
| Author: jcolvin | |
| Issue: "1767" |
| {{ return(source_relation.is_iceberg_format) }} | ||
| {%- endif -%} | ||
|
|
||
| {#- Cache miss: query Snowflake directly -#} |
There was a problem hiding this comment.
I don't think we'll need this as the cache should already have the relation but if we do need this then we should be able to use adapter.get_relation instead of a custom show command
The clone macro now detects whether the source relation is an Iceberg table via load_cached_relation and emits CREATE ICEBERG TABLE ... CLONE instead of CREATE TABLE ... CLONE. The transient keyword is correctly omitted for Iceberg clones since Iceberg tables cannot be transient. Resolves dbt-labs#1767
load_cached_relation returns None during cross-schema clone operations because the source schema is not in the adapter cache. Fall back to querying Snowflake directly via SHOW TABLES to detect iceberg format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Harden SHOW TABLES fallback: add IN SCHEMA keyword and filter by exact name match to avoid LIKE wildcard false positives - Use SNOWFLAKE_TEST_ICEBERG_EXTERNAL_VOLUME env var for test portability - Extract copy_state/run_and_save_state to shared module-level functions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove SHOW TABLES fallback in snowflake__is_iceberg_table per reviewer feedback — cache should already have the relation - Use os.path.join consistently in copy_state helper Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8c7a082 to
428f6fe
Compare
Summary
load_cached_relationand emitCREATE OR REPLACE ICEBERG TABLE ... CLONE ...instead ofCREATE OR REPLACE [TRANSIENT] TABLE ... CLONE ...transientkeyword for Iceberg clones (Iceberg tables cannot be transient)Resolves #1767
Test plan
TestSnowflakeClonePossible,TestSnowflakeCloneTrainsentTable)TestSnowflakeCloneIcebergTabletest passes against a Snowflake account with an Iceberg-enabled external volume--log-level debugshowsCREATE OR REPLACE ICEBERG TABLEfor Iceberg sources andCREATE OR REPLACE TRANSIENT TABLEfor regular sources