Conversation

@Jeremy-Demlow

Summary

  • Add hybrid_table materialization and supporting macros
  • Relation config + changeset; adapter describe method
  • Tests for basic, incremental, schema changes, constraints
  • Remove local-only docs from repo (PR will include usage/summary)

Resolves

Problem
Snowflake Hybrid Tables are not currently supported as a first-class materialization in dbt-snowflake. Users must hand-roll DDL/DML for CTAS, constraints, indexes, and incremental upserts (MERGE), which is error-prone and inconsistent across projects.
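
For illustration, the create step alone that users currently have to hand-roll looks roughly like this (database, schema, table, and column names are hypothetical):

create hybrid table analytics.app.user_profiles (
    user_id   integer,
    username  varchar(100),
    email     varchar(255),
    primary key (user_id),
    unique (email),
    index idx_username (username)
) as
select user_id, username, email
from analytics.staging.stg_users;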

Solution
Implement a new hybrid_table materialization that:

  • Creates hybrid tables with CREATE HYBRID TABLE … AS SELECT (CTAS) using explicit schema (columns, PRIMARY KEY, optional UNIQUE/FOREIGN KEY, secondary indexes).
  • Supports incremental upsert via MERGE using the configured primary key.
  • Detects/handles schema changes with on_schema_change = fail | apply | continue (apply → full refresh).
  • Applies grants, persists docs, and integrates with existing dbt-snowflake patterns.

Implementation

  • Materialization and macros
    • src/dbt/include/snowflake/macros/materializations/hybrid_table.sql
    • src/dbt/include/snowflake/macros/relations/hybrid_table/create.sql
    • src/dbt/include/snowflake/macros/relations/hybrid_table/merge.sql
    • src/dbt/include/snowflake/macros/relations/hybrid_table/replace.sql
    • src/dbt/include/snowflake/macros/relations/hybrid_table/drop.sql
    • src/dbt/include/snowflake/macros/relations/hybrid_table/describe.sql
  • Relation config and adapter integration
    • src/dbt/adapters/snowflake/relation_configs/hybrid_table.py (config dataclass + changeset)
    • src/dbt/adapters/snowflake/impl.py (describe_hybrid_table)
    • src/dbt/adapters/snowflake/relation.py (register HybridTable, config changeset)
  • Behavior details and guardrails
    • CTAS projects columns in the configured order to prevent type mismatches.
    • Information schema detection uses IS_HYBRID so the relation type is reported correctly (see the query sketch after this list).
    • on_schema_change:
      • fail (default): raises when config/schema changes detected
      • apply: full-refresh path (drop + create)
      • continue: warn, skip applying change, proceed with MERGE
    • MERGE updates all non-PK columns by default or a configured subset via merge_update_columns.
    • Constraints supported at create-time: PRIMARY KEY (required), UNIQUE, FOREIGN KEY (enforced by Snowflake).
    • Secondary indexes supported (with optional INCLUDE columns).
    • Full refresh: dbt run --full-refresh.
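
A minimal sketch of the relation-type check this relies on, assuming the IS_HYBRID column exposed in Snowflake's INFORMATION_SCHEMA.TABLES (database, schema, and table names are hypothetical):

select table_name,
       iff(is_hybrid = 'YES', 'hybrid_table', 'table') as relation_type
from analytics.information_schema.tables
where table_schema = 'APP'
  and table_name = 'USER_PROFILES';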

Tests (15/15 passing)
Functional tests under tests/functional/relation_tests/hybrid_table_tests/:

  • Basic: creation, full refresh, composite PK, indexes, unique constraint
  • Incremental: MERGE-upsert updates existing rows and inserts new ones
  • Schema changes: fail/apply/continue behaviors
  • Constraints: PRIMARY KEY, UNIQUE, composite PK enforcement
Notes:
  • Relation type detection updated to use IS_HYBRID
  • CTAS column order enforced
  • Acceptance criteria validated against Snowflake’s documented behavior for hybrid tables

Backward Compatibility

  • No breaking changes; the new materialization is opt-in via materialized: hybrid_table.
  • No changes to existing table/view/incremental materializations.

Performance

  • Initial build uses CTAS and benefits from Snowflake’s optimized bulk loading for empty tables.
  • Incremental uses MERGE; users may fine-tune via merge_update_columns.
  • Secondary indexes supported for access-path optimization.

Docs

  • Open a docs issue at https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose to add:
    • New hybrid_table materialization reference for dbt-snowflake
    • Configuration examples (columns, primary_key, indexes, include columns, foreign_keys, on_schema_change, merge_update_columns)
    • Known limitations (constraints at create time; most schema/index changes require rebuild)

Checklist

  • I have read the contributing guide and understand what’s expected
  • I have run this code in development, and tests pass locally
  • This PR includes tests
  • This PR has no breaking interface changes

Additional Notes

  • Hybrid table creation uses DROP TABLE for replacement (Snowflake has no separate DROP HYBRID TABLE command).
  • Constraint error assertions in tests accept Snowflake’s hybrid error variants (e.g., “A primary key already exists.”).
  • Limitations per Snowflake docs: constraints defined at create time; many schema/index changes require rebuild.

Branch/PR

References

@Jeremy-Demlow Jeremy-Demlow requested a review from a team as a code owner October 16, 2025 21:09
@cla-bot

cla-bot bot commented Oct 16, 2025

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @Jeremy-Demlow

@Jeremy-Demlow
Author

I have signed it

@Jeremy-Demlow
Author

Hybrid Tables in dbt-snowflake

Overview

Hybrid tables in Snowflake bring row-based, transactional storage to the platform, so operational workloads and analytical queries can run against the same data. They support:

  • Low-latency queries with row-based storage
  • ACID transactions with enforced constraints
  • Primary and secondary indexes for fast lookups
  • UPSERT patterns via incremental MERGE operations

This implementation follows the same patterns as dynamic tables in dbt-snowflake.

Quick Start

Basic Hybrid Table

-- models/my_hybrid_table.sql
{{ config(
    materialized='hybrid_table',
    columns={
        'user_id': 'INTEGER',
        'username': 'VARCHAR(100)',
        'email': 'VARCHAR(255)',
        'created_at': 'TIMESTAMP_NTZ'
    },
    primary_key='user_id'
) }}

select * from {{ ref('source_users') }}

Configuration Options

Required Configurations

  • materialized='hybrid_table': Specifies the materialization type
  • columns: Dictionary mapping column names to Snowflake data types
  • primary_key: Column(s) forming the primary key (can be string or list)

Optional Configurations

  • indexes: List of secondary index definitions
  • unique_key: Column(s) with UNIQUE constraint
  • foreign_keys: List of foreign key constraint definitions
  • on_schema_change: How to handle schema changes ('fail', 'apply', 'continue')
  • merge_update_columns: Specific columns to update during MERGE

Examples

Composite Primary Key

{{ config(
    materialized='hybrid_table',
    columns={
        'stream_id': 'VARCHAR(100)',
        'ad_campaign_id': 'VARCHAR(100)',
        'impressions': 'INTEGER',
        'watch_time': 'FLOAT'
    },
    primary_key=['stream_id', 'ad_campaign_id']
) }}

select * from {{ ref('aggregated_events') }}

With Secondary Indexes

{{ config(
    materialized='hybrid_table',
    columns={
        'order_id': 'INTEGER',
        'customer_id': 'INTEGER',
        'product_id': 'INTEGER',
        'order_date': 'DATE',
        'amount': 'DECIMAL(10,2)'
    },
    primary_key='order_id',
    indexes=[
        {'columns': ['customer_id']},
        {'columns': ['product_id']},
        {'name': 'idx_order_date', 'columns': ['order_date']}
    ]
) }}

select * from {{ ref('orders') }}

With INCLUDE Columns

Secondary indexes can include additional columns for covering index optimization:

{{ config(
    materialized='hybrid_table',
    columns={
        'sensor_id': 'INTEGER',
        'timestamp': 'TIMESTAMP_NTZ',
        'temperature': 'DECIMAL(6,4)',
        'pressure': 'DECIMAL(6,4)'
    },
    primary_key='sensor_id',
    indexes=[
        {
            'name': 'idx_timestamp_covering',
            'columns': ['timestamp'],
            'include': ['temperature', 'pressure']
        }
    ]
) }}

select * from {{ ref('sensor_readings') }}

With Constraints

{{ config(
    materialized='hybrid_table',
    columns={
        'user_id': 'INTEGER',
        'email': 'VARCHAR(255)',
        'account_id': 'INTEGER',
        'status': 'VARCHAR(20)'
    },
    primary_key='user_id',
    unique_key='email',
    foreign_keys=[
        {
            'columns': ['account_id'],
            'parent_table': 'accounts',
            'parent_columns': ['account_id']
        }
    ]
) }}

select * from {{ ref('users') }}

Incremental Behavior

Hybrid tables support incremental updates using MERGE:

  1. First run: Creates table using CTAS with optimized bulk loading
  2. Subsequent runs: Uses MERGE to UPDATE existing rows and INSERT new ones

The MERGE uses the primary_key to match rows.
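
On those subsequent runs, the generated statement has roughly this shape (all names, including the staging relation, are hypothetical; the actual macro builds the column lists from the configured columns and primary_key):

merge into analytics.app.user_profiles as target
using (select * from analytics.app.user_profiles__dbt_tmp) as source
    on target.user_id = source.user_id
when matched then update set
    target.username = source.username,
    target.email    = source.email
when not matched then insert (user_id, username, email)
    values (source.user_id, source.username, source.email);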

Custom Merge Columns

By default, all non-primary-key columns are updated. You can specify which columns to update:

{{ config(
    materialized='hybrid_table',
    columns={
        'id': 'INTEGER',
        'value': 'INTEGER',
        'updated_at': 'TIMESTAMP_NTZ',
        'created_at': 'TIMESTAMP_NTZ'
    },
    primary_key='id',
    merge_update_columns=['value', 'updated_at']
) }}

-- created_at is intentionally excluded from merge_update_columns, so MERGE never overwrites it
select * from {{ ref('source') }}

Schema Change Handling

Hybrid tables have limited ALTER support. Use on_schema_change to control behavior (the snippets below omit columns, primary_key, and other required config for brevity):

fail (default)

{{ config(
    materialized='hybrid_table',
    on_schema_change='fail'
) }}

Raises an error if schema changes are detected. This is the safest option.

apply

{{ config(
    materialized='hybrid_table',
    on_schema_change='apply'
) }}

Performs a full refresh (DROP + CREATE) when schema changes are detected.

continue

{{ config(
    materialized='hybrid_table',
    on_schema_change='continue'
) }}

Logs a warning but continues with incremental MERGE, ignoring schema changes.

Full Refresh

Force a full refresh using the --full-refresh flag:

dbt run --select my_hybrid_table --full-refresh

This will DROP and recreate the table.
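
Roughly, the full-refresh path is a drop followed by the same CTAS as the first run; note the plain DROP TABLE, since Snowflake has no separate DROP HYBRID TABLE command (names are hypothetical):

drop table if exists analytics.app.user_profiles;

create hybrid table analytics.app.user_profiles (
    user_id  integer,
    username varchar(100),
    primary key (user_id)
) as
select user_id, username from analytics.staging.stg_users;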

Performance Considerations

  1. Bulk Loading: Initial CTAS uses Snowflake's optimized bulk loading (up to 10x faster)
  2. MERGE Performance: MERGE operations may be slower than bulk loads. Consider batch sizes.
  3. Index Strategy: Add indexes on columns used in WHERE clauses and JOIN conditions
  4. INCLUDE Columns: Use INCLUDE for covering indexes to avoid table lookups

Limitations

Per Snowflake's hybrid table limitations:

  • Primary key is required
  • Constraints are enforced (unlike standard Snowflake tables)
  • Limited ALTER support (most changes require full refresh)
  • Cannot be shared across accounts
  • Some Snowflake features not supported (see Snowflake docs)

Resources

@Jeremy-Demlow Jeremy-Demlow changed the title feat(snowflake): add hybrid table materialization [Feature]: snowflake add hybrid table materialization Oct 16, 2025