[Feature]: snowflake add hybrid table materialization #1400
- Add hybrid_table materialization and supporting macros
- Relation config + changeset; adapter describe method
- Tests for basic, incremental, schema changes, constraints
- Remove local-only docs from repo (PR will include usage/summary)
| Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA. In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR. CLA has not been signed by users: @Jeremy-Demlow | 
| I have signed it | 
| Hybrid Tables in dbt-snowflake
Overview
Hybrid tables in Snowflake combine the benefits of transactional tables with the performance of analytical tables. They support: 
 This implementation follows the same patterns as dynamic tables in dbt-snowflake.
Quick Start
Basic Hybrid Table
-- models/my_hybrid_table.sql
{{ config(
    materialized='hybrid_table',
    columns={
        'user_id': 'INTEGER',
        'username': 'VARCHAR(100)',
        'email': 'VARCHAR(255)',
        'created_at': 'TIMESTAMP_NTZ'
    },
    primary_key='user_id'
) }}
select * from {{ ref('source_users') }}
Configuration Options
Required Configurations
 Optional Configurations
Examples
Composite Primary Key
{{ config(
    materialized='hybrid_table',
    columns={
        'stream_id': 'VARCHAR(100)',
        'ad_campaign_id': 'VARCHAR(100)',
        'impressions': 'INTEGER',
        'watch_time': 'FLOAT'
    },
    primary_key=['stream_id', 'ad_campaign_id']
) }}
select * from {{ ref('aggregated_events') }}
With Secondary Indexes
{{ config(
    materialized='hybrid_table',
    columns={
        'order_id': 'INTEGER',
        'customer_id': 'INTEGER',
        'product_id': 'INTEGER',
        'order_date': 'DATE',
        'amount': 'DECIMAL(10,2)'
    },
    primary_key='order_id',
    indexes=[
        {'columns': ['customer_id']},
        {'columns': ['product_id']},
        {'name': 'idx_order_date', 'columns': ['order_date']}
    ]
) }}
select * from {{ ref('orders') }}
With INCLUDE Columns
Secondary indexes can include additional columns for covering index optimization:
{{ config(
    materialized='hybrid_table',
    columns={
        'sensor_id': 'INTEGER',
        'timestamp': 'TIMESTAMP_NTZ',
        'temperature': 'DECIMAL(6,4)',
        'pressure': 'DECIMAL(6,4)'
    },
    primary_key='sensor_id',
    indexes=[
        {
            'name': 'idx_timestamp_covering',
            'columns': ['timestamp'],
            'include': ['temperature', 'pressure']
        }
    ]
) }}
select * from {{ ref('sensor_readings') }}
With Constraints
{{ config(
    materialized='hybrid_table',
    columns={
        'user_id': 'INTEGER',
        'email': 'VARCHAR(255)',
        'account_id': 'INTEGER',
        'status': 'VARCHAR(20)'
    },
    primary_key='user_id',
    unique_key='email',
    foreign_keys=[
        {
            'columns': ['account_id'],
            'parent_table': 'accounts',
            'parent_columns': ['account_id']
        }
    ]
) }}
select * from {{ ref('users') }}
Incremental Behavior
Hybrid tables support incremental updates using MERGE: 
 The MERGE uses the 
Custom Merge Columns
By default, all non-primary-key columns are updated. You can specify which columns to update:
{{ config(
    materialized='hybrid_table',
    columns={
        'id': 'INTEGER',
        'value': 'INTEGER',
        'updated_at': 'TIMESTAMP_NTZ',
        'created_at': 'TIMESTAMP_NTZ'
    },
    primary_key='id',
    merge_update_columns=['value', 'updated_at']
) }}
-- created_at is left out of merge_update_columns, so existing values are not updated
select * from {{ ref('source') }}
Schema Change Handling
Hybrid tables have limited ALTER support. Use 
fail (default)
{{ config(
    materialized='hybrid_table',
    on_schema_change='fail',
    -- ... other config
) }}
Raises an error if schema changes are detected. This is the safest option.
apply
{{ config(
    materialized='hybrid_table',
    on_schema_change='apply',
    -- ... other config
) }}
Performs a full refresh (DROP + CREATE) when schema changes are detected.
continue
{{ config(
    materialized='hybrid_table',
    on_schema_change='continue',
    -- ... other config
) }}
Logs a warning but continues with incremental MERGE, ignoring schema changes.
Full Refresh
Force a full refresh using the 
dbt run --select my_hybrid_table --full-refresh
This will DROP and recreate the table.
Performance Considerations
Limitations
Per Snowflake's hybrid table limitations: 
 Resources | 
Summary
Resolves
Problem
Snowflake Hybrid Tables are not currently supported as a first-class materialization in dbt-snowflake. Users must hand-roll DDL/DML for CTAS, constraints, indexes, and incremental upserts (MERGE), which is error-prone and inconsistent across projects.
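To make the hand-rolled upsert concrete, here is a minimal Python sketch of how a project might assemble such a MERGE statement by hand. It is illustrative only: the helper name `build_merge_sql`, the aliases `t` and `s`, and the table names in the usage note are hypothetical, and this is not the SQL that the PR's macros generate.

```python
from typing import Optional, Sequence


def build_merge_sql(
    target: str,
    source: str,
    columns: Sequence[str],
    primary_key: Sequence[str],
    update_columns: Optional[Sequence[str]] = None,
) -> str:
    """Render a MERGE upsert keyed on the primary key columns (hypothetical helper)."""
    if not primary_key:
        raise ValueError("at least one primary key column is required")
    missing = [c for c in primary_key if c not in columns]
    if missing:
        raise ValueError(f"primary key columns not in column list: {missing}")

    # Assumed default: update every column that is not part of the primary key.
    if update_columns is None:
        update_columns = [c for c in columns if c not in primary_key]

    on_clause = " and ".join(f"t.{c} = s.{c}" for c in primary_key)
    set_clause = ", ".join(f"{c} = s.{c}" for c in update_columns)
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)

    lines = [
        f"merge into {target} as t",
        f"using {source} as s",
        f"on {on_clause}",
    ]
    # Skip the update branch when there is nothing to update (all columns are keys).
    if set_clause:
        lines.append(f"when matched then update set {set_clause}")
    lines.append(
        f"when not matched then insert ({insert_cols}) values ({insert_vals})"
    )
    return "\n".join(lines)
```

For example, `build_merge_sql("users", "users_stage", ["id", "value", "updated_at", "created_at"], ["id"], ["value", "updated_at"])` yields a statement that matches on `id` and leaves `created_at` untouched for existing rows. Repeating this kind of string assembly in every project is the inconsistency the paragraph above describes.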
Solution
Implement a new hybrid_table materialization that:
Implementation
- src/dbt/include/snowflake/macros/materializations/hybrid_table.sql
- src/dbt/include/snowflake/macros/relations/hybrid_table/create.sql
- src/dbt/include/snowflake/macros/relations/hybrid_table/merge.sql
- src/dbt/include/snowflake/macros/relations/hybrid_table/replace.sql
- src/dbt/include/snowflake/macros/relations/hybrid_table/drop.sql
- src/dbt/include/snowflake/macros/relations/hybrid_table/describe.sql
- src/dbt/adapters/snowflake/relation_configs/hybrid_table.py (config dataclass + changeset)
- src/dbt/adapters/snowflake/impl.py (describe_hybrid_table)
- src/dbt/adapters/snowflake/relation.py (register HybridTable, config changeset)
- IS_HYBRID so relation type is correctly reported.
- merge_update_columns.
- dbt run --full-refresh.
Tests (15/15 passing)
Functional tests under tests/functional/relation_tests/hybrid_table_tests/:
Notes:
IS_HYBRID
Backward Compatibility
materialized: hybrid_table.
Performance
merge_update_columns.
Docs
hybrid_table materialization reference for dbt-snowflake
Checklist
Additional Notes
DROP TABLE for replacement (Snowflake doesn’t use DROP HYBRID TABLE).
Branch/PR
References