Skip to content

Conversation

@Ewan-Keith
Copy link

Description

Encountered a bug when:

  • "use_materialization_v2": False
  • config.contract.enforced: true
  • Multiple foreign keys are specified from one table to another (e.g. a fact table with multiple date values, each of which has a foreign key constrain to a common data dimension)
  • An initial full run of the model with the FK constraints is carried out, followed by an incremental run.

The bug is that on the initial full run the multiple FK constraints to the common table are created correctly on the Databricks table, but then when an incremental run is carried out these FKs are removed from the Databricks table. Single FKs to other tables seem to be retained (this isn't replicated in the test case in the interest of keeping it minimal).

The bug is replicated in the integration test TestIncrementalMultipleFKsToSameTableNoV2, before the fix is applied this passes the first assertion (after the initial full run) but fails the second assertion (after the incremental run) with an empty set of constraints returned. A matching test is also present replicating the logic when using materialization_v2 TestIncrementalMultipleFKsToSameTable which confirms the issue is not seen with v2 materialization (this passes without any actual code changes).

The fix I've landed on in dbt/include/databricks/macros/materializations/incremental/incremental.sql is to treat constraints as one of the configuration changes that is explicitly applied on an incremental run when changes are detected (the same as for tags, tblproperties, and liquid_clustering). This gets the failing test passing.

I don't think this is necessarily a root-cause fix, I think the root issue is likely a change in constraints being falsely detected in this instance, but I wasn't sure where to start on that, and this fix seems sensible (even if the root issue is resolved it seems likely we'd want changes to constraints to be applied without forcing a full-refresh where this is possible?). There's a unit test checking if there are any obvious bugs in the constraint diff checker but this doesn't turn anything up (passes as expected).

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

… table to another aren't retained on an incremental run

Signed-off-by: Ewan Keith <[email protected]>
Signed-off-by: Ewan Keith <[email protected]>
{% if liquid_clustering is not none %}
{% do apply_liquid_clustered_cols(target_relation, liquid_clustering) %}
{% endif %}
{% if constraints %}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What i'm confused by is your description made it sound like constraints were being dropped, but we haven't removed or altered any logic that would cause them to drop. I'm digging in, but if you could provide more context, like what SQL is showing up in the logs, that would be appreciated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By any chance, is the table with the PK not incremental? If the PK table is not incremental, during create or replace it's PK will be destroyed and recreated, which causes the cascading delete of the FKs. If the PK table were incremental, then no PKs or FKs should have been deleted. So the fix here is basically saying, incremental tables, if your constraints don't match, whether due to deletion or due to the user changing it, reapply.

Copy link
Author

@Ewan-Keith Ewan-Keith Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that's it! I was having trouble replicating the case so reduced our exact issue down in the integration test, thought it was the multiple keys that were the issue but total red herring, it's the inc -> non-inc like you say.

I've updated the integration tests to reflect this and replicated the behaviour (fails on v1 before we make the change to incremental.sql, passing after).

@benc-db benc-db requested a review from sd-db as a code owner January 8, 2026 21:48
@benc-db
Copy link
Collaborator

benc-db commented Jan 8, 2026

I'll run the E2E tests when there is less congestion on the resources. Thanks for raising this.

@benc-db
Copy link
Collaborator

benc-db commented Jan 9, 2026

@Ewan-Keith not sure if you can see this, but we have a failure in the E2E tests:

01:01:10  1 of 1 START sql incremental model test17679204186774943539_test_constraints.stg_numbers  [RUN]
01:01:10  [WARNING]: Use revamped materializations based on separating create and insert.  This allows more performant column comments, as well as new column features.
You may opt into the new behavior sooner by setting `flags.use_materialization_v2` to `True` in `dbt_project.yml`.
Visit https://docs.getdbt.com/reference/global-configs/behavior-changes for more information.
01:01:17  1 of 1 ERROR creating sql incremental model test17679204186774943539_test_constraints.stg_numbers  [ERROR in 6.88s]
01:01:17  
01:01:17  Finished running 1 incremental model in 0 hours 0 minutes and 12.18 seconds (12.18s).
01:01:17  
01:01:17  Completed with 1 error, 0 partial successes, and 0 warnings:
01:01:17  
01:01:17  Failure in model stg_numbers (models/stg_numbers.sql)
01:01:17    Database Error in model stg_numbers (models/stg_numbers.sql)
  
  [PARSE_SYNTAX_ERROR] Syntax error at or near '('. SQLSTATE: 42601 (line 2, pos 113)
  
  == SQL ==
  /* {"app": "dbt", "dbt_version": "1.11.2", "dbt_databricks_version": "1.11.3", "databricks_sql_connector_version": "4.1.3", "profile_name": "test", "target_name": "default", "node_id": "model.test.stg_numbers"} */
  ALTER TABLE `peco`.`test17679204186774943539_test_constraints`.`stg_numbers` ADD CONSTRAINT fk_n FOREIGN KEY (n) (n) REFERENCES test17679204186774943539_test_constraints.raw_numbers

In TestIncrementalForeignKeyExpressionConstraint::test_incremental_foreign_key_constraint

You also need to add annotations to your new tests to tell it to skip testing for "databricks_cluster" profile, since Hive doesn't support constraints.

@Ewan-Keith
Copy link
Author

@Ewan-Keith not sure if you can see this, but we have a failure in the E2E tests:

01:01:10  1 of 1 START sql incremental model test17679204186774943539_test_constraints.stg_numbers  [RUN]
01:01:10  [WARNING]: Use revamped materializations based on separating create and insert.  This allows more performant column comments, as well as new column features.
You may opt into the new behavior sooner by setting `flags.use_materialization_v2` to `True` in `dbt_project.yml`.
Visit https://docs.getdbt.com/reference/global-configs/behavior-changes for more information.
01:01:17  1 of 1 ERROR creating sql incremental model test17679204186774943539_test_constraints.stg_numbers  [ERROR in 6.88s]
01:01:17  
01:01:17  Finished running 1 incremental model in 0 hours 0 minutes and 12.18 seconds (12.18s).
01:01:17  
01:01:17  Completed with 1 error, 0 partial successes, and 0 warnings:
01:01:17  
01:01:17  Failure in model stg_numbers (models/stg_numbers.sql)
01:01:17    Database Error in model stg_numbers (models/stg_numbers.sql)
  
  [PARSE_SYNTAX_ERROR] Syntax error at or near '('. SQLSTATE: 42601 (line 2, pos 113)
  
  == SQL ==
  /* {"app": "dbt", "dbt_version": "1.11.2", "dbt_databricks_version": "1.11.3", "databricks_sql_connector_version": "4.1.3", "profile_name": "test", "target_name": "default", "node_id": "model.test.stg_numbers"} */
  ALTER TABLE `peco`.`test17679204186774943539_test_constraints`.`stg_numbers` ADD CONSTRAINT fk_n FOREIGN KEY (n) (n) REFERENCES test17679204186774943539_test_constraints.raw_numbers

In TestIncrementalForeignKeyExpressionConstraint::test_incremental_foreign_key_constraint

You also need to add annotations to your new tests to tell it to skip testing for "databricks_cluster" profile, since Hive doesn't support constraints.

Yeah I can see the failure, will try to find some time to understand why this change is causing the issue. Have added the test annotation, must've just gotten lost at some point in editing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants