-
Notifications
You must be signed in to change notification settings - Fork 180
Fix: Retain multiple foreign-keys from one table to another after an incremental run #1296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix: Retain multiple foreign-keys from one table to another after an incremental run #1296
Conversation
… table to another aren't retained on an incremental run Signed-off-by: Ewan Keith <[email protected]>
Signed-off-by: Ewan Keith <[email protected]>
| {% if liquid_clustering is not none %} | ||
| {% do apply_liquid_clustered_cols(target_relation, liquid_clustering) %} | ||
| {% endif %} | ||
| {% if constraints %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What i'm confused by is your description made it sound like constraints were being dropped, but we haven't removed or altered any logic that would cause them to drop. I'm digging in, but if you could provide more context, like what SQL is showing up in the logs, that would be appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By any chance, is the table with the PK not incremental? If the PK table is not incremental, during create or replace it's PK will be destroyed and recreated, which causes the cascading delete of the FKs. If the PK table were incremental, then no PKs or FKs should have been deleted. So the fix here is basically saying, incremental tables, if your constraints don't match, whether due to deletion or due to the user changing it, reapply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's it! I was having trouble replicating the case so reduced our exact issue down in the integration test, thought it was the multiple keys that were the issue but total red herring, it's the inc -> non-inc like you say.
I've updated the integration tests to reflect this and replicated the behaviour (fails on v1 before we make the change to incremental.sql, passing after).
|
I'll run the E2E tests when there is less congestion on the resources. Thanks for raising this. |
|
@Ewan-Keith not sure if you can see this, but we have a failure in the E2E tests: In TestIncrementalForeignKeyExpressionConstraint::test_incremental_foreign_key_constraint You also need to add annotations to your new tests to tell it to skip testing for "databricks_cluster" profile, since Hive doesn't support constraints. |
Yeah I can see the failure, will try to find some time to understand why this change is causing the issue. Have added the test annotation, must've just gotten lost at some point in editing. |
Description
Encountered a bug when:
"use_materialization_v2": Falseconfig.contract.enforced: trueThe bug is that on the initial full run the multiple FK constraints to the common table are created correctly on the Databricks table, but then when an incremental run is carried out these FKs are removed from the Databricks table. Single FKs to other tables seem to be retained (this isn't replicated in the test case in the interest of keeping it minimal).
The bug is replicated in the integration test TestIncrementalMultipleFKsToSameTableNoV2, before the fix is applied this passes the first assertion (after the initial full run) but fails the second assertion (after the incremental run) with an empty set of constraints returned. A matching test is also present replicating the logic when using materialization_v2 TestIncrementalMultipleFKsToSameTable which confirms the issue is not seen with v2 materialization (this passes without any actual code changes).
The fix I've landed on in
dbt/include/databricks/macros/materializations/incremental/incremental.sqlis to treat constraints as one of the configuration changes that is explicitly applied on an incremental run when changes are detected (the same as fortags,tblproperties, andliquid_clustering). This gets the failing test passing.I don't think this is necessarily a root-cause fix, I think the root issue is likely a change in constraints being falsely detected in this instance, but I wasn't sure where to start on that, and this fix seems sensible (even if the root issue is resolved it seems likely we'd want changes to constraints to be applied without forcing a full-refresh where this is possible?). There's a unit test checking if there are any obvious bugs in the constraint diff checker but this doesn't turn anything up (passes as expected).
Checklist
CHANGELOG.mdand added information about my change to the "dbt-databricks next" section.