feat: Enable concurrent microbatch execution #1326
wmjones wants to merge 2 commits into databricks:main
3d91722 to 060f1b9
Hi @wmjones, we are doing the release today for […]. P.S. I am starting integration test runs and will report if I see any issues.
…y capability Signed-off-by: Wyatt Jones <wyatt.jones6@cfacorp.com>
060f1b9 to 04e0846
@wmjones It seems concurrency is opt-in by default. What this means is that anyone who is using
Resolves #914
Description
Declares the `MicrobatchConcurrency` adapter capability so dbt-core 1.9+ can execute microbatch incremental batches in parallel threads instead of sequentially.

This is a one-line addition to `DatabricksAdapter._capabilities`. dbt-core already handles all concurrency orchestration: temp table uniqueness (via the `model.batch.id` suffix in `make_temp_relation`), non-overlapping `REPLACE WHERE` predicates per batch, and sequential execution of first/last batches. The adapter just needs to opt in.

Why `Support.Full`: Concurrency control happens at dbt-core's Python threading level, not at the Databricks SQL level. Every DBR version and SQL Warehouse that supports `replace_where` inherently supports concurrent batch execution.

OPTIMIZE note: Per-batch `OPTIMIZE` calls can conflict under concurrency when `liquid_clustered_by`, `zorder`, or `auto_liquid_cluster` is configured. Users with those configs can set `DATABRICKS_SKIP_OPTIMIZE=true` as a workaround. Plain Delta tables are unaffected.

Tested on: SQL Warehouse with Unity Catalog (30-batch microbatch model, `--threads 4`, all batches passed, concurrent execution confirmed, zero duplicates).

Prior art: dbt-snowflake added the same capability in dbt-snowflake#1259.
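For readers unfamiliar with the capability mechanism, here is a minimal, self-contained sketch of the idea described above. It does not import dbt: `Capability`, `Support`, `SketchAdapter`, `Batch`, `daily_batches`, and `run_batches` are hypothetical stand-ins that only mirror the shape of the description (an adapter declares a capability, and the orchestrator then runs the first and last batches sequentially and the middle batches in threads). dbt-core's real scheduling logic may differ in detail.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


# Stand-ins mirroring the shape of dbt's capability types (NOT the real dbt classes).
class Capability(Enum):
    MicrobatchConcurrency = "MicrobatchConcurrency"


class Support(Enum):
    Unsupported = "Unsupported"
    Full = "Full"


@dataclass(frozen=True)
class Batch:
    start: datetime
    end: datetime  # exclusive upper bound, so adjacent batches never overlap

    def predicate(self, column: str) -> str:
        # Shape of a per-batch REPLACE WHERE predicate (illustrative only).
        return (
            f"{column} >= '{self.start:%Y-%m-%d}' "
            f"AND {column} < '{self.end:%Y-%m-%d}'"
        )


class SketchAdapter:
    # The "one-line addition": declare the capability with full support.
    _capabilities = {Capability.MicrobatchConcurrency: Support.Full}

    @classmethod
    def supports(cls, capability: Capability) -> bool:
        return cls._capabilities.get(capability, Support.Unsupported) is Support.Full


def daily_batches(first_day: datetime, count: int) -> list[Batch]:
    # Build `count` consecutive one-day windows starting at `first_day`.
    return [
        Batch(first_day + timedelta(days=i), first_day + timedelta(days=i + 1))
        for i in range(count)
    ]


def run_batches(adapter, batches, run_one, threads=4):
    # Without the capability (or with too few batches), run everything in order.
    if not adapter.supports(Capability.MicrobatchConcurrency) or len(batches) < 3:
        return [run_one(b) for b in batches]
    first, *middle, last = batches
    results = [run_one(first)]  # first batch runs alone
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results.extend(pool.map(run_one, middle))  # map preserves input order
    results.append(run_one(last))  # last batch runs alone
    return results


batches = daily_batches(datetime(2024, 1, 1), 5)
predicates = run_batches(SketchAdapter, batches, lambda b: b.predicate("event_time"))
```

Because each window's upper bound is exclusive and equals the next window's lower bound, the middle batches touch disjoint ranges, which is why they can run in parallel in this sketch.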
Checklist
I have updated the `CHANGELOG.md` and added information about my change to the "dbt-databricks next" section.