This guide helps AI agents quickly understand and work productively with the dbt-databricks adapter.

- **What**: dbt adapter for the Databricks Lakehouse Platform
- **Based on**: dbt-spark adapter with Databricks-specific enhancements
- **Key Features**: Unity Catalog support, Delta Lake, Python models, streaming tables
- **Language**: Python 3.10+ with Jinja2 SQL macros
- **Architecture**: Inherits from the Spark adapter, extends it with Databricks-specific functionality

### Essential Files to Understand

```
dbt/adapters/databricks/
├── connections.py        # Connection management and SQL execution
├── credentials.py        # Authentication (token, OAuth, Azure AD)
├── relation.py           # Databricks-specific relation handling
├── dbr_capabilities.py   # DBR version capability system
├── python_models/        # Python model execution on clusters
├── relation_configs/     # Table/view configuration management
└── catalogs/             # Unity Catalog vs Hive Metastore logic
```

```
dbt/include/databricks/macros/    # Jinja2 SQL templates
```

## 🛠 Development Environment

**Prerequisites**: Python 3.10+ installed on your system

**Install Hatch** (recommended):

For Linux:

```bash
# Download and install the standalone binary
curl -Lo hatch.tar.gz https://github.com/pypa/hatch/releases/latest/download/hatch-x86_64-unknown-linux-gnu.tar.gz
tar -xzf hatch.tar.gz
mkdir -p $HOME/bin
mv hatch $HOME/bin/hatch
chmod +x $HOME/bin/hatch
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc
export PATH="$HOME/bin:$PATH"

# Create default environment (Hatch installs needed Python versions)
hatch env create
```
4756
57+ For other platforms: see https://hatch.pypa.io/latest/install/
58+
**Essential commands**:

```bash
hatch run code-quality   # Format, lint, type-check
hatch run unit           # Run unit tests
hatch run cluster-e2e    # Run functional tests

# For specific tests, use pytest directly:
hatch run pytest path/to/test_file.py::TestClass::test_method -v
```

> 📖 **See the [Development Guide](docs/dbt-databricks-dev.md)** for comprehensive setup documentation

#### Functional Test Example

**Important**: SQL models and YAML schemas should be defined in a `fixtures.py` file in the same directory as the test, not inline in the test class. This keeps tests clean and fixtures reusable.

**fixtures.py:**

```python
my_model_sql = """
{{ config(materialized='incremental', unique_key='id') }}
select 1 as id, 'test' as name
"""

my_schema_yml = """
version: 2
models:
  - name: my_model
    columns:
      - name: id
        description: 'ID column'
"""
```

**test_my_feature.py:**

```python
import pytest

from dbt.tests import util
from tests.functional.adapter.my_feature import fixtures


class TestIncrementalModel:
    @pytest.fixture(scope="class")
    def models(self):
        return {
            "my_model.sql": fixtures.my_model_sql,
            "schema.yml": fixtures.my_schema_yml,
        }

    def test_incremental_run(self, project):
        ...
```
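The fixtures-module pattern can be illustrated without dbt: because the SQL and YAML live as plain module-level strings, any number of test classes can assemble them into their `models` mapping. The names below mirror the example above and are purely illustrative:

```python
# Stand-ins for the module-level strings in a fixtures.py.
my_model_sql = """
{{ config(materialized='incremental', unique_key='id') }}
select 1 as id, 'test' as name
"""

my_schema_yml = """
version: 2
models:
  - name: my_model
"""

# Two hypothetical test suites reusing the same fixture strings.
incremental_models = {"my_model.sql": my_model_sql, "schema.yml": my_schema_yml}
smoke_models = {"my_model.sql": my_model_sql}  # a second suite reuses only the SQL

# The fixture strings are shared objects, not copies.
shared = sorted(set(incremental_models) & set(smoke_models))
print(shared)  # → ['my_model.sql']
```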

### Key Components

#### DBR Capability System (`dbr_capabilities.py`)

- **Purpose**: Centralized management of DBR version-dependent features
- **Key Features**:
  - Per-compute caching (different clusters can have different capabilities)
  - Named capabilities instead of magic version numbers
  - Automatic detection of DBR version and SQL warehouse environments
- **Supported Capabilities**:
  - `TIMESTAMPDIFF` (DBR 10.4+): Advanced date/time functions
  - `INSERT_BY_NAME` (DBR 12.2+): Name-based column matching in INSERT
  - `ICEBERG` (DBR 14.3+): Apache Iceberg table format
  - `COMMENT_ON_COLUMN` (DBR 16.1+): Modern column comment syntax
  - `JSON_COLUMN_METADATA` (DBR 16.2+): Efficient metadata retrieval
- **Usage in Code**:

```python
# In Python code
if adapter.has_capability(DBRCapability.ICEBERG):
    ...  # Use Iceberg features
```

```jinja
{# In Jinja macros #}
{% if adapter.has_dbr_capability('comment_on_column') %}
    COMMENT ON COLUMN ...
{% else %}
    ALTER TABLE ... ALTER COLUMN ...
{% endif %}

{% if adapter.has_dbr_capability('insert_by_name') %}
    INSERT INTO table BY NAME SELECT ...
{% else %}
    INSERT INTO table SELECT ...  -- positional
{% endif %}
```

- **Adding New Capabilities**:
  1. Add to the `DBRCapability` enum
  2. Add a `CapabilitySpec` with version requirements
  3. Use `has_capability()` or `require_capability()` in code
- **Important**: Each compute resource (identified by `http_path`) maintains its own capability cache

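To make the per-compute caching concrete, here is a minimal, self-contained sketch of such a registry. The enum names mirror the capability list above, but the dataclass shape, method names, version thresholds' representation, and the `http_path` values are illustrative, not the adapter's actual implementation:

```python
from dataclasses import dataclass
from enum import Enum

# Capability names mirror the list above.
class DBRCapability(Enum):
    TIMESTAMPDIFF = "timestampdiff"
    INSERT_BY_NAME = "insert_by_name"
    ICEBERG = "iceberg"
    COMMENT_ON_COLUMN = "comment_on_column"
    JSON_COLUMN_METADATA = "json_column_metadata"

@dataclass(frozen=True)
class CapabilitySpec:
    capability: DBRCapability
    min_version: tuple  # minimum (major, minor) DBR version

SPECS = {
    spec.capability: spec
    for spec in [
        CapabilitySpec(DBRCapability.TIMESTAMPDIFF, (10, 4)),
        CapabilitySpec(DBRCapability.INSERT_BY_NAME, (12, 2)),
        CapabilitySpec(DBRCapability.ICEBERG, (14, 3)),
        CapabilitySpec(DBRCapability.COMMENT_ON_COLUMN, (16, 1)),
        CapabilitySpec(DBRCapability.JSON_COLUMN_METADATA, (16, 2)),
    ]
}

class CapabilityRegistry:
    """Per-compute cache: each http_path gets its own detected DBR version."""

    def __init__(self):
        self._versions = {}  # http_path -> (major, minor)

    def set_version(self, http_path, version):
        # In the adapter this would come from automatic version detection.
        self._versions[http_path] = version

    def has_capability(self, http_path, cap):
        # Tuple comparison gives correct (major, minor) ordering, e.g. (16, 2) >= (16, 1).
        return self._versions[http_path] >= SPECS[cap].min_version

registry = CapabilityRegistry()
registry.set_version("/sql/1.0/warehouses/abc", (16, 2))       # hypothetical warehouse
registry.set_version("/sql/protocolv1/clusters/xyz", (12, 2))  # hypothetical cluster
```

Tuple comparison is why versions are stored as `(major, minor)` pairs rather than strings: `(12, 2) >= (10, 4)` is true, while string comparison of `"12.2"` and `"10.4"` would also work here but breaks for versions like `"9.1"` vs `"10.4"`.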
#### Connection Management (`connections.py`)

- Extends the Spark connection manager for Databricks
- Override Spark macros with Databricks-specific logic
- Handle materializations (table, view, incremental, snapshot)
- Implement Databricks features (liquid clustering, column masks, tags)
- **Important**: To override a `spark__macro_name` macro, create `databricks__macro_name` (NOT `spark__macro_name`)

#### Multi-Statement SQL Execution

When a macro needs to execute multiple SQL statements (e.g., DELETE followed by INSERT), use the `execute_multiple_statements` helper:

**Pattern for Multi-Statement Strategies:**
```jinja
{% macro my_multi_statement_strategy(args) %}
  {%- set statements = [] -%}

  {#-- Build first statement --#}
  {%- set statement1 -%}
    DELETE FROM {{ target_relation }}
    WHERE some_condition
  {%- endset -%}
  {%- do statements.append(statement1) -%}

  {#-- Build second statement --#}
  {%- set statement2 -%}
    INSERT INTO {{ target_relation }}
    SELECT * FROM {{ source_relation }}
  {%- endset -%}
  {%- do statements.append(statement2) -%}

  {{- return(statements) -}}
{% endmacro %}
```

**How It Works:**

- Return a **list of SQL strings** from your strategy macro
- The incremental materialization automatically detects lists and calls `execute_multiple_statements()`
- Each statement executes separately via `{% call statement('main') %}`
- Used by: the `delete+insert` incremental strategy (DBR < 17.1 fallback), materialized views, streaming tables

**Note:** The Databricks SQL connector does NOT support semicolon-separated statements in a single execute call. Always return a list.

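The list-vs-string dispatch described above can be mimicked in a few lines of Python. This is a hypothetical sketch with a stub cursor, not the adapter's actual code:

```python
def execute_multiple_statements(cursor, sql):
    """Accept one SQL string or a list of them; issue one execute call per statement."""
    statements = sql if isinstance(sql, list) else [sql]
    for statement in statements:
        cursor.execute(statement)  # never semicolon-joined: one round-trip each
    return len(statements)

class StubCursor:
    """Records what would be sent to the SQL connector."""
    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement.strip())

cursor = StubCursor()
n = execute_multiple_statements(
    cursor,
    ["DELETE FROM t WHERE cond", "INSERT INTO t SELECT * FROM s"],
)
print(n)  # → 2
```

A plain string still works (`execute_multiple_statements(cursor, "SELECT 1")` issues one call), which is why the materialization can treat single- and multi-statement strategies uniformly.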

### Configuration System

- **Development**: `docs/dbt-databricks-dev.md` - Setup and workflow
- **Testing**: `docs/testing.md` - Comprehensive testing guide
- **DBR Capabilities**: `docs/dbr-capability-system.md` - Version-dependent features
- **Contributing**: `CONTRIBUTING.MD` - Code standards and PR process
- **User Docs**: [docs.getdbt.com](https://docs.getdbt.com/reference/resource-configs/databricks-configs)

3. **SQL Generation**: Prefer macros over Python string manipulation
4. **Testing**: Write both unit and functional tests for new features
5. **Configuration**: Use dataclasses with validation for new config options
6. **Imports**: Always import at the top of the file, never use local imports within functions or methods
7. **Version Checks**: Use the capability system instead of direct version comparisons:
   - ❌ `if adapter.compare_dbr_version(16, 1) >= 0:`
   - ✅ `if adapter.has_capability(DBRCapability.COMMENT_ON_COLUMN):`
   - ✅ `{% if adapter.has_dbr_capability('comment_on_column') %}`

## 🚨 Common Pitfalls for Agents

6. **Follow SQL normalization** in test assertions with `assert_sql_equal()`
7. **Handle Unity Catalog vs HMS differences** in feature implementations
8. **Consider backward compatibility** when modifying existing behavior
9. **Use the capability system for version checks** - never add new `compare_dbr_version()` calls
10. **Remember per-compute caching** - different clusters may have different capabilities in the same run
11. **Multi-statement SQL**: Don't use semicolons to separate statements - return a list instead and let `execute_multiple_statements()` handle it

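Pitfall 6's `assert_sql_equal()` lives in this repo's test utilities; the normalization idea behind it can be sketched in plain Python. The collapse-whitespace rule below is illustrative, not the helper's exact behavior:

```python
import re

def normalize_sql(sql):
    """Collapse runs of whitespace so formatting differences don't fail assertions."""
    return re.sub(r"\s+", " ", sql).strip()

def assert_sql_equal_sketch(actual, expected):
    assert normalize_sql(actual) == normalize_sql(expected), (
        f"SQL mismatch:\n  {normalize_sql(actual)}\n  {normalize_sql(expected)}"
    )

# Differently formatted but equivalent statements compare equal after normalization.
assert_sql_equal_sketch(
    "SELECT id,\n       name\nFROM   my_table",
    "SELECT id, name FROM my_table",
)
```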
## 🎯 Success Metrics

When working on this codebase, ensure:

- [ ] All tests pass (`hatch run code-quality && hatch run unit`)
- [ ] **CRITICAL: Run affected functional tests before declaring success**
  - If you modified connection/capability logic: run tests that use multiple computes or check capabilities
  - If you modified incremental materializations: run `tests/functional/adapter/incremental/`
  - If you modified Python models: run `tests/functional/adapter/python_model/`
  - If you modified macros: run tests that use those macros
  - **NEVER declare "mission accomplished" without running functional tests for affected features**
- [ ] New features have both unit and functional tests
- [ ] SQL generation follows Databricks best practices
- [ ] Changes maintain backward compatibility