
Commit 783928f

Ele 4724 dbt fusion support (#825)
* remove annoying warning message about the materialization flag
* initial changes to support dbt-fusion!
* temp - use the CLI branch by default
* force CLI ref
* TO DROP: from fusion cli branch
* avoid updating internal dbt field
* use documented way to quote by config
* agate_to_dicts: hacky way to handle decimals
* create get_test_model macro instead of context["model"] usages
* replace dbt.create_table_as with our own implementation
* remove hack that's not working anyway
* create_table_as - bugfix for postgres
* strip ; from the end of queries so our comment won't cause an error in dbt-fusion (see the sketch after this list)
* dbt-fusion fixes
* test-warehouse: add dbt fusion
* bugfix
* fix attempt
* use dbt-fusion for deps and debug
* add exec shell after installing dbt fusion
* another attempt to make dbtf work in CI
* use direct dbt fusion path
* write profiles before installing deps
* use different profiles secret for fusion
* change packages.yml in tests to point to an absolute path
* dbt-fusion fixes
* test
* debug
* test
* oops
* comment out packages edit in conftest
* delete package lock in workflow
* replace symlink instead
* bugfix
* handle_test_result - treat "error" as "fail" in dbt-fusion
* re-comment out edit
* bugfix + comment again
* various fixes
* handle_test_results - normalize test status
* handle_tests_results - bugfix
* use the dbt version instead of another flag in the CI workflows
* groups bugfix
* workflow bugfix
* add -s to py.test
* remove parallelism
* bugfix
* write_yaml: unlink on failure as well
* source freshness test fix
* more test fixes
* test_disable_samples_config: fix
* where parameter logic fix
* add marker for skip_for_dbt_fusion
* fix test_meta_tags_and_accepted_values
* skip more tests on fusion
* test_warehouse: dbt version in branch name fix
* run tests with fusion cli branch for now
* add fusion for other supported adapters
* force using CLI branch
* temporarily set pull_request
* update supported fusion targets
* bugfix to branch name
* agate_to_dicts: remove unused and broken column_types reference
* replace quoting logic with simple replace - following original PR's intent
* pii bugfix
* test-all-warehouses: don't run on all 1.8.0 on worker_dispatch
* return pull_request_target
* dbt invocations fix for fusion
* test-warehouse - run against the canary dbt fusion version
* redshift: use merge incremental strategy by default
* create_table_as: bugfixes for BQ and Redshift
* create_intermediate_relation - bugfix for fusion
* test-all-warehouses: remove elementary ref
* create_table_as: use default dbt implementation except for cases we know are problematic
* handle_test_results: bugfix
* get_model_relation_for_test: bugfix
* agate_to_dicts: improve decimal serialization
* remove canary build
* normalize_data_type: do case-insensitive search
* add implementation for edr_get_create_table_as_sql on databricks
* workaround for bq seeds in dbt fusion
* test-warehouse: pin on version 30 for now
* fix test
* bugfix to fix_seed_if_needed
* get_user_creation_query: update not to use continue as it's not supported in fusion
* remove fusion version limitation
* fix var name
* agate_to_dict -> dbt_object_to_dict
* upload_dbt_models: fix checksum
* remove package lock
* ignore package-lock in the root folder
* rename function
* fix selector regex to also support =
* test-warehouse: remove wrong awk use
* bugfix
* bugfix
* bugfix - use correct flatten macro for test parent nodes
* skip another exposures validity test in dbt fusion
* bugfix
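The semicolon-stripping item above is roughly the following idea, shown as a minimal illustrative Jinja sketch; the helper name is hypothetical and the package's actual macro may be named and structured differently:

{# Hypothetical helper, for illustration only: dbt-fusion errors out when a comment
   is appended after a trailing ";", so the trailing semicolon is trimmed first. #}
{% macro strip_trailing_semicolon(query) %}
  {% do return(query.strip().rstrip(';')) %}
{% endmacro %}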
1 parent 86719d7 commit 783928f

74 files changed: +638 −278 lines

Some content is hidden by default for large commits, so not every changed file (or file name) appears in the diffs below.

.github/workflows/test-all-warehouses.yml

Lines changed: 8 additions & 1 deletion
@@ -37,7 +37,6 @@ jobs:
       matrix:
         dbt-version:
           ${{ inputs.dbt-version && fromJSON(format('["{0}"]', inputs.dbt-version)) ||
-          ! contains(github.event_name, 'pull_request') && fromJSON('["1.8.0", "latest_official"]') ||
           fromJSON('["latest_official"]') }}
         warehouse-type:
           [
@@ -57,6 +56,14 @@ jobs:
           warehouse-type: postgres
         - dbt-version: "${{ inputs.dbt-version || 'latest_pre' }}"
           warehouse-type: postgres
+        - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          warehouse-type: snowflake
+        - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          warehouse-type: bigquery
+        - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          warehouse-type: redshift
+        - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          warehouse-type: databricks_catalog
     uses: ./.github/workflows/test-warehouse.yml
     with:
       warehouse-type: ${{ matrix.warehouse-type }}

.github/workflows/test-warehouse.yml

Lines changed: 17 additions & 10 deletions
@@ -124,37 +124,44 @@ jobs:
         run: pip install databricks-sql-connector==2.9.3

       - name: Install dbt
+        if: ${{ inputs.dbt-version != 'fusion' }}
         run:
           pip install${{ (inputs.dbt-version == 'latest_pre' && ' --pre') || '' }}
           "dbt-core${{ (!startsWith(inputs.dbt-version, 'latest') && format('=={0}', inputs.dbt-version)) || '' }}"
           "dbt-${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || (inputs.warehouse-type == 'spark' && 'spark[PyHive]') || (inputs.warehouse-type == 'athena' && 'athena-community') || inputs.warehouse-type }}${{ (!startsWith(inputs.dbt-version, 'latest') && format('~={0}', inputs.dbt-version)) || '' }}"

+      - name: Install dbt-fusion
+        if: inputs.dbt-version == 'fusion'
+        run: |
+          curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh | sh -s -- --update
+
       - name: Install Elementary
         run: pip install "./elementary[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || inputs.warehouse-type }}]"

-      - name: Install dependencies
-        working-directory: ${{ env.TESTS_DIR }}
-        run: |
-          dbt deps --project-dir dbt_project
-          pip install -r requirements.txt
-
       - name: Write dbt profiles
         env:
-          PROFILES_YML: ${{ secrets.CI_PROFILES_YML }}
+          PROFILES_YML: ${{ (inputs.dbt-version == 'fusion' && secrets.CI_PROFILES_YML_FUSION) || secrets.CI_PROFILES_YML }}
         run: |
           mkdir -p ~/.dbt
-          DBT_VERSION=$(pip show dbt-core | grep -i version | awk '{print $2}' | sed 's/\.//g')
+          DBT_VERSION=$(echo "${{ inputs.dbt-version }}" | sed 's/\.//g')
           UNDERSCORED_REF_NAME=$(echo "${{ inputs.warehouse-type }}_dbt_${DBT_VERSION}_${BRANCH_NAME}" | awk '{print tolower($0)}' | head -c 40 | sed "s/[-\/]/_/g")
           echo "$PROFILES_YML" | base64 -d | sed "s/<SCHEMA_NAME>/dbt_pkg_$UNDERSCORED_REF_NAME/g" > ~/.dbt/profiles.yml

+      - name: Install dependencies
+        working-directory: ${{ env.TESTS_DIR }}
+        run: |
+          ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
+          ln -sfn ${{ github.workspace }}/dbt-data-reliability dbt_project/dbt_packages/elementary
+          pip install -r requirements.txt
+
       - name: Check DWH connection
         working-directory: ${{ env.TESTS_DIR }}
         run: |
-          dbt debug -t "${{ inputs.warehouse-type }}"
+          ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"

       - name: Test
         working-directory: "${{ env.TESTS_DIR }}/tests"
-        run: py.test -n8 -vvv --target "${{ inputs.warehouse-type }}" --junit-xml=test-results.xml --html=detailed_report_${{ inputs.warehouse-type }}_dbt_${{ inputs.dbt-version }}.html --self-contained-html --clear-on-end
+        run: py.test -n8 -vvv --target "${{ inputs.warehouse-type }}" --junit-xml=test-results.xml --html=detailed_report_${{ inputs.warehouse-type }}_dbt_${{ inputs.dbt-version }}.html --self-contained-html --clear-on-end ${{ (inputs.dbt-version == 'fusion' && '--runner-method fusion') || '' }}

       - name: Upload test results
         if: always()

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -25,3 +25,6 @@ __pycache__/

 # vscode
 .vscode/
+dbt_internal_packages/
+
+/package-lock.yml

(file name hidden)

Lines changed: 1 addition & 0 deletions
@@ -1,2 +1,3 @@
 data
 models/tmp
+dbt_internal_packages/

integration_tests/dbt_project/dbt_project.yml

Lines changed: 0 additions & 1 deletion
@@ -18,7 +18,6 @@ clean-targets: # directories to be removed by `dbt clean`

 vars:
   debug_logs: "{{ env_var('DBT_EDR_DEBUG', False) }}"
-  mute_ensure_materialization_override: true

 models:
   elementary_tests:

integration_tests/dbt_project/macros/create_all_types_table.sql

Lines changed: 4 additions & 12 deletions
@@ -31,9 +31,7 @@
     CURRENT_TIME() as time_col,
     CURRENT_TIMESTAMP() as timestamp_col,
   {% endset %}
-  {% set create_table_query = dbt.create_table_as(false, relation, sql_query) %}
-  {% do elementary.edr_log(create_table_query) %}
-  {% do elementary.run_query(create_table_query) %}
+  {% do elementary.edr_create_table_as(false, relation, sql_query) %}
 {% endmacro %}

 {% macro snowflake__create_all_types_table() %}
@@ -81,9 +79,7 @@
     [1,2,3] as array_col,
     TO_GEOGRAPHY('POINT(-122.35 37.55)') as geography_col
   {% endset %}
-  {% set create_table_query = dbt.create_table_as(false, relation, sql_query) %}
-  {% do elementary.edr_log(create_table_query) %}
-  {% do elementary.run_query(create_table_query) %}
+  {% do elementary.edr_create_table_as(false, relation, sql_query) %}
 {% endmacro %}

 {% macro redshift__create_all_types_table() %}
@@ -123,9 +119,7 @@
     ST_GeogFromText('SRID=4324;POLYGON((0 0,0 1,1 1,10 10,1 0,0 0))') as geography_col,
     JSON_PARSE('{"data_type": "super"}') as super_col
   {% endset %}
-  {% set create_table_query = dbt.create_table_as(false, relation, sql_query) %}
-  {% do elementary.edr_log(create_table_query) %}
-  {% do elementary.run_query(create_table_query) %}
+  {% do elementary.edr_create_table_as(false, relation, sql_query) %}

 {% endmacro %}

@@ -184,9 +178,7 @@
     'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid as uuid_col,
     xmlcomment('text') as xml_col
   {% endset %}
-  {% set create_table_query = dbt.create_table_as(false, relation, sql_query) %}
-  {% do elementary.edr_log(create_table_query) %}
-  {% do elementary.run_query(create_table_query) %}
+  {% do elementary.edr_create_table_as(false, relation, sql_query) %}
 {% endmacro %}

 {% macro default__create_all_types_table() %}
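The elementary.edr_create_table_as macro that these call sites now use is not part of this diff. Judging only by the three lines it replaces, its default path is presumably along these lines (an illustrative sketch, not the package's actual implementation; per the commit message, the real macro keeps dbt's builtin except for adapter cases known to be problematic under dbt-fusion):

{# Illustrative sketch only - the real macro lives in the elementary package
   and dispatches adapter-specific implementations. #}
{% macro edr_create_table_as(temporary, relation, sql_query) %}
  {% set create_table_query = dbt.create_table_as(temporary, relation, sql_query) %}
  {% do elementary.edr_log(create_table_query) %}
  {% do elementary.run_query(create_table_query) %}
{% endmacro %}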

(file name hidden)

Lines changed: 5 additions & 5 deletions
@@ -1,19 +1,19 @@
 {% materialization test, default %}
   {% if var('enable_elementary_test_materialization', false) %}
-    {% do return(elementary.materialization_test_default.call_macro()) %}
+    {% do return(elementary.materialization_test_default()) %}
   {% else %}
-    {% do return(dbt.materialization_test_default.call_macro()) %}
+    {% do return(dbt.materialization_test_default()) %}
   {% endif %}
 {% endmaterialization %}

 {% materialization test, adapter="snowflake" %}
   {% if var('enable_elementary_test_materialization', false) %}
-    {% do return(elementary.materialization_test_snowflake.call_macro()) %}
+    {% do return(elementary.materialization_test_snowflake()) %}
   {% else %}
     {% if dbt.materialization_test_snowflake %}
-      {% do return(dbt.materialization_test_snowflake.call_macro()) %}
+      {% do return(dbt.materialization_test_snowflake()) %}
     {% else %}
-      {% do return(dbt.materialization_test_default.call_macro()) %}
+      {% do return(dbt.materialization_test_default()) %}
     {% endif %}
   {% endif %}
 {% endmaterialization %}

(file name hidden)

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{% macro replace_empty_strings_with_nulls(table_name) %}
+  {% set relation = ref(table_name) %}
+  {% set columns = adapter.get_columns_in_relation(relation) %}
+
+  {% for col in columns %}
+    {% set data_type = elementary.get_column_data_type(col) %}
+    {% set normalized_data_type = elementary.normalize_data_type(data_type) %}
+
+    {% if normalized_data_type == "string" %}
+      {% set update_query %}
+        update {{ relation }}
+        set {{ col["name"] }} = NULL
+        where {{ col["name"] }} = ''
+      {% endset %}
+      {% do elementary.run_query(update_query) %}
+    {% endif %}
+  {% endfor %}
+{% endmacro %}
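This test helper appears related to the seed workaround mentioned in the commit message (dbt-fusion apparently loading empty strings where NULLs are expected, e.g. for BigQuery seeds). How it is wired into the test suite is not shown in this diff; purely as an illustration, a call site could look like:

{# Hypothetical call site, for illustration only; the project's real fix_seed_if_needed
   helper mentioned in the commit message is not shown in this diff and may live elsewhere. #}
{% macro prepare_test_seed(seed_name) %}
  {% do replace_empty_strings_with_nulls(seed_name) %}
{% endmacro %}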

integration_tests/dbt_project/models/exposures.yml

Lines changed: 12 additions & 10 deletions
@@ -15,11 +15,12 @@ exposures:
     owner:
       name: Callum McData

-    meta:
-      referenced_columns:
-        - column_name: id
-          data_type: numeric
-          node: ref('customers')
+    config:
+      meta:
+        referenced_columns:
+          - column_name: id
+            data_type: numeric
+            node: ref('customers')

   - name: orders
     label: Returned Orders
@@ -35,8 +36,9 @@
     owner:
       name: Callum McData

-    meta:
-      referenced_columns:
-        - column_name: "order_id"
-          data_type: "string"
-        - column_name: "ZOMG"
+    config:
+      meta:
+        referenced_columns:
+          - column_name: "order_id"
+            data_type: "string"
+          - column_name: "ZOMG"

integration_tests/dbt_project/models/schema.yml

Lines changed: 4 additions & 2 deletions
@@ -5,7 +5,8 @@ models:
     description: This table has basic information about a customer, as well as some derived facts based on a customer's orders
     tests:
      - elementary.exposure_schema_validity:
-          tags: [exposure_customers]
+          config:
+            tags: [exposure_customers]

     columns:
       - name: id
@@ -20,7 +21,8 @@

     tests:
      - elementary.exposure_schema_validity:
-          tags: [exposure_orders]
+          config:
+            tags: [exposure_orders]

     columns:
       - name: order_id
