Ele 4724 dbt fusion support #825
Conversation
Walkthrough
Adds dbt‑Fusion support across CI and the test harness, centralizes CREATE TABLE AS via new edr_create_table_as macros, replaces direct use of context["model"] with a Fusion-aware test accessor, introduces Fusion-aware serialization/normalization helpers, and applies many test skips/config adjustments for Fusion runs.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor GH as GitHub Actions
participant WF as CI Workflow
participant Inst as Fusion Installer
participant DBT as dbt CLI
GH->>WF: start job (matrix: dbt-version, warehouse)
alt dbt-version == "fusion"
WF->>Inst: install dbt‑fusion
WF->>WF: PROFILES_YML = CI_PROFILES_YML_FUSION
WF->>DBT: use ~/.local/bin/dbt for deps/debug
WF->>DBT: run pytest with --runner-method fusion
else
WF->>WF: install standard dbt
WF->>WF: PROFILES_YML = CI_PROFILES_YML
WF->>DBT: use standard dbt binary
WF->>DBT: run pytest (no runner-method)
end
sequenceDiagram
autonumber
participant Caller as Macro caller
participant EDR as edr_create_table_as
participant SQL as edr_get_create_table_as_sql
participant Adapter as Adapter builder
participant Exec as elementary.run_query
participant Cmt as adapter.commit
Caller->>EDR: edr_create_table_as(temporary, relation, sql_query, drop_first?, should_commit?)
alt drop_first == true
EDR->>Adapter: drop relation if exists
end
EDR->>SQL: request adapter-specific CREATE TABLE AS SQL
SQL->>Adapter: build statement
Adapter-->>EDR: return SQL statement
EDR->>Exec: elementary.run_query(statement)
alt should_commit == true
EDR->>Cmt: adapter.commit()
end
EDR-->>Caller: return relation
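To make the second diagram concrete, here is a minimal sketch of what such a wrapper macro could look like. It is derived only from the flow above; the internals of `edr_get_create_table_as_sql`, the exact argument order, and the drop/commit details are assumptions, not the PR's actual implementation.

```sql
{% macro edr_create_table_as(temporary, relation, sql_query, drop_first=false, should_commit=false) %}
  {# Sketch only: optionally drop an existing relation before recreating it #}
  {% if drop_first %}
    {% do adapter.drop_relation(relation) %}
  {% endif %}

  {# Build the adapter-specific CREATE TABLE AS statement and run it #}
  {% set create_table_sql = elementary.edr_get_create_table_as_sql(temporary, relation, sql_query) %}
  {% do elementary.run_query(create_table_sql) %}

  {# Commit only when explicitly requested; not all adapters support explicit transactions #}
  {% if should_commit %}
    {% do adapter.commit() %}
  {% endif %}

  {% do return(relation) %}
{% endmacro %}
```

Note that a later review comment below argues for `dbt.commit()` over `adapter.commit()` on adapters without explicit transaction support.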
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
👋 @haritamar |
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
.github/workflows/test-warehouse.yml (2 hunks), .gitignore (1 hunks), integration_tests/dbt_project/.gitignore (1 hunks), integration_tests/dbt_project/dbt_project.yml (0 hunks), integration_tests/dbt_project/macros/materializations.sql (1 hunks), integration_tests/dbt_project/models/exposures.yml (2 hunks), integration_tests/dbt_project/packages.yml (1 hunks), integration_tests/deprecated_tests/dbt_project.yml (0 hunks), integration_tests/deprecated_tests/macros/system/materializations.sql (1 hunks), integration_tests/tests/conftest.py (5 hunks), integration_tests/tests/dbt_project.py (2 hunks), integration_tests/tests/env.py (1 hunks), macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1 hunks), macros/edr/system/hooks/on_run_start.sql (0 hunks), macros/edr/system/system_utils/get_config_var.sql (0 hunks), macros/edr/system/system_utils/is_dbt_fusion.sql (1 hunks), macros/edr/tests/on_run_start/ensure_materialize_override.sql (0 hunks), macros/utils/graph/get_run_result_dict.sql (2 hunks), macros/utils/run_queries/agate_to_dict.sql (1 hunks), macros/utils/run_queries/agate_to_dicts.sql (1 hunks), models/run_results.yml (1 hunks), package-lock.yml (1 hunks)
💤 Files with no reviewable changes (5)
- integration_tests/dbt_project/dbt_project.yml
- macros/edr/system/hooks/on_run_start.sql
- integration_tests/deprecated_tests/dbt_project.yml
- macros/edr/system/system_utils/get_config_var.sql
- macros/edr/tests/on_run_start/ensure_materialize_override.sql
🧰 Additional context used
🧬 Code Graph Analysis (2)
integration_tests/tests/env.py (2)
integration_tests/tests/conftest.py (3)
dbt_project (132-135), target (139-140), runner_method (154-164)
integration_tests/tests/dbt_project.py (1)
get_dbt_runner(33-42)
integration_tests/tests/dbt_project.py (1)
integration_tests/tests/conftest.py (2)
target (139-140), runner_method (154-164)
🔇 Additional comments (24)
.gitignore (1)
27-27: Addition looks good
Ignoring `dbt_internal_packages/` is consistent with dbt-managed dependencies and keeps the repo clean.
integration_tests/dbt_project/.gitignore (1)
3-3: Consistent ignore pattern
Same note as for the root `.gitignore` – the addition is correct and harmless.
.github/workflows/test-warehouse.yml (1)
24-25: Temporary branch defaults should be removed before merge
Both inputs now default to the feature branch `ele-4724-dbt_fusion-support` and are tagged `# TEMPORARY`. If this hits `main`, future workflow runs will silently pin to a branch that will eventually be deleted, breaking CI.
Diff to restore the previous behaviour (empty defaults) once the feature is merged:
- default: ele-4724-dbt_fusion-support # TEMPORARY
+ # default intentionally left blank – callers should provide a ref
Apply the same change for the second occurrence.
Also applies to: 46-47
integration_tests/dbt_project/macros/materializations.sql (1)
3-6: Direct macro calls simplify the materialization – nice cleanup
Switching from `.call_macro()` to direct calls is clearer and avoids the extra invocation layer. Verified that the call signatures stay the same.
Also applies to: 11-17
integration_tests/deprecated_tests/macros/system/materializations.sql (1)
2-2: LGTM: Simplified macro invocation pattern
The removal of `.call_macro()` wrappers standardizes the macro calling convention and makes the code more direct and readable.
Also applies to: 6-6
macros/edr/system/system_utils/is_dbt_fusion.sql (1)
1-7: LGTM: Clean implementation for fusion detection
The logic correctly identifies dbt Fusion versions by checking if the major version is greater than 1. The implementation assumes `dbt_version` follows semantic versioning format, which is a reasonable assumption.
integration_tests/tests/env.py (1)
1-2: LGTM: Proper integration of runner_method parameter
The addition of the `runner_method` parameter with proper typing and a default value maintains backward compatibility while enabling the new Fusion runner functionality. The parameter is correctly passed through to the dbt runner initialization.
Also applies to: 8-16
integration_tests/dbt_project/packages.yml (1)
4-4: LGTM: Simplified version specification
The change from dynamic Jinja logic to a fixed version range `">=0.8.0", "<2.0.0"` simplifies the dependency specification while maintaining compatibility. The actual version pinning is handled by the package-lock.yml file for deterministic builds.
macros/utils/graph/get_run_result_dict.sql (3)
6-6: LGTM: Consistent with agate_to_dict standardization.
The replacement of `run_result.to_dict()` with `elementary.agate_to_dict(run_result)` aligns with the broader effort to standardize agate table conversions for dbt Fusion compatibility.
12-12: LGTM: Proper handling of timing item conversion.
The change to use `elementary.agate_to_dict(item)` for timing items is consistent with the standardization approach.
23-23: LGTM: Correct conditional conversion for node objects.
The conditional use of `elementary.agate_to_dict(run_result.node)` when the node exists maintains the existing logic while using the standardized conversion method.
integration_tests/tests/dbt_project.py (3)
11-11: LGTM: Proper import addition for RunnerMethod.
The import of `RunnerMethod` is correctly added to support the new runner method parameter.
33-42: LGTM: Well-structured function signature update.
The addition of the optional `runner_method` parameter to `get_dbt_runner` is properly typed and maintains backward compatibility. The parameter is correctly passed through to the `create_dbt_runner` call.
46-52: LGTM: Consistent constructor parameter propagation.
The `DbtProject` constructor properly accepts and propagates the `runner_method` parameter to the `get_dbt_runner` call, maintaining the parameter flow throughout the test infrastructure.
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1)
131-131: LGTM: Consistent agate_to_dict usage in get_all_vars.
The replacement of `config.vars.to_dict()` with `elementary.agate_to_dict(config.vars)` follows the standardized pattern for agate table conversions and ensures compatibility with dbt Fusion environments.
macros/utils/run_queries/agate_to_dicts.sql (3)
2-7: LGTM: Well-implemented conditional logic for dbt Fusion support.
The conditional logic properly handles the structural differences between dbt Fusion and regular dbt environments. In Fusion environments, `agate_table` is used directly, while in regular environments `agate_table.rows` is accessed. This approach ensures compatibility across different dbt runtime environments.
14-14: LGTM: Improved column value serialization.
The use of `elementary.agate_val_serialize(col_value)` provides better handling of column value serialization, particularly for date-like objects that need ISO 8601 formatting.
23-28: LGTM: Useful helper macro for value serialization.
The `agate_val_serialize` macro provides proper serialization for date-like objects by converting them to ISO 8601 format when the `year` attribute is defined. This is a clean and efficient approach to handle date serialization.
integration_tests/tests/conftest.py (6)
4-4: LGTM: Proper imports for new functionality.
The additions of `Optional`, `yaml`, `RunnerMethod`, and `DBT_FUSION_SUPPORTED_TARGETS` provide the necessary imports and constants for the enhanced test infrastructure.
Also applies to: 7-7, 10-10, 16-16
25-25: LGTM: Command-line option for runner method selection.
The addition of the `--runner-method` command-line option enables users to specify the runner method for tests, providing flexibility in test execution.
29-29: LGTM: Well-structured project setup modifications.
The `project_dir_copy` fixture properly handles the runner method parameter and calls helper functions to modify the test project setup. The integration is clean and maintains the existing fixture structure.
Also applies to: 38-41
47-76: LGTM: Comprehensive helper functions for project modification.
The helper functions `_edit_packages_yml_to_include_absolute_elementary_package_path` and `_remove_python_tests_for_dbt_fusion` are well-implemented:
- The packages.yml editing ensures proper package path resolution
- The Python test removal for dbt Fusion is appropriate since Fusion may not support Python models
- Both functions include proper logging and error handling
79-86: LGTM: Proper fixture parameter propagation.
The updates to `init_tests_env` and `dbt_project` fixtures correctly accept and propagate the `runner_method` parameter, ensuring consistent handling throughout the test infrastructure.
Also applies to: 132-135
153-164: LGTM: Robust validation for dbt Fusion target compatibility.
The `runner_method` fixture properly validates that dbt Fusion is only used with supported targets (snowflake, bigquery, databricks), raising a clear error message for unsupported combinations. This prevents runtime issues and provides clear feedback to users.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/workflows/test-all-warehouses.yml (1)
63-63: Add a `default` field to the input declaration for consistency
`elementary-ref` now has an inline default at the call-site, but its definition under both `workflow_dispatch` and `workflow_call` still lacks a `default:` key. This creates an implicit contract that isn’t visible to consumers or to workflow-linting tools. Adding an explicit default to the input block keeps the contract self-documented and prevents accidental divergence if someone copies the input list elsewhere.
  elementary-ref:
    type: string
    required: false
+   default: ele-4724-dbt_fusion-support
    description: Branch or tag to checkout for 'elementary' repository
{% set artifacts_hashes = {} %}
{% for artifacts_model, metadata_hashes in artifact_agate_hashes.items() %}
{% do artifacts_hashes.update({artifacts_model: metadata_hashes.columns["metadata_hash"]}) %}
{% set artifacts_hashes = elementary.agate_to_dicts(artifacts_hashes_results) %}
fusion's agate doesn't support group_by, so I implemented it differently
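To illustrate the kind of manual aggregation that replaces agate's `group_by`, here is a rough sketch; the macro name, column names, and exact return shape are assumptions for illustration, not the code merged in this PR.

```sql
{% macro group_metadata_hashes_by_model(result_rows) %}
  {# Sketch: emulate agate's group_by with a plain dict keyed by the artifacts model name #}
  {% set hashes_by_model = {} %}
  {% for row in result_rows %}
    {% set model_name = row["artifacts_model"] %}
    {% if model_name not in hashes_by_model %}
      {% do hashes_by_model.update({model_name: []}) %}
    {% endif %}
    {% do hashes_by_model[model_name].append(row["metadata_hash"]) %}
  {% endfor %}
  {% do return(hashes_by_model) %}
{% endmacro %}
```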
{% elif invocation_args_dict and invocation_args_dict.selector_name %}
{% do return(invocation_args_dict.selector_name) %}
{% elif invocation_args_dict and invocation_args_dict.INVOCATION_COMMAND %}
{% set match = modules.re.search(
selector is not available in fusion, so getting it with a regex from the command
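A minimal sketch of that regex fallback, assuming `invocation_args_dict.INVOCATION_COMMAND` holds the raw CLI string as in the excerpt above (the helper macro name is illustrative):

```sql
{% macro get_selector_from_invocation_command(invocation_command) %}
  {# Sketch: extract the value passed to --selector from the raw command line #}
  {% set match = modules.re.search("--selector\s+(\S+)", invocation_command) %}
  {% if match %}
    {% do return(match.group(1)) %}
  {% endif %}
  {% do return(none) %}
{% endmacro %}
```

A nitpick later in this review suggests extending the pattern to the `--selector=value` and quoted forms as well.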
{% set db_name_to_type = {} %}
{% for row in elementary.agate_to_dicts(elementary.run_query(dremio_databases_query)) %}
{% if row["database_name"] | lower not in configured_dbs %}
{% continue %}
Seems that newer versions of Fusion don't like "continue", so reimplemented the loop without them (there's a ticket about it though)
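As a sketch of that rewrite, the filter can be folded into the loop body instead of skipping rows with `{% continue %}`; the `database_type` column and the dict values below are assumptions used for illustration only.

```sql
{% set db_name_to_type = {} %}
{% for row in elementary.agate_to_dicts(elementary.run_query(dremio_databases_query)) %}
  {# Sketch: invert the condition rather than skipping unwanted rows with `continue` #}
  {% if row["database_name"] | lower in configured_dbs %}
    {% do db_name_to_type.update({row["database_name"] | lower: row["database_type"]}) %}
  {% endif %}
{% endfor %}
```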
{% endmacro %}

{%- macro redshift__get_default_incremental_strategy() %}
{% do return("merge") %}
There is something that broke with delete+insert strategy on redshift in fusion (that seemed to be the default in fusion for some reason) - and I didn't think it matters too much so just made it merge.
I forgot exactly what was the issue but it didn't seem related to our changes.
{% macro is_data_type_of_normalized_type(data_type, normalized_type) %}
{% set data_type_list = elementary.data_type_list(normalized_type) | map('lower') %}
{% do return(data_type | lower in data_type_list) %}
Made the normalization case-insensitive - as dbt-fusion seems to lower-case the types (at least in Snowflake)
{% for agate_row in rows.data %}
{% set serializable_row = {} %}
{% for col_name, col_value in agate_row.items() %}
{% set serializable_col_value = column_types[loop.index0].jsonify(col_value) %}
Fusion agate doesn't have jsonify - so implemented serialization differently.
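A rough sketch of per-value serialization without `jsonify`, in line with the `agate_val_serialize` helper described earlier in this review (the exact macro body in the PR may differ):

```sql
{% macro agate_val_serialize(col_value) %}
  {# Sketch: date-like objects expose a `year` attribute; render them as ISO 8601 strings #}
  {% if col_value is not none and col_value.year is defined %}
    {% do return(col_value.isoformat()) %}
  {% endif %}
  {% do return(col_value) %}
{% endmacro %}
```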
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
macros/edr/dbt_artifacts/upload_dbt_tests.sql (1)
14-45: Add compiled_code to dbt_tests empty table schema
compiled_code is written in upload_dbt_tests (line 180) but missing in get_dbt_tests_empty_table_query, causing a schema/dict mismatch. Add it after original_path:
macros/edr/dbt_artifacts/upload_dbt_tests.sql
 {% set dbt_tests_empty_table_query = elementary.empty_table([
-    ('original_path', 'long_string'),
+    ('original_path', 'long_string'),
+    ('compiled_code', 'long_string'),
     ('path', 'string'),
     ('generated_at', 'string'),
     ('metadata_hash', 'string'),
     ('quality_dimension', 'string'),
     ('group_name', 'string'),
 ]) %}
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (2)
88-100: Bug: SELECT uppercase path is unreachable
The outer condition requires invocation_args_dict.select, so the inner elif for invocation_args_dict.SELECT never runs. This breaks Fusion where only SELECT may be present.
Apply:
-{%- if invocation_args_dict and invocation_args_dict.select -%}
-  {%- if invocation_args_dict.select -%}
-    {{- return(invocation_args_dict.select) -}}
-  {%- elif invocation_args_dict.SELECT -%}
-    {{- return(invocation_args_dict.SELECT) -}}
-  {%- endif -%}
+{%- if invocation_args_dict -%}
+  {%- if invocation_args_dict.select -%}
+    {{- return(invocation_args_dict.select) -}}
+  {%- elif invocation_args_dict.SELECT -%}
+    {{- return(invocation_args_dict.SELECT) -}}
+  {%- endif -%}
 {%- elif config.args and config.args.select -%}
   {{- return(config.args.select) -}}
 {%- else -%}
   {{- return([]) -}}
 {%- endif -%}
126-140: VARS handling: parse consistently like vars
If invocation_args_dict.VARS is a string (YAML), it should be parsed; otherwise you return a raw string.
-{% elif invocation_args_dict and invocation_args_dict.VARS %}
-  {% set invocation_vars = invocation_args_dict.VARS %}
+{% elif invocation_args_dict and invocation_args_dict.VARS %}
+  {% if invocation_args_dict.VARS is mapping %}
+    {% set invocation_vars = invocation_args_dict.VARS %}
+  {% else %}
+    {% set invocation_vars = fromyaml(invocation_args_dict.VARS) %}
+  {% endif %}
♻️ Duplicate comments (1)
macros/utils/table_operations/create_table_as.sql (1)
14-16: Switch to `dbt.commit()` for adapter portability.
`adapter.commit()` still raises on engines without explicit transaction support (BigQuery, Spark family), so any call with `should_commit=true` will blow up there. `dbt.commit()` wraps the adapter capability check for us – please swap to it here.
- {% if should_commit %}
-   {% do adapter.commit() %}
- {% endif %}
+ {% if should_commit %}
+   {% do dbt.commit() %}
+ {% endif %}
🧹 Nitpick comments (10)
.gitignore (1)
28-28: Drop the duplicate ignore entry.
`dbt_internal_packages/` is already ignored at Line 3, so this repeat line is redundant. Please remove it to keep the list tidy.
integration_tests/tests/test_disable_samples_config.py (1)
141-164: Clarify the test's purpose in the docstring.
The docstring states "Test that disable_test_samples excludes only the disabled column", but the test actually verifies that `disable_test_samples: True` prevents all sampling (asserting `len(samples) == 0`). The `disable_test_samples` configuration is test-level, not column-level. Consider updating the docstring to accurately reflect the test's behavior:
- """Test that disable_test_samples excludes only the disabled column"""
+ """Test that disable_test_samples prevents all sampling when set to True"""
integration_tests/tests/conftest.py (1)
165-176: LGTM! Consider refactoring error message.
The fixture correctly validates the runner method and target compatibility. The static analysis tool suggests moving the long error message outside the exception class for better maintainability.
Apply this diff to address the Ruff hint:
+FUSION_UNSUPPORTED_TARGET_ERROR = "Fusion runner is not supported for target: {}"
+
 @pytest.fixture(scope="session")
 def runner_method(request, target: str) -> Optional[RunnerMethod]:
     runner_method_str = request.config.getoption("--runner-method")
     if runner_method_str:
         runner_method = RunnerMethod(runner_method_str)
         if (
             runner_method == RunnerMethod.FUSION
             and target not in DBT_FUSION_SUPPORTED_TARGETS
         ):
-            raise ValueError(f"Fusion runner is not supported for target: {target}")
+            raise ValueError(FUSION_UNSUPPORTED_TARGET_ERROR.format(target))
         return runner_method
     return None
.github/workflows/test-all-warehouses.yml (1)
59-66: Prevent duplicate matrix jobs; hardcode fusion in include.
Using "${{ inputs.dbt-version || 'fusion' }}" here will mirror any provided input version and duplicate jobs already produced by the main axis. If the goal is “always add Fusion runs,” set dbt-version to the literal fusion.
- - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+ - dbt-version: fusion
    warehouse-type: snowflake
- - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+ - dbt-version: fusion
    warehouse-type: bigquery
- - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+ - dbt-version: fusion
    warehouse-type: redshift
- - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+ - dbt-version: fusion
    warehouse-type: databricks_catalog
macros/edr/dbt_artifacts/upload_dbt_tests.sql (3)
61-63: Meta precedence: confirm intended override order (node meta > config meta)
You build unified_meta by updating with config_meta_dict, then meta_dict, so node meta overrides config meta. Confirm this is desired for tests in Fusion and classic. If config meta should take precedence, reverse the update order.
65-72: Avoid persisting description inside meta to prevent duplication
Currently, description is popped from unified_meta only when sourced from unified_meta. If dbt_version >= 1.9 and node_dict.description is used, any description lingering in unified_meta will still be written to meta, duplicating the value in two fields.
Pop description from unified_meta unconditionally after selecting the final description:
 {% set description = none %}
 {% if dbt_version >= '1.9.0' and node_dict.get('description') %}
   {% set description = node_dict.get('description') %}
-{% elif unified_meta.get('description') %}
-  {% set description = unified_meta.pop('description') %}
+{% elif unified_meta.get('description') %}
+  {% set description = unified_meta.pop('description') %}
 {% elif default_description %}
   {% set description = default_description %}
 {% endif %}
+{# Ensure meta doesn't redundantly store description #}
+{% if 'description' in unified_meta %}
+  {% do unified_meta.pop('description') %}
+{% endif %}
Also applies to: 68-69
76-79: Normalize meta.tags to a list before union
meta.tags might be provided as a string. Normalize to a list to keep union_lists robust.
- {% set meta_tags = elementary.safe_get_with_default(unified_meta, 'tags', []) %}
- {% set tags = elementary.union_lists(config_tags, global_tags) %}
- {% set tags = elementary.union_lists(tags, meta_tags) %}
+ {% set raw_meta_tags = elementary.safe_get_with_default(unified_meta, 'tags', []) %}
+ {% set meta_tags = [] %}
+ {% if raw_meta_tags %}
+   {% if raw_meta_tags is string %}
+     {% set meta_tags = [raw_meta_tags] %}
+   {% elif raw_meta_tags is sequence %}
+     {% set meta_tags = raw_meta_tags %}
+   {% endif %}
+ {% endif %}
+ {% set tags = elementary.union_lists(config_tags, global_tags) %}
+ {% set tags = elementary.union_lists(tags, meta_tags) %}
23-33: Snowflake semicolon strip is appropriate
Trimming and removing a trailing semicolon avoids Fusion issues. Consider applying similar trailing-semicolon stripping in the default formatter for consistency across adapters.
macros/utils/run_queries/agate_to_dicts.sql (1)
22-46: Decimal-to-float precision loss risk
Converting non-integer Decimals to float can lose precision. Prefer preserving as string (or keep Decimal for downstream) to avoid data drift in artifacts.
Apply this change to serialize non-integer decimals as strings:
 {% macro edr_serialize_decimal(val) %}
   {% set dec_tuple = val.normalize().as_tuple() %}
   {# A hacky way to standardize Decimals which are not JSON-serializable #}
   {% if dec_tuple[2] == 0 %}
     {% do return(val | string | int) %}
   {% else %}
-    {% do return(val | string | float) %}
+    {# preserve precision #}
+    {% do return(val | string) %}
   {% endif %}
 {% endmacro %}
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1)
109-118: Selector regex: support “--selector=val” and quoted values
Current pattern matches only “--selector ”. Support equals form and quotes.
-{% set match = modules.re.search(
-    "--selector\s+(\S+)",
-    invocation_args_dict.INVOCATION_COMMAND
-) %}
+{% set match = modules.re.search(
+    r"--selector(?:=|\s+)(?:['\"])?(?P<sel>[^'\s\"]+)",
+    invocation_args_dict.INVOCATION_COMMAND
+) %}
-{% if match %}
-  {% do return(match.group(1)) %}
+{% if match %}
+  {% do return(match.group('sel')) %}
 {% else %}
   {% do return(none) %}
 {% endif %}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (75)
.github/workflows/test-all-warehouses.yml (1 hunks), .github/workflows/test-warehouse.yml (1 hunks), .gitignore (1 hunks), integration_tests/dbt_project/.gitignore (1 hunks), integration_tests/dbt_project/dbt_project.yml (0 hunks), integration_tests/dbt_project/macros/create_all_types_table.sql (4 hunks), integration_tests/dbt_project/macros/materializations.sql (1 hunks), integration_tests/dbt_project/macros/replace_empty_strings_with_nulls.sql (1 hunks), integration_tests/dbt_project/models/exposures.yml (2 hunks), integration_tests/dbt_project/models/schema.yml (2 hunks), integration_tests/dbt_project/packages.yml (1 hunks), integration_tests/deprecated_tests/dbt_project.yml (0 hunks), integration_tests/deprecated_tests/macros/system/materializations.sql (1 hunks), integration_tests/tests/conftest.py (6 hunks), integration_tests/tests/dbt_project.py (8 hunks), integration_tests/tests/env.py (1 hunks), integration_tests/tests/pytest.ini (1 hunks), integration_tests/tests/test_all_columns_anomalies.py (3 hunks), integration_tests/tests/test_anomaly_test_configuration.py (1 hunks), integration_tests/tests/test_collect_metrics.py (1 hunks), integration_tests/tests/test_column_anomalies.py (1 hunks), integration_tests/tests/test_column_pii_sampling.py (1 hunks), integration_tests/tests/test_dbt_artifacts/test_artifacts.py (2 hunks), integration_tests/tests/test_dbt_artifacts/test_columns.py (1 hunks), integration_tests/tests/test_dbt_artifacts/test_groups.py (9 hunks), integration_tests/tests/test_dimension_anomalies.py (1 hunks), integration_tests/tests/test_disable_samples_config.py (6 hunks), integration_tests/tests/test_exposure_schema_validity.py (7 hunks), integration_tests/tests/test_failed_row_count.py (4 hunks), integration_tests/tests/test_jsonschema.py (1 hunks), integration_tests/tests/test_long_strings.py (3 hunks), integration_tests/tests/test_python.py (1 hunks), integration_tests/tests/test_volume_anomalies.py (1 hunks), macros/edr/dbt_artifacts/upload_dbt_artifacts.sql (1 hunks), macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks), macros/edr/dbt_artifacts/upload_dbt_models.sql (1 hunks), macros/edr/dbt_artifacts/upload_dbt_tests.sql (5 hunks), macros/edr/materializations/test/test.sql (1 hunks), macros/edr/system/configuration/is_elementary_enabled.sql (1 hunks), macros/edr/system/hooks/on_run_end.sql (1 hunks), macros/edr/system/hooks/on_run_start.sql (1 hunks), macros/edr/system/system_utils/get_config_var.sql (0 hunks), macros/edr/system/system_utils/get_pii_columns_from_parent_model.sql (1 hunks), macros/edr/system/system_utils/get_test_model.sql (1 hunks), macros/edr/system/system_utils/is_dbt_fusion.sql (1 hunks), macros/edr/tests/on_run_end/handle_tests_results.sql (4 hunks), macros/edr/tests/on_run_start/ensure_materialize_override.sql (0 hunks), macros/edr/tests/test_ai_data_validation.sql (1 hunks), macros/edr/tests/test_all_columns_anomalies.sql (2 hunks), macros/edr/tests/test_collect_metrics.sql (1 hunks), macros/edr/tests/test_column_anomalies.sql (2 hunks), macros/edr/tests/test_dimension_anomalies.sql (2 hunks), macros/edr/tests/test_event_freshness_anomalies.sql (1 hunks), macros/edr/tests/test_exposure_schema_validity.sql (1 hunks), macros/edr/tests/test_schema_changes.sql (2 hunks), macros/edr/tests/test_schema_changes_from_baseline.sql (2 hunks), macros/edr/tests/test_table_anomalies.sql (2 hunks), macros/edr/tests/test_utils/create_elementary_test_table.sql (1 hunks), macros/edr/tests/test_utils/get_model_relation_for_test.sql (1 hunks), macros/edr/tests/test_utils/validate_unique_metric_names.sql (1 hunks), macros/utils/cross_db_utils/get_user_creation_query.sql (1 hunks), macros/utils/cross_db_utils/incremental_strategy.sql (1 hunks), macros/utils/data_types/normalize_data_type.sql (1 hunks), macros/utils/graph/get_result_node.sql (1 hunks), macros/utils/graph/get_run_result_dict.sql (2 hunks), macros/utils/run_queries/agate_to_dict.sql (1 hunks), macros/utils/run_queries/agate_to_dicts.sql (1 hunks), macros/utils/run_queries/run_query.sql (1 hunks), macros/utils/table_operations/create_intermediate_relation.sql (1 hunks), macros/utils/table_operations/create_or_replace.sql (1 hunks), macros/utils/table_operations/create_table_as.sql (1 hunks), macros/utils/table_operations/create_table_like.sql (1 hunks), macros/utils/table_operations/create_temp_table.sql (1 hunks), macros/utils/table_operations/replace_table_data.sql (1 hunks), package-lock.yml (1 hunks)
💤 Files with no reviewable changes (4)
- integration_tests/dbt_project/dbt_project.yml
- macros/edr/tests/on_run_start/ensure_materialize_override.sql
- macros/edr/system/system_utils/get_config_var.sql
- integration_tests/deprecated_tests/dbt_project.yml
✅ Files skipped from review due to trivial changes (2)
- macros/edr/materializations/test/test.sql
- package-lock.yml
🚧 Files skipped from review as they are similar to previous changes (14)
- macros/edr/tests/test_utils/validate_unique_metric_names.sql
- macros/utils/table_operations/replace_table_data.sql
- macros/edr/tests/test_dimension_anomalies.sql
- integration_tests/dbt_project/packages.yml
- macros/edr/system/system_utils/is_dbt_fusion.sql
- integration_tests/tests/env.py
- macros/edr/system/system_utils/get_test_model.sql
- macros/utils/run_queries/agate_to_dict.sql
- macros/utils/table_operations/create_table_like.sql
- integration_tests/dbt_project/models/exposures.yml
- integration_tests/dbt_project/macros/create_all_types_table.sql
- macros/utils/table_operations/create_or_replace.sql
- macros/edr/tests/test_schema_changes_from_baseline.sql
- integration_tests/dbt_project/.gitignore
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
📚 Learning: 2025-09-04T09:14:40.621Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/edr/tests/test_dimension_anomalies.sql:73-75
Timestamp: 2025-09-04T09:14:40.621Z
Learning: In the dbt-data-reliability codebase, `flatten_model` is a valid macro defined in `macros/edr/dbt_artifacts/upload_dbt_models.sql` and can be used as a parameter to `elementary.get_anomaly_query()` even though other test files use the `flattened_test` variable instead.
Applied to files:
macros/edr/dbt_artifacts/upload_dbt_models.sql, macros/edr/tests/test_column_anomalies.sql, macros/edr/tests/test_table_anomalies.sql, macros/edr/tests/test_all_columns_anomalies.sql
📚 Learning: 2025-07-22T15:41:57.338Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
Applied to files:
macros/edr/system/hooks/on_run_end.sql, macros/utils/table_operations/create_table_as.sql
📚 Learning: 2025-07-15T19:28:35.548Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/run_queries/agate_to_dict.sql:1-7
Timestamp: 2025-07-15T19:28:35.548Z
Learning: In dbt-data-reliability, the `agate_to_dict` macro in `macros/utils/run_queries/agate_to_dict.sql` intentionally returns different types based on fusion/non-fusion mode to handle breaking changes in dbt Fusion. In fusion mode, it returns the original `agate_table`, while in non-fusion mode it returns `agate_table.to_dict()`. This is because the content/structure of `agate_table` is different between the two modes.
Applied to files:
macros/utils/run_queries/agate_to_dicts.sql
🧬 Code graph analysis (15)
integration_tests/tests/test_long_strings.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_python.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_dimension_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
test (110-128), test (131-149), test (151-257)
integration_tests/tests/test_dbt_artifacts/test_artifacts.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_all_columns_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
test (110-128), test (131-149), test (151-257)
integration_tests/tests/test_failed_row_count.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_exposure_schema_validity.py (2)
integration_tests/tests/conftest.py (3)
skip_for_dbt_fusion (123-126), dbt_project (144-147), target (151-152)
integration_tests/tests/dbt_project.py (4)
DbtProject (46-334), test (110-128), test (131-149), test (151-257)
integration_tests/tests/test_jsonschema.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_anomaly_test_configuration.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_collect_metrics.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/test_volume_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
test (110-128), test (131-149), test (151-257)
integration_tests/tests/test_column_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
test (110-128), test (131-149), test (151-257)
integration_tests/tests/dbt_project.py (1)
integration_tests/tests/conftest.py (2)
target (151-152), runner_method (166-176)
integration_tests/tests/test_dbt_artifacts/test_groups.py (1)
integration_tests/tests/conftest.py (1)
skip_for_dbt_fusion(123-126)
integration_tests/tests/conftest.py (2)
integration_tests/tests/dbt_project.py (1)
DbtProject (46-334)
integration_tests/tests/env.py (1)
Environment(7-26)
🪛 Ruff (0.13.2)
integration_tests/tests/conftest.py
174-174: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (latest_official, redshift) / test
🔇 Additional comments (66)
integration_tests/tests/test_column_pii_sampling.py (1)
373-373: Validate nested meta tags support or update test
I didn’t find any implementation references to `config.meta.tags`; ensure the code supports nested meta tags or update this test to use `config.tags`.
integration_tests/dbt_project/macros/materializations.sql (1)
1-20: LGTM! Simplified macro invocation pattern.
Replacing `.call_macro()` with direct function calls is a cleaner approach and aligns with Fusion compatibility requirements. The logic and control flow remain unchanged.
macros/edr/tests/test_utils/get_model_relation_for_test.sql (1)
7-15: LGTM! Fusion-aware early-exit path for single-dependency tests.
The new logic correctly checks for a single dependency in `test_node.depends_on.nodes`, retrieves the node, derives the relation, and returns it if found. This improves reliability and aligns with Fusion's model resolution needs. The fallback to the existing string-based logic is preserved.
integration_tests/tests/test_column_anomalies.py (1)
115-138: LGTM! Cleaner test configuration with explicit test_config.
The updates replace mutable parameter dictionaries with direct use of `DBT_TEST_ARGS` and introduce `test_config` for where-clause filtering. This approach is more explicit and aligns with the broader test infrastructure changes.
integration_tests/deprecated_tests/macros/system/materializations.sql (1)
1-8: LGTM! Consistent with updated invocation pattern.
Direct function calls replace `.call_macro()` invocations, matching the pattern applied elsewhere. The logic is unchanged and the approach is cleaner.
integration_tests/tests/test_volume_anomalies.py (1)
77-100: LGTM! Explicit test configuration aligns with updated test API.
The changes use `DBT_TEST_ARGS` directly and pass `test_config` for where-clause filtering, matching the pattern applied in related tests. This approach is cleaner and more maintainable.
integration_tests/tests/test_dimension_anomalies.py (1)
147-166: LGTM! Consistent test configuration pattern.
The updates use `DBT_TEST_ARGS` directly and introduce `test_config` for where-clause filtering, aligning with the pattern applied across the test suite. The logic is clear and correct.
.github/workflows/test-warehouse.yml (4)
133-137: LGTM! Fusion installation step is correct.
The remote installer invocation is standard for Fusion setup and aligns with the conditional logic guarding the step.
150-156: LGTM! Fusion-specific dependency handling is correct.
The conditional installation of dependencies, symlinking of the Elementary package, and requirements installation for Fusion are logically sound and correctly scoped.
160-160: LGTM! Fusion-specific dbt binary path is correct.
The conditional path selection (`~/.local/bin/dbt` for Fusion, `dbt` otherwise) is correct and aligns with Fusion's installation location.
164-164: LGTM! Fusion runner-method flag is correct.
The conditional `--runner-method fusion` flag is correctly applied only when Fusion is the selected dbt version.
integration_tests/tests/dbt_project.py (5)
34-43: LGTM! Correct runner_method plumbing.
The `get_dbt_runner` function now accepts and forwards `runner_method` to `create_dbt_runner`, enabling Fusion-specific runner selection. The logic is correct.
47-56: LGTM! DbtProject correctly stores runner_method.
The constructor now accepts and stores `runner_method`, and passes it to `get_dbt_runner`. This enables Fusion-specific behavior downstream.
188-201: LGTM! Correct test_config integration.
The `test_yaml` construction now includes an optional `config` block from `test_config`, and the structure is correctly applied to both table-level and column-level test paths. The logic is sound.
265-272: LGTM! Reasonable workaround for Fusion BigQuery seed quirk.
The `_fix_seed_if_needed` method applies a Fusion-specific fix for BigQuery's seed behavior (empty strings instead of nulls). The conditional check is correct and the workaround is scoped appropriately.
330-334: LGTM! Safer cleanup with try/finally.
Wrapping `path.unlink()` in a `finally` block ensures the temporary file is cleaned up even if an exception occurs during the `yield`. This is a good reliability improvement.
integration_tests/tests/conftest.py (8)
16-21: LGTM! The constant clearly defines supported Fusion targets.
30-30: LGTM! The CLI option is properly defined.
34-49: LGTM! The fixture properly threads `runner_method` through and invokes the new helper functions to prepare the test environment.
52-67: Verify packages.yml structure assumptions.
The code assumes `packages_yml["packages"][0]["local"]` exists. If the packages.yml file is malformed or missing the expected structure, this will raise a KeyError or IndexError. Consider adding defensive checks or documenting the expected structure:
 packages_yml_path = Path(project_dir_copy) / "packages.yml"
 with packages_yml_path.open("r") as packages_yml_file:
     packages_yml = yaml.safe_load(packages_yml_file)
+if not packages_yml.get("packages") or not packages_yml["packages"]:
+    raise ValueError("packages.yml must contain a non-empty 'packages' list")
+if "local" not in packages_yml["packages"][0]:
+    raise ValueError("First package must have a 'local' key")
 packages_yml["packages"][0]["local"] = str(
     (DBT_PROJECT_PATH / packages_yml["packages"][0]["local"]).resolve()
 )
70-80: LGTM! The function correctly removes Python test files only when the runner method is Fusion.
84-90: LGTM! The fixture properly passes `runner_method` to the Environment initialization.
122-126: LGTM! The autouse fixture correctly implements the skip logic for Fusion tests.
144-147: LGTM! The fixture properly passes `runner_method` to DbtProject initialization.
integration_tests/tests/test_all_columns_anomalies.py (2)
62-62: LGTM! The function name now accurately reflects that it tests the `where` parameter.
96-128: LGTM! The test correctly uses `test_config` to pass the `where` clause, aligning with the updated test infrastructure pattern.
integration_tests/tests/test_dbt_artifacts/test_columns.py (1)
62-62: LGTM! Changing to set equality makes the assertion order-insensitive, which is more robust and appropriate when the order of columns doesn't matter.
integration_tests/tests/pytest.ini (1)
6-6: LGTM! The marker is properly declared for use across the test suite.
integration_tests/tests/test_long_strings.py (2)
1-1: LGTM! The pytest import is necessary for the skip marker.
21-21: LGTM! The skip markers are correctly applied to exclude these tests from Fusion runs.
Also applies to: 40-40
integration_tests/tests/test_collect_metrics.py (1)
153-153: LGTM! The skip marker is correctly applied to exclude this test from Fusion runs.
integration_tests/tests/test_anomaly_test_configuration.py (1)
117-117: LGTM! The skip marker is correctly applied to exclude this test from Fusion runs.
integration_tests/tests/test_python.py (1)
13-13: LGTM! The class-level skip marker appropriately excludes all Python tests from Fusion runs, as Python tests are not supported in Fusion.
macros/edr/tests/test_collect_metrics.sql (1)
29-29: LGTM! Test model resolution updated for Fusion compatibility.
The change from `context["model"]` to `elementary.get_test_model()` aligns with the PR's broader effort to support dbt Fusion. The abstraction ensures consistent test model resolution across different execution contexts.
integration_tests/tests/test_jsonschema.py (1)
17-17: LGTM! Fusion skip marker applied correctly.
The `@pytest.mark.skip_for_dbt_fusion` decorator correctly gates the entire `TestJsonschema` class from executing under Fusion contexts, aligning with the project-wide Fusion skip mechanism.
macros/edr/tests/test_column_anomalies.sql (2)
4-4: LGTM! Model relation resolution updated for Fusion.
Consistent with the PR-wide pattern, replacing `context["model"]` with `elementary.get_test_model()` ensures proper test model resolution in Fusion environments.
95-95: LGTM! Test flattening updated for Fusion compatibility.
The `flatten_test` call now uses `elementary.get_test_model()` instead of `context["model"]`, aligning with the broader Fusion support changes. Based on learnings.
integration_tests/tests/test_failed_row_count.py (4)
9-9: LGTM! Fusion skip applied consistently.
The `@pytest.mark.skip_for_dbt_fusion` marker is correctly applied, ensuring this test skips under Fusion contexts alongside the existing ClickHouse skip.
27-27: LGTM! Fusion skip marker applied.
Correctly gates `test_sum_failed_row_count` from Fusion execution.
47-47: LGTM! Fusion skip applied consistently.
The decorator is correctly positioned to skip this test under Fusion, complementing the existing ClickHouse skip.
67-67: LGTM! Fusion skip marker applied.
Correctly gates `test_warn_if_0` from Fusion execution.
macros/edr/tests/test_exposure_schema_validity.sql (1)
19-19: LGTM! Model resolution updated for Fusion compatibility.
The change to `elementary.get_test_model()` ensures proper test model resolution in Fusion environments, consistent with other test macros in this PR.
integration_tests/tests/test_exposure_schema_validity.py (1)
12-12: LGTM! Fusion skip markers applied consistently.
All eight exposure schema validity test functions are correctly decorated with `@pytest.mark.skip_for_dbt_fusion`, ensuring they skip under Fusion contexts.
macros/edr/tests/test_ai_data_validation.sql (1)
4-4: LGTM! Model resolution updated for Fusion compatibility.
The change to `elementary.get_test_model()` aligns with the PR-wide pattern for supporting dbt Fusion.
macros/edr/tests/test_schema_changes.sql (2)
4-4: LGTM! Model relation resolution updated for Fusion.
The change to `elementary.get_test_model()` ensures proper test model resolution in Fusion environments.
33-33: LGTM! Test flattening updated for Fusion compatibility.
The `flatten_test` call now uses `elementary.get_test_model()`, aligning with the broader Fusion support changes.
integration_tests/tests/test_dbt_artifacts/test_artifacts.py (2)
99-99: LGTM! The `skip_for_dbt_fusion` marker is correctly applied to skip this test for Fusion runtime, where source freshness results are not yet supported.
119-127: LGTM! The config structure has been correctly updated to nest `loaded_at_field` and `freshness` under a `config` block, aligning with dbt's standard configuration patterns for sources.
macros/edr/tests/test_table_anomalies.sql (2)
4-4: LGTM! The test model resolution has been correctly updated to use the centralized `elementary.get_test_model()` helper, improving Fusion compatibility and making test model access explicit.
85-85: LGTM! The `flatten_test` call has been correctly updated to use `elementary.get_test_model()`, maintaining consistency with the broader refactor pattern across test macros.
macros/edr/tests/test_event_freshness_anomalies.sql (1)
4-4: LGTM! The test model resolution has been correctly updated to use `elementary.get_test_model()`, maintaining consistency with the PR-wide refactor for Fusion compatibility.
integration_tests/tests/test_dbt_artifacts/test_groups.py (2)
91-91: LGTM! The `skip_for_dbt_fusion` markers are selectively applied to tests that are not yet supported in Fusion runtime (model groups, two groups, and snapshot groups), while other artifact tests (test, seed) remain enabled for Fusion.
Also applies to: 166-166, 462-462
112-114: LGTM! The config structure has been consistently updated across all artifact types (models, seeds, snapshots) to nest `group` under a `config` block, aligning with dbt's standard configuration patterns.
Also applies to: 188-190, 195-197, 274-276, 347-349, 413-415, 487-489
macros/edr/tests/test_all_columns_anomalies.sql (1)
4-4: LGTM! The test model resolution has been correctly updated to use `elementary.get_test_model()` in both `get_model_relation_for_test` and `flatten_test` calls, maintaining consistency with the PR-wide refactor for Fusion compatibility.
Also applies to: 96-96
macros/edr/dbt_artifacts/upload_dbt_artifacts.sql (1)
54-62: LGTM! The manual aggregation correctly replaces `group_by` functionality to work around Fusion's agate limitations. The logic properly builds a per-model mapping of metadata hashes, and the return structure matches caller expectations with appropriate edge-case handling.
macros/edr/system/configuration/is_elementary_enabled.sql (1)
2-3: Confirm compatibility with adapters lacking databases.
`elementary.get_package_database_and_schema()[0]` can be `None` for supported adapters that only define schemas (e.g., Spark without Unity Catalog). In that case this macro would report Elementary as disabled even though it should remain active. Please confirm that the helper never returns `(None, <schema>)`, or broaden the check to fall back to the schema as well.
.github/workflows/test-all-warehouses.yml (1)
59-66: Fusion path fully supported in test-warehouse.yml. The workflow conditionally installs dbt-fusion, selects the fusion profiles YML, and applies fusion-specific binary paths and test flags.
macros/edr/dbt_artifacts/upload_dbt_tests.sql (3)
136-137: Group resolution LGTM
Resolution order (config > node > tested model) is sensible and aligns with expected precedence.
Also applies to: 145-145
169-170: Unified meta + quality_dimension + group_name inclusion LGTM
Writing unified_meta into meta, preferring meta.quality_dimension, and adding group_name look correct.
Also applies to: 183-185
194-195: generic_tests path handling LGTM
Treating generic_tests/ as generic matches Fusion layouts.
macros/edr/dbt_artifacts/upload_dbt_models.sql (1)
71-71: Checksum guard looks good
The mapping check prevents crashes when checksum is not a dict; safe fallback to null is appropriate.
macros/utils/graph/get_run_result_dict.sql (1)
6-7: Fusion-aware serialization switch
Using elementary.agate_to_dict for run_result, timing items, and node aligns with Fusion's differing object shapes. Callers must accept dicts in classic dbt and original objects in Fusion. Looks correct.
Also applies to: 12-13, 23-24
macros/utils/table_operations/create_intermediate_relation.sql (1)
2-12: Explicit incorporate path avoids Fusion None-path pitfalls
Creating the intermediate relation via edr_make_intermediate_relation then incorporating with an explicit path is a good cross-adapter fix.
macros/edr/tests/on_run_end/handle_tests_results.sql (2)
28-51: Status normalization is solid
Anomaly detection adjustment and Fusion status normalization (error->fail, success->pass) are sensible and minimal.
141-142: Switch to edr_create_table_as
Creating temp staging via edr_create_table_as is consistent with Fusion-safe table ops.
Also applies to: 188-190
macros/utils/table_operations/create_temp_table.sql (1)
11-12: Consolidation via edr_create_table_as
Using edr_create_table_as(True, temp_table_relation, sql_query, drop_first=temp_table_exists) correctly flags a temporary table and simplifies cross-adapter handling.
{% for result in results %}
{% set result = elementary.get_run_result_dict(result) %}
{% set node = result.node or elementary.get_node(result.unique_id or result.node.unique_id) %}
{% if node.resource_type == 'test' %}
{% set elementary_test_results_rows = cached_elementary_test_results.get(result.node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(result.node.unique_id) %}
Crash risk: unsafe unique_id/node access
If result.node is None (common on some failures) and result.unique_id is not present (>=1.8 path), accessing result.node.unique_id will raise. Also the computed node variable isn’t used for cache lookups later.
Use guarded resolution and rely on node variable:
-{% set result = elementary.get_run_result_dict(result) %}
-{% set node = result.node or elementary.get_node(result.unique_id or result.node.unique_id) %}
+{% set result = elementary.get_run_result_dict(result) %}
+{# Prefer node from result if present; otherwise resolve by unique_id if available #}
+{% set node = result.node if result.node else none %}
+{% if not node and result.unique_id is defined and result.unique_id %}
+ {% set node = elementary.get_node(result.unique_id) %}
+{% endif %}
+{% if not node %}
+ {# As a last resort, skip non-resolvable results safely #}
+ {% continue %}
+{% endif %}
Then below, use node.unique_id for cache access (see next comment).
🤖 Prompt for AI Agents
In macros/edr/tests/on_run_end/handle_tests_results.sql around lines 56 to 62,
the code directly accesses result.node.unique_id which will crash if result.node
is None and result.unique_id is missing; change the node resolution to a guarded
form (e.g., set node = result.node or elementary.get_node(result.unique_id or
result.node.unique_id)) but simpler and safer: resolve node once with node =
result.node or elementary.get_node(result.unique_id), ensure node is not None
before using it, and then use node.unique_id for all subsequent cache lookups
(cached_elementary_test_results and cached_elementary_test_failed_row_counts)
instead of accessing result.node.unique_id.
from what I saw this never happens
@haritamar, thanks for clarifying! If you've verified through testing that result.node is always populated after get_run_result_dict, then the crash scenario is indeed unlikely.
That said, there's still a consistency issue in the code: you compute the node variable but then still use result.node.unique_id for the cache lookups a few lines later:
{% set elementary_test_results_rows = cached_elementary_test_results.get(result.node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(result.node.unique_id) %}
For cleaner, more maintainable code, consider using node.unique_id consistently since you've already resolved the node:
{% set elementary_test_results_rows = cached_elementary_test_results.get(node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(node.unique_id) %}
This makes the intent clearer and ensures you're using the same node reference throughout.
{%- macro redshift__get_default_incremental_strategy() %}
{% do return("merge") %}
{% endmacro %}
Revert Redshift default to avoid forced merge without unique_key.
Switching the Redshift default to merge forces every incremental model using our helpers to supply a unique_key; dbt-redshift fails compilation at runtime if merge is chosen without one. Today those same models succeed because the adapter default (delete+insert) does not require a key. This change would therefore break existing Redshift incremental models that rely on the default path. Please keep the default at delete+insert, or only dispatch to merge when a unique_key is present.
🤖 Prompt for AI Agents
In macros/utils/cross_db_utils/incremental_strategy.sql around lines 13 to 15,
the macro currently returns "merge" unconditionally which forces Redshift
incremental models to require a unique_key and breaks existing models; change
the macro to return "delete+insert" as the default for Redshift, or implement
conditional logic to return "merge" only when a unique_key is provided
(otherwise return "delete+insert") so existing models without unique_key
continue to compile and run.
I'm fine with this since these methods are used exclusively for our dbt artifact tables, which all have unique keys (also, the default for other adapters is merge in this file).
@haritamar, thank you for the clarification! That resolves my concern. Since get_default_incremental_strategy() is used exclusively for your internal artifact tables and they all have unique keys defined, switching Redshift to merge is safe and consistent with the other adapters in this file. I appreciate the context!
✏️ Learnings added
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/cross_db_utils/incremental_strategy.sql:13-15
Timestamp: 2025-10-05T16:25:01.698Z
Learning: In macros/utils/cross_db_utils/incremental_strategy.sql, the get_default_incremental_strategy() macro and its adapter-specific variants are used exclusively for Elementary's internal dbt artifact tables, all of which have unique keys defined. Therefore, defaulting to "merge" strategy for Redshift (and other adapters) is safe in this context.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
  {% set node = elementary.get_node(result.unique_id) %}
  {% if node.identifier == identifier %}
    {% if package_name %}
-     {% if result.node.package_name == package_name %}
-       {{ return(result.node) }}
+     {% if node.package_name == package_name %}
+       {{ return(node) }}
      {% endif %}
    {% else %}
-     {{ return(result.node) }}
+     {{ return(node) }}
Guard against missing nodes before dereferencing
elementary.get_node can return none when the manifest does not have that unique_id (e.g., disabled/ephemeral nodes), so hitting Line 4 will raise 'None' has no attribute 'identifier'. Add a guard to skip those cases before touching attributes.
{% set node = elementary.get_node(result.unique_id) %}
+ {% if node is none %}
+ {% continue %}
+ {% endif %}
  {% if node.identifier == identifier %}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
{% set node = elementary.get_node(result.unique_id) %}
{% if node is none %}
  {% continue %}
{% endif %}
{% if node.identifier == identifier %}
  {% if package_name %}
    {% if node.package_name == package_name %}
      {{ return(node) }}
    {% endif %}
  {% else %}
    {{ return(node) }}
🤖 Prompt for AI Agents
In macros/utils/graph/get_result_node.sql around lines 3 to 10, the template
dereferences node.identifier without ensuring elementary.get_node(...) returned
a node; when get_node returns none this raises an attribute error. Fix by
guarding for node existence before any attribute access (e.g., check node is not
none / truthy in the same conditional), then perform the identifier and optional
package_name comparisons and return only when node is present and matches.
as it shouldn't ever happen in this context, I prefer to fail violently here
@haritamar, understood! I'll defer to your judgment on this. The explicit failure will make it immediately clear if something unexpected happens during development or testing.
✏️ Learnings added
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/graph/get_result_node.sql:3-10
Timestamp: 2025-10-05T16:26:50.625Z
Learning: In macros/utils/graph/get_result_node.sql, the maintainer prefers to fail violently (let exceptions propagate) rather than guard against none when elementary.get_node returns none, as it should never happen in that context.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
  {% set config = elementary.get_runtime_config() %}
  {%- if config.vars -%}
-   {% do all_vars.update(config.vars.to_dict()) %}
+   {% do all_vars.update(elementary.agate_to_dict(config.vars)) %}
config.vars is an agate? in regular dbt as well?
actually it's not, my mistake
It's a dbt object called VarProvider which has a to_dict method, and there are other similar objects.
I'll rename the macro
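For context, a rough sketch of what the renamed helper (dbt_object_to_dict, per the file list later in this review) might look like. The exact merged implementation may differ; the mapping check and attribute test here are illustrative:

{% macro dbt_object_to_dict(obj) %}
  {# Illustrative: return plain mappings as-is, and call to_dict() on dbt objects like VarProvider that expose it. #}
  {% if obj is mapping %}
    {{ return(obj) }}
  {% elif obj.to_dict is defined %}
    {{ return(obj.to_dict()) }}
  {% else %}
    {{ return(obj) }}
  {% endif %}
{% endmacro %}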
  'unique_id': node_dict.get('unique_id'),
  'alias': node_dict.get('alias'),
- 'checksum': checksum_dict.get('checksum'),
+ 'checksum': checksum_dict.get('checksum') if checksum_dict is mapping else none,
what is it in fusion?
hmm, I don't remember but it looks like it's a dict now in fusion so fixing this
- {% set test_query = elementary.get_compiled_code(flattened_test) %}
- {% set test_query_lower = test_query.lower() %}
+ {% set test_query_lower = sql.lower() %}
I don't understand this change
compiled code doesn't currently exist in the graph in dbt-fusion
ticket
but sql in this context is effectively the compiled code so it allows us to keep supporting the pii feature.
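In other words, something along these lines. This is a hedged sketch that keeps the graph-based lookup where it exists and assumes the sql context variable holds the compiled test query in this hook; the actual change in the PR simply switches to sql directly:

{# Illustrative fallback for the PII check: #}
{% set test_query = elementary.get_compiled_code(flattened_test) %}
{% if not test_query %}
  {# dbt-fusion does not expose compiled code on the graph yet, so fall back to the compiled query available as `sql`. #}
  {% set test_query = sql %}
{% endif %}
{% set test_query_lower = test_query.lower() %}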
package-lock.yml
Outdated
@@ -0,0 +1,5 @@
packages:
do not commit the package-lock file
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1)
90-100: Uppercase SELECT path never runs

The new invocation_args_dict.SELECT branch is wrapped inside an outer guard that already requires invocation_args_dict.select to be truthy, so we’ll never reach it when only the uppercase key exists (Fusion case). This leaves Fusion without selector coverage. Please restructure the condition (e.g., check if invocation_args_dict first, then branch on both keys) so the uppercase path actually executes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
.gitignore (1 hunks)
integration_tests/tests/test_exposure_schema_validity.py (7 hunks)
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
macros/utils/graph/get_run_result_dict.sql (2 hunks)
macros/utils/run_queries/dbt_object_to_dict.sql (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .gitignore
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
📚 Learning: 2025-07-15T19:28:35.548Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/run_queries/agate_to_dict.sql:1-7
Timestamp: 2025-07-15T19:28:35.548Z
Learning: In dbt-data-reliability, the `agate_to_dict` macro in `macros/utils/run_queries/agate_to_dict.sql` intentionally returns different types based on fusion/non-fusion mode to handle breaking changes in dbt Fusion. In fusion mode, it returns the original `agate_table`, while in non-fusion mode it returns `agate_table.to_dict()`. This is because the content/structure of `agate_table` is different between the two modes.
Applied to files:
macros/utils/run_queries/dbt_object_to_dict.sql
🧬 Code graph analysis (1)
integration_tests/tests/test_exposure_schema_validity.py (2)
integration_tests/tests/conftest.py (3)
  skip_for_dbt_fusion (123-126)
  dbt_project (144-147)
  target (151-152)
integration_tests/tests/dbt_project.py (4)
  DbtProject (46-334)
  test (110-128)
  test (131-149)
  test (151-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, postgres) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (latest_official, dremio) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (1.8.0, postgres) / test
🔇 Additional comments (2)
integration_tests/tests/test_exposure_schema_validity.py (2)
30-30: LGTM! Skip markers correctly applied.

The @pytest.mark.skip_for_dbt_fusion decorators are appropriate given that exposure artifacts are not yet available in dbt-fusion (per past review comments and dbt-labs/dbt-fusion#13).

Also applies to: 49-49, 57-57, 90-90, 123-123, 166-166, 190-190

61-63: Good refactor: Variable rename addresses past feedback.

The rename from explicit_target_for_bigquery to target_data_type improves clarity and addresses the past review comment that the old variable name didn't make sense. The logic now handles dremio's special case (reporting string columns as "other") while other targets report them as "string". Based on learnings

Also applies to: 73-73
Actionable comments posted: 0
🧹 Nitpick comments (2)
integration_tests/tests/conftest.py (2)
52-67: Consider adding structure validation.

The function directly accesses packages_yml["packages"][0]["local"] without validating the structure exists. While this is acceptable for test code that will fail loudly if the structure is wrong, adding validation would improve robustness.

Consider adding a check:

 packages_yml_path = Path(project_dir_copy) / "packages.yml"
 with packages_yml_path.open("r") as packages_yml_file:
     packages_yml = yaml.safe_load(packages_yml_file)
+if not packages_yml.get("packages") or not isinstance(packages_yml["packages"], list) or len(packages_yml["packages"]) == 0 or "local" not in packages_yml["packages"][0]:
+    raise ValueError("packages.yml does not have the expected structure")
+
 packages_yml["packages"][0]["local"] = str(
     (DBT_PROJECT_PATH / packages_yml["packages"][0]["local"]).resolve()
 )
165-176: Consider extracting the error message to a constant.

The fixture correctly validates that dbt-fusion is only used with supported targets. However, the static analysis tool suggests avoiding long error messages inline.

As per static analysis hint, consider extracting the message:

+FUSION_UNSUPPORTED_TARGET_MSG = "Fusion runner is not supported for target: {}"
+
 @pytest.fixture(scope="session")
 def runner_method(request, target: str) -> Optional[RunnerMethod]:
     runner_method_str = request.config.getoption("--runner-method")
     if runner_method_str:
         runner_method = RunnerMethod(runner_method_str)
         if (
             runner_method == RunnerMethod.FUSION
             and target not in DBT_FUSION_SUPPORTED_TARGETS
         ):
-            raise ValueError(f"Fusion runner is not supported for target: {target}")
+            raise ValueError(FUSION_UNSUPPORTED_TARGET_MSG.format(target))
         return runner_method
     return None
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
integration_tests/tests/conftest.py (6 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
🧬 Code graph analysis (1)
integration_tests/tests/conftest.py (3)
integration_tests/tests/dbt_project.py (1)
  DbtProject (46-334)
integration_tests/tests/env.py (1)
  Environment (7-26)
integration_tests/tests/logger.py (1)
  get_logger (5-11)
🪛 Ruff (0.13.3)
integration_tests/tests/conftest.py
174-174: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (1.8.0, postgres) / test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (latest_official, dremio) / test
- GitHub Check: test (latest_official, postgres) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, bigquery) / test
🔇 Additional comments (8)
integration_tests/tests/conftest.py (8)
4-4: LGTM! The new imports (Optional, yaml, RunnerMethod) are appropriate and necessary for the dbt-fusion support functionality. Also applies to: 7-7, 10-10

16-21: LGTM! The supported targets list aligns with the PR objectives for dbt-fusion support on Snowflake, BigQuery, Redshift, and Databricks adapters.

30-30: LGTM! The --runner-method command-line option follows pytest conventions and enables CLI control of the runner method for tests.

34-49: LGTM! The fixture correctly accepts runner_method and orchestrates the necessary post-copy operations for dbt-fusion compatibility. The cleanup is properly handled in the finally block.

70-80: LGTM! The function correctly removes Python models only when using dbt-fusion, and the scope is appropriately limited to the models directory.

84-103: LGTM! The fixture correctly accepts and passes runner_method to the Environment constructor, consistent with the changes in env.py.

122-126: LGTM! The autouse fixture correctly implements test skipping for dbt-fusion using standard pytest marker patterns.

144-147: LGTM! The fixture correctly accepts and passes runner_method to the DbtProject constructor, consistent with the changes in dbt_project.py.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_official, postgres) / test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, dremio) / test
- GitHub Check: test (latest_official, clickhouse) / test
- GitHub Check: test (1.8.0, postgres) / test
- GitHub Check: test (latest_official, athena) / test
Actionable comments posted: 1
♻️ Duplicate comments (1)
macros/edr/tests/on_run_end/handle_tests_results.sql (1)
58-58: Guard node resolution to avoid crashes
This reintroduces the unsafe access the previous review flagged. When result.node is None and result.unique_id is falsy/absent (seen on certain dbt>=1.8 error paths), the expression still evaluates result.node.unique_id, causing an exception instead of skipping the unresolved result. Resolve the node defensively before dereferencing and bail out if you still can’t.

- {% set node = result.node or elementary.get_node(result.unique_id or result.node.unique_id) %}
+ {% set node = result.node if result.node else none %}
+ {% if not node and result.unique_id is defined and result.unique_id %}
+   {% set node = elementary.get_node(result.unique_id) %}
+ {% endif %}
+ {% if not node %}
+   {% continue %}
+ {% endif %}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
.github/workflows/test-warehouse.yml (1 hunks)
macros/edr/tests/on_run_end/handle_tests_results.sql (4 hunks)
macros/utils/cross_db_utils/get_user_creation_query.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (1.8.0, postgres) / test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, dremio) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_official, clickhouse) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, postgres) / test
${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
ln -sfn ${{ github.workspace }}/dbt-data-reliability dbt_project/dbt_packages/elementary
pip install -r requirements.txt
- name: Check DWH connection
  working-directory: ${{ env.TESTS_DIR }}
  run: |
    dbt debug -t "${{ inputs.warehouse-type }}"
    ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
Fix dbt‑fusion binary path
install.sh drops the fusion CLI at ~/.dbt/bin/dbt, not ~/.local/bin/dbt. With the current hard‑coded path, both deps and debug will fail at runtime (command not found). Please point the conditional to the actual install location (or dynamically resolve the path) before we merge.
- ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
+ ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
@@
- ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
+ ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
ln -sfn ${{ github.workspace }}/dbt-data-reliability dbt_project/dbt_packages/elementary
pip install -r requirements.txt
- name: Check DWH connection
  working-directory: ${{ env.TESTS_DIR }}
  run: |
    ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
🤖 Prompt for AI Agents
In .github/workflows/test-warehouse.yml around lines 153 to 160, the workflow
hard-codes the dbt‑fusion CLI path as ~/.local/bin/dbt but install.sh places the
fusion binary at ~/.dbt/bin/dbt; update both conditional uses (the deps and
debug commands) to point to ~/.dbt/bin/dbt (or replace the inline conditional
with a small resolution step that checks for ~/.dbt/bin/dbt first and falls back
to dbt) so the correct executable is invoked at runtime.
I think it is wrong in this case, it's definitely installed in "~/.local/bin/dbt" and the CI currently works for fusion
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Fetch the dbt-fusion install script to verify the installation path
curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh
Length of output: 19381
@haritamar, you're absolutely right—I apologize for the confusion. After reviewing the install script, I can confirm that the default installation path is indeed ~/.local/bin/dbt, not ~/.dbt/bin/dbt. The script sets:
dest="$HOME/.local/bin"Since your workflow only passes --update to the installer, it uses this default location. Your current implementation is correct, and I was mistaken. Thank you for clarifying that the CI is working as expected! I'll resolve this comment thread.
🐰💭
(Got that one wrong—
learning from humans!)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (1.8.0, postgres) / test
- GitHub Check: test (latest_official, dremio) / test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, postgres) / test
{%- if invocation_args_dict -%}
  {%- if invocation_args_dict.select -%}
    {{- return(invocation_args_dict.select) -}}
  {%- elif invocation_args_dict.SELECT -%}
    {{- return(invocation_args_dict.SELECT) -}}
  {%- endif -%}
Restore fallback when no select keys are present.
If invocation_args_dict exists but neither .select nor .SELECT is populated, the macro now hits the outer branch and exits without a return, so selected becomes an empty string instead of honoring config.args.select or falling back to []. This regresses the previous behavior and drops selection metadata in fusion/core runs that only set invocation_args_dict. Please restore the fallback inside the outer branch.
-{%- if invocation_args_dict -%}
- {%- if invocation_args_dict.select -%}
- {{- return(invocation_args_dict.select) -}}
- {%- elif invocation_args_dict.SELECT -%}
- {{- return(invocation_args_dict.SELECT) -}}
- {%- endif -%}
-{%- elif config.args and config.args.select -%}
+{%- if invocation_args_dict -%}
+ {%- if invocation_args_dict.select -%}
+ {{- return(invocation_args_dict.select) -}}
+ {%- elif invocation_args_dict.SELECT -%}
+ {{- return(invocation_args_dict.SELECT) -}}
+ {%- elif config.args and config.args.select -%}
+ {{- return(config.args.select) -}}
+ {%- else -%}
+ {{- return([]) -}}
+ {%- endif -%}
+{%- elif config.args and config.args.select -%}
{{- return(config.args.select) -}}
{%- else -%}
{{- return([]) -}}
{%- endif -%}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
{%- if invocation_args_dict -%}
  {%- if invocation_args_dict.select -%}
    {{- return(invocation_args_dict.select) -}}
  {%- elif invocation_args_dict.SELECT -%}
    {{- return(invocation_args_dict.SELECT) -}}
  {%- elif config.args and config.args.select -%}
    {{- return(config.args.select) -}}
  {%- else -%}
    {{- return([]) -}}
  {%- endif -%}
{%- elif config.args and config.args.select -%}
  {{- return(config.args.select) -}}
{%- else -%}
  {{- return([]) -}}
{%- endif -%}
🤖 Prompt for AI Agents
In macros/edr/dbt_artifacts/upload_dbt_invocation.sql around lines 90-95,
restore the fallback inside the outer invocation_args_dict branch so that when
invocation_args_dict exists but neither .select nor .SELECT is present the macro
still returns config.args.select if available or an empty list; update the
conditional to explicitly return invocation_args_dict.select or
invocation_args_dict.SELECT when present, otherwise return config.args.select if
defined, else return ([]).
This PR adds support for dbt-fusion to the Elementary dbt package. It will be supported from now on alongside dbt-core on all the fusion-supported adapters (Snowflake, BigQuery, Redshift and Databricks).
General background on fusion
dbt-fusion is basically a complete rewrite of dbt in Rust. It adds all kinds of capabilities (e.g. static analysis of SQL), but it mainly impacts us in the sense that a lot of things are breaking :) In particular, a lot of things we relied on in the package (like certain context variables, agate methods, etc.) are not supported. Also, there are some dbt features that are not yet working in fusion, and we have tickets open about them.
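As an example of the kind of fusion-aware shim this requires, here is a minimal sketch of a row-conversion helper. It assumes a detection macro along the lines of elementary.is_dbt_fusion() and mirrors the agate_to_dict behavior noted in the review learnings above; names and structure are illustrative:

{% macro agate_to_dict(agate_table) %}
  {# In fusion mode the returned object is already usable as-is; in dbt-core convert it with agate's to_dict(). #}
  {% if elementary.is_dbt_fusion() %}
    {{ return(agate_table) }}
  {% else %}
    {{ return(agate_table.to_dict()) }}
  {% endif %}
{% endmacro %}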
Some notes:
dbt introduced breaking config changes that are only supported from dbt-core v1.10, and it was a real headache to make our tests support both config methods. So right now it means that our tests won't work anymore on dbt < 1.10 (though the package still supports these versions).
New Features
Bug Fixes
Tests
Chores