
Conversation

@haritamar (Collaborator) commented Jul 15, 2025

This PR adds support for dbt-fusion to the Elementary dbt package.
It will be supported from now on alongside dbt-core on all the fusion supported adapters (Snowflake, Bigquery, Redshift and Databricks).

General background on fusion

dbt-fusion is essentially a complete rewrite of dbt in Rust. It adds all kinds of capabilities (e.g. static analysis of SQL), but its main impact on us is that a lot of things break :)
In particular, many things we relied on in the package (certain context variables, agate methods, etc.) are not supported.
There are also some dbt features that don't work in fusion yet, for which we have open tickets.

Some notes:

  • I got to a state where all the tests pass on both core and fusion. However, dbt introduced breaking config changes that are only supported from dbt-core v1.10, and it was a real headache to make our tests support both config methods. So right now our tests won't run on dbt < 1.10 (though the package itself still supports these versions).
  • The docs branch is here (I'll let @NoyaOffer review it).

Summary by CodeRabbit

  • New Features

    • dbt Fusion support in CI and test runner; unified, adapter-aware create-table-as behavior; per-model artifacts grouping; test-model accessor and replace-empty-strings utility.
  • Bug Fixes

    • Improved selector/vars/metadata parsing, test status normalization for Fusion, more robust value serialization, Snowflake query trimming, expanded tag/PII detection.
  • Tests

    • New runner-method option, many tests marked to skip for Fusion, standardized test_config usage for where/meta/tags.
  • Chores

    • CI workflow updates, .gitignore additions, integration test project/config adjustments.

@linear bot commented Jul 15, 2025

@haritamar had a problem deploying to elementary_test_env July 15, 2025 17:22 — with GitHub Actions: Failure
@coderabbitai bot commented Jul 15, 2025

Walkthrough

Adds dbt‑Fusion support across CI and test harness, centralizes CREATE TABLE AS via new edr_create_table_as macros, replaces direct use of context["model"] with a Fusion-aware test accessor, introduces Fusion-aware serialization/normalization helpers, and applies many test skips/config adjustments for Fusion runs.

Changes

Cohort / File(s) Summary
CI Workflows (Fusion)
.github/workflows/test-warehouse.yml, .github/workflows/test-all-warehouses.yml
Add fusion install step and conditional dbt install; select fusion-specific profiles; use fusion dbt binary (~/.local/bin/dbt) for deps/debug; expand matrix to include fusion/warehouse combos.
Gitignore
.gitignore, integration_tests/dbt_project/.gitignore
Add dbt_internal_packages/ and package-lock.yml ignore entries.
Integration test project config & packages
integration_tests/dbt_project/dbt_project.yml, integration_tests/dbt_project/packages.yml, integration_tests/dbt_project/models/schema.yml, integration_tests/dbt_project/models/exposures.yml
Remove mute_ensure_materialization_override; replace templated dbt_utils version constraint with a static list; move tags/meta under config/config.meta.
Integration test project macros
integration_tests/dbt_project/macros/materializations.sql, integration_tests/dbt_project/macros/create_all_types_table.sql, integration_tests/dbt_project/macros/replace_empty_strings_with_nulls.sql
Replace .call_macro() with direct calls; delegate table creation to elementary.edr_create_table_as; add replace_empty_strings_with_nulls macro.
Deprecated tests project
integration_tests/deprecated_tests/dbt_project.yml, integration_tests/deprecated_tests/macros/system/materializations.sql
Remove mute var; set seeds schema; replace .call_macro() usage.
Test harness & runner wiring
integration_tests/tests/conftest.py, integration_tests/tests/env.py, integration_tests/tests/dbt_project.py, integration_tests/tests/pytest.ini
Add --runner-method option and runner_method fixture; propagate RunnerMethod through Environment/DbtProject; add skip_for_dbt_fusion marker; edit packages.yml to absolute elementary path; remove Python models for Fusion; add post-seed Fusion+BigQuery fix.
Integration tests edits (many)
integration_tests/tests/*.py (multiple files)
Add @pytest.mark.skip_for_dbt_fusion where applicable; move meta/tags/where into test_config/test_vars; use DBT_TEST_ARGS directly; rename a few tests; relax assertion ordering.
Fusion detection & test-model accessor
macros/edr/system/system_utils/is_dbt_fusion.sql, macros/edr/system/system_utils/get_test_model.sql
Add is_dbt_fusion() (major-version check) and get_test_model() to return the test model object.
Replace context["model"] usage
macros/edr/tests/*.sql, macros/edr/tests/test_utils/*.sql
Replace context["model"] with elementary.get_test_model() across test macros; add single-dependent-node fast path in get_model_relation_for_test; sanitize temp table naming.
Artifacts ingestion & metadata
macros/edr/dbt_artifacts/upload_dbt_invocation.sql, macros/edr/dbt_artifacts/upload_dbt_artifacts.sql, macros/edr/dbt_artifacts/upload_dbt_tests.sql, macros/edr/dbt_artifacts/flatten_node.sql
Support alternate invocation args (SELECT/INVOCATION_COMMAND/VARS), use dbt_object_to_dict, return per-model artifact-hash mapping, merge config+node meta into unified_meta, add flatten_node.
Run/query & serialization utilities
macros/utils/run_queries/agate_to_dicts.sql, macros/utils/run_queries/dbt_object_to_dict.sql, macros/utils/run_queries/run_query.sql
Fusion-aware agate iteration; add agate_val_serialize and decimal helpers; add dbt_object_to_dict; trim trailing semicolons for Snowflake metadata formatting.
Graph/result utilities
macros/utils/graph/get_run_result_dict.sql, macros/utils/graph/get_result_node.sql
Use elementary.dbt_object_to_dict for serializing dbt objects; resolve nodes via elementary.get_node(result.unique_id) instead of result.node.
Centralized table operations
macros/utils/table_operations/*.sql (create_table_as, create_or_replace, create_table_like, create_temp_table, replace_table_data, create_intermediate_relation)
Add edr_create_table_as and adapter-specific SQL builders; refactor CREATE TABLE AS flows to call edr_create_table_as; change intermediate relation construction to explicit incorporate with path.
Test results handling
macros/edr/tests/on_run_end/handle_tests_results.sql
Add get_normalized_test_status (dbt_fusion mappings), materialize per-result rows when needed, normalize statuses, and use edr_create_table_as for inserts (see the sketch after this table).
Materialization & hooks
macros/edr/materializations/test/test.sql, macros/edr/system/hooks/on_run_start.sql, macros/edr/system/hooks/on_run_end.sql, macros/edr/system/system_utils/get_config_var.sql
Add early-return guards when Elementary not enabled; remove ensure_materialize_override usage and mute var; materialization test uses sql.lower() directly; default config key removed.
Removed macro
macros/edr/tests/on_run_start/ensure_materialize_override.sql
Deleted ensure_materialize_override macro.
Data types, PII & cross-db tweaks
macros/utils/data_types/normalize_data_type.sql, macros/edr/system/system_utils/get_pii_columns_from_parent_model.sql, macros/utils/cross_db_utils/get_user_creation_query.sql, macros/utils/cross_db_utils/incremental_strategy.sql
Add is_data_type_of_normalized_type helper; include config.meta.tags as a tag source; refine Dremio DB mapping loop; add redshift__get_default_incremental_strategy().
Miscellaneous small edits
various files
Cleanup finally blocks, switch to elementary.* helpers, minor renames, fixture/marker additions, assertion adjustments, and other local refactors.
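
As promised in the Test results handling row, here is a sketch of what the status normalization might look like. The Fusion-side status strings are assumed examples, not read from the actual diff:

  {% macro get_normalized_test_status(test_status) %}
    {# Assumed mapping: Fusion-style statuses normalized to the classic
       dbt-core values the package already expects #}
    {% set fusion_status_mapping = {
      "Passed": "pass",
      "Failed": "fail",
      "Warned": "warn",
      "Errored": "error"
    } %}
    {% if elementary.is_dbt_fusion() and test_status in fusion_status_mapping %}
      {% do return(fusion_status_mapping[test_status]) %}
    {% endif %}
    {% do return(test_status) %}
  {% endmacro %}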

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor GH as GitHub Actions
  participant WF as CI Workflow
  participant Inst as Fusion Installer
  participant DBT as dbt CLI
  GH->>WF: start job (matrix: dbt-version, warehouse)
  alt dbt-version == "fusion"
    WF->>Inst: install dbt‑fusion
    WF->>WF: PROFILES_YML = CI_PROFILES_YML_FUSION
    WF->>DBT: use ~/.local/bin/dbt for deps/debug
    WF->>DBT: run pytest with --runner-method fusion
  else
    WF->>WF: install standard dbt
    WF->>WF: PROFILES_YML = CI_PROFILES_YML
    WF->>DBT: use standard dbt binary
    WF->>DBT: run pytest (no runner-method)
  end
sequenceDiagram
  autonumber
  participant Caller as Macro caller
  participant EDR as edr_create_table_as
  participant SQL as edr_get_create_table_as_sql
  participant Adapter as Adapter builder
  participant Exec as elementary.run_query
  participant Cmt as adapter.commit
  Caller->>EDR: edr_create_table_as(temporary, relation, sql_query, drop_first?, should_commit?)
  alt drop_first == true
    EDR->>Adapter: drop relation if exists
  end
  EDR->>SQL: request adapter-specific CREATE TABLE AS SQL
  SQL->>Adapter: build statement
  Adapter-->>EDR: return SQL statement
  EDR->>Exec: elementary.run_query(statement)
  alt should_commit == true
    EDR->>Cmt: adapter.commit()
  end
  EDR-->>Caller: return relation
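
Read as code, the flow in this second diagram suggests a macro of roughly the following shape. The signature and the edr_get_create_table_as_sql builder name come from the diagram; everything else is a sketch, and a review comment further down argues for dbt.commit() over adapter.commit():

  {% macro edr_create_table_as(temporary, relation, sql_query, drop_first=false, should_commit=false) %}
    {% if drop_first %}
      {# Drop the relation first if requested #}
      {% do adapter.drop_relation(relation) %}
    {% endif %}
    {# Build the adapter-specific CREATE TABLE AS statement #}
    {% set create_sql = elementary.edr_get_create_table_as_sql(temporary, relation, sql_query) %}
    {% do elementary.run_query(create_sql) %}
    {% if should_commit %}
      {# Per the diagram; a review comment below suggests dbt.commit() instead #}
      {% do adapter.commit() %}
    {% endif %}
    {% do return(relation) %}
  {% endmacro %}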

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes


Suggested reviewers

  • ofek1weiss

Poem

In a burrow of macros I hop and I hum,
I wire Fusion paths where old calls felt glum.
edr_create_table_as builds with a hop and a spin,
tests skip or adapt while serializers grin.
Carrots for CI, and a rabbitly win! 🥕🐇

Pre-merge checks

✅ Passed checks (3 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The title clearly indicates the primary purpose of the PR—adding dbt-fusion support—and is concise and specific to the changeset.
  • Docstring Coverage (✅ Passed): No functions found in the changes. Docstring coverage check skipped.

@github-actions (Contributor) commented:

👋 @haritamar
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai bot left a comment

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e4842d and 3c86bf8.

📒 Files selected for processing (22)
  • .github/workflows/test-warehouse.yml (2 hunks)
  • .gitignore (1 hunks)
  • integration_tests/dbt_project/.gitignore (1 hunks)
  • integration_tests/dbt_project/dbt_project.yml (0 hunks)
  • integration_tests/dbt_project/macros/materializations.sql (1 hunks)
  • integration_tests/dbt_project/models/exposures.yml (2 hunks)
  • integration_tests/dbt_project/packages.yml (1 hunks)
  • integration_tests/deprecated_tests/dbt_project.yml (0 hunks)
  • integration_tests/deprecated_tests/macros/system/materializations.sql (1 hunks)
  • integration_tests/tests/conftest.py (5 hunks)
  • integration_tests/tests/dbt_project.py (2 hunks)
  • integration_tests/tests/env.py (1 hunks)
  • macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1 hunks)
  • macros/edr/system/hooks/on_run_start.sql (0 hunks)
  • macros/edr/system/system_utils/get_config_var.sql (0 hunks)
  • macros/edr/system/system_utils/is_dbt_fusion.sql (1 hunks)
  • macros/edr/tests/on_run_start/ensure_materialize_override.sql (0 hunks)
  • macros/utils/graph/get_run_result_dict.sql (2 hunks)
  • macros/utils/run_queries/agate_to_dict.sql (1 hunks)
  • macros/utils/run_queries/agate_to_dicts.sql (1 hunks)
  • models/run_results.yml (1 hunks)
  • package-lock.yml (1 hunks)
💤 Files with no reviewable changes (5)
  • integration_tests/dbt_project/dbt_project.yml
  • macros/edr/system/hooks/on_run_start.sql
  • integration_tests/deprecated_tests/dbt_project.yml
  • macros/edr/system/system_utils/get_config_var.sql
  • macros/edr/tests/on_run_start/ensure_materialize_override.sql
🧰 Additional context used
🧬 Code Graph Analysis (2)
integration_tests/tests/env.py (2)
integration_tests/tests/conftest.py (3)
  • dbt_project (132-135)
  • target (139-140)
  • runner_method (154-164)
integration_tests/tests/dbt_project.py (1)
  • get_dbt_runner (33-42)
integration_tests/tests/dbt_project.py (1)
integration_tests/tests/conftest.py (2)
  • target (139-140)
  • runner_method (154-164)
🔇 Additional comments (24)
.gitignore (1)

27-27: Addition looks good

Ignoring dbt_internal_packages/ is consistent with dbt-managed dependencies and keeps the repo clean.

integration_tests/dbt_project/.gitignore (1)

3-3: Consistent ignore pattern

Same note as for the root .gitignore – the addition is correct and harmless.

.github/workflows/test-warehouse.yml (1)

24-25: Temporary branch defaults should be removed before merge

Both inputs now default to the feature branch ele-4724-dbt_fusion-support and are tagged # TEMPORARY.
If this hits main, future workflow runs will silently pin to a branch that will eventually be deleted, breaking CI.

Diff to restore the previous behaviour (empty defaults) once the feature is merged:

-        default: ele-4724-dbt_fusion-support # TEMPORARY
+        # default intentionally left blank – callers should provide a ref

Apply the same change for the second occurrence.

Also applies to: 46-47

integration_tests/dbt_project/macros/materializations.sql (1)

3-6: Direct macro calls simplify the materialization – nice cleanup

Switching from .call_macro() to direct calls is clearer and avoids the extra invocation layer. Verified that the call signatures stay the same.

Also applies to: 11-17

integration_tests/deprecated_tests/macros/system/materializations.sql (1)

2-2: LGTM: Simplified macro invocation pattern

The removal of .call_macro() wrappers standardizes the macro calling convention and makes the code more direct and readable.

Also applies to: 6-6

macros/edr/system/system_utils/is_dbt_fusion.sql (1)

1-7: LGTM: Clean implementation for fusion detection

The logic correctly identifies dbt fusion versions by checking if the major version is greater than 1. The implementation assumes dbt_version follows semantic versioning format, which is a reasonable assumption.
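
For reference, a macro matching that description would be roughly the following; this is a sketch inferred from the comment, not the exact source:

  {% macro is_dbt_fusion() %}
    {# dbt-fusion reports a 2.x dbt_version, while dbt-core remains on 1.x #}
    {% set major_version = dbt_version.split(".")[0] | int %}
    {% do return(major_version > 1) %}
  {% endmacro %}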

integration_tests/tests/env.py (1)

1-2: LGTM: Proper integration of runner_method parameter

The addition of the runner_method parameter with proper typing and default value maintains backward compatibility while enabling the new fusion runner functionality. The parameter is correctly passed through to the dbt runner initialization.

Also applies to: 8-16

integration_tests/dbt_project/packages.yml (1)

4-4: LGTM: Simplified version specification

The change from dynamic Jinja logic to a fixed version range ">=0.8.0", "<2.0.0" simplifies the dependency specification while maintaining compatibility. The actual version pinning is handled by the package-lock.yml file for deterministic builds.

macros/utils/graph/get_run_result_dict.sql (3)

6-6: LGTM: Consistent with agate_to_dict standardization.

The replacement of run_result.to_dict() with elementary.agate_to_dict(run_result) aligns with the broader effort to standardize agate table conversions for dbt Fusion compatibility.


12-12: LGTM: Proper handling of timing item conversion.

The change to use elementary.agate_to_dict(item) for timing items is consistent with the standardization approach.


23-23: LGTM: Correct conditional conversion for node objects.

The conditional use of elementary.agate_to_dict(run_result.node) when the node exists maintains the existing logic while using the standardized conversion method.
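
A plausible shape for the agate_to_dict helper, consistent with the learning recorded later in this review (under Fusion the object is returned as-is because its structure differs):

  {% macro agate_to_dict(agate_obj) %}
    {% if elementary.is_dbt_fusion() %}
      {# Under Fusion the object is already dict-like, so return it unchanged #}
      {% do return(agate_obj) %}
    {% endif %}
    {% do return(agate_obj.to_dict()) %}
  {% endmacro %}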

integration_tests/tests/dbt_project.py (3)

11-11: LGTM: Proper import addition for RunnerMethod.

The import of RunnerMethod is correctly added to support the new runner method parameter.


33-42: LGTM: Well-structured function signature update.

The addition of the optional runner_method parameter to get_dbt_runner is properly typed and maintains backward compatibility. The parameter is correctly passed through to the create_dbt_runner call.


46-52: LGTM: Consistent constructor parameter propagation.

The DbtProject constructor properly accepts and propagates the runner_method parameter to the get_dbt_runner call, maintaining the parameter flow throughout the test infrastructure.

macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1)

131-131: LGTM: Consistent agate_to_dict usage in get_all_vars.

The replacement of config.vars.to_dict() with elementary.agate_to_dict(config.vars) follows the standardized pattern for agate table conversions and ensures compatibility with dbt Fusion environments.

macros/utils/run_queries/agate_to_dicts.sql (3)

2-7: LGTM: Well-implemented conditional logic for dbt Fusion support.

The conditional logic properly handles the structural differences between dbt Fusion and regular dbt environments. In Fusion environments, agate_table is used directly, while in regular environments agate_table.rows is accessed. This approach ensures compatibility across different dbt runtime environments.


14-14: LGTM: Improved column value serialization.

The use of elementary.agate_val_serialize(col_value) provides better handling of column value serialization, particularly for date-like objects that need ISO 8601 formatting.


23-28: LGTM: Useful helper macro for value serialization.

The agate_val_serialize macro provides proper serialization for date-like objects by converting them to ISO 8601 format when the year attribute is defined. This is a clean and efficient approach to handle date serialization.
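
Based on that description, the helper presumably looks something like this sketch:

  {% macro agate_val_serialize(col_value) %}
    {# Date-like values expose a "year" attribute; emit those as ISO 8601 #}
    {% if col_value is not none and col_value.year is defined %}
      {% do return(col_value.isoformat()) %}
    {% endif %}
    {% do return(col_value) %}
  {% endmacro %}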

integration_tests/tests/conftest.py (6)

4-4: LGTM: Proper imports for new functionality.

The additions of Optional, yaml, RunnerMethod, and DBT_FUSION_SUPPORTED_TARGETS provide the necessary imports and constants for the enhanced test infrastructure.

Also applies to: 7-7, 10-10, 16-16


25-25: LGTM: Command-line option for runner method selection.

The addition of --runner-method command-line option enables users to specify the runner method for tests, providing flexibility in test execution.


29-29: LGTM: Well-structured project setup modifications.

The project_dir_copy fixture properly handles the runner method parameter and calls helper functions to modify the test project setup. The integration is clean and maintains the existing fixture structure.

Also applies to: 38-41


47-76: LGTM: Comprehensive helper functions for project modification.

The helper functions _edit_packages_yml_to_include_absolute_elementary_package_path and _remove_python_tests_for_dbt_fusion are well-implemented:

  • The packages.yml editing ensures proper package path resolution
  • The Python test removal for dbt Fusion is appropriate since Fusion may not support Python models
  • Both functions include proper logging and error handling

79-86: LGTM: Proper fixture parameter propagation.

The updates to init_tests_env and dbt_project fixtures correctly accept and propagate the runner_method parameter, ensuring consistent handling throughout the test infrastructure.

Also applies to: 132-135


153-164: LGTM: Robust validation for dbt Fusion target compatibility.

The runner_method fixture properly validates that dbt Fusion is only used with supported targets (snowflake, bigquery, databricks), raising a clear error message for unsupported combinations. This prevents runtime issues and provides clear feedback to users.

@haritamar had a problem deploying to elementary_test_env July 15, 2025 19:15 — with GitHub Actions: Failure
@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/test-all-warehouses.yml (1)

63-63: Add a default field to the input declaration for consistency

elementary-ref now has an inline default at the call-site, but its definition under both workflow_dispatch and workflow_call still lacks a default: key. This creates an implicit contract that isn’t visible to consumers or to workflow-linting tools.

Adding an explicit default to the input block keeps the contract self-documented and prevents accidental divergence if someone copies the input list elsewhere.

     elementary-ref:
       type: string
       required: false
+      default: ele-4724-dbt_fusion-support
       description: Branch or tag to checkout for 'elementary' repository
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c86bf8 and c05a7cd.

📒 Files selected for processing (1)
  • .github/workflows/test-all-warehouses.yml (1 hunks)

@haritamar had a problem deploying to elementary_test_env July 15, 2025 19:25 — with GitHub Actions: Failure
{% set artifacts_hashes = {} %}
{% for artifacts_model, metadata_hashes in artifact_agate_hashes.items() %}
{% do artifacts_hashes.update({artifacts_model: metadata_hashes.columns["metadata_hash"]}) %}
{% set artifacts_hashes = elementary.agate_to_dicts(artifacts_hashes_results) %}
@haritamar (Collaborator, Author) commented:

fusion's agate doesn't support group_by, so I implemented it differently
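
The hunk above lost its diff markers during extraction; completed, the manual grouping presumably reads like this (the endfor is assumed, and the trailing line of the hunk looks like the removed pre-change code):

  {% set artifacts_hashes = {} %}
  {% for artifacts_model, metadata_hashes in artifact_agate_hashes.items() %}
    {# Build a per-model mapping of metadata hashes without agate's group_by,
       which Fusion's agate lacks #}
    {% do artifacts_hashes.update({artifacts_model: metadata_hashes.columns["metadata_hash"]}) %}
  {% endfor %}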

{% elif invocation_args_dict and invocation_args_dict.selector_name %}
{% do return(invocation_args_dict.selector_name) %}
{% elif invocation_args_dict and invocation_args_dict.INVOCATION_COMMAND %}
{% set match = modules.re.search(
@haritamar (Collaborator, Author) commented:

The selector is not available in fusion, so I'm extracting it from the invocation command with a regex.

{% set db_name_to_type = {} %}
{% for row in elementary.agate_to_dicts(elementary.run_query(dremio_databases_query)) %}
{% if row["database_name"] | lower not in configured_dbs %}
{% continue %}
@haritamar (Collaborator, Author) commented:

It seems newer versions of Fusion don't like "continue", so I reimplemented the loop without it (there's a ticket about it though).
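
For illustration, the continue-free rewrite probably wraps the body in the inverse condition, roughly like this; the update inside the if (including the database_type field) is an assumption:

  {% set db_name_to_type = {} %}
  {% for row in elementary.agate_to_dicts(elementary.run_query(dremio_databases_query)) %}
    {% if row["database_name"] | lower in configured_dbs %}
      {# Loop body now runs only for configured databases, no continue needed #}
      {% do db_name_to_type.update({row["database_name"] | lower: row["database_type"]}) %}
    {% endif %}
  {% endfor %}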

{% endmacro %}

{%- macro redshift__get_default_incremental_strategy() %}
{% do return("merge") %}
@haritamar (Collaborator, Author) commented:

Something broke with the delete+insert strategy on Redshift in fusion (which for some reason seemed to be the default there), and I didn't think it mattered too much, so I just made it merge.

I forget exactly what the issue was, but it didn't seem related to our changes.


{% macro is_data_type_of_normalized_type(data_type, normalized_type) %}
{% set data_type_list = elementary.data_type_list(normalized_type) | map('lower') %}
{% do return(data_type | lower in data_type_list) %}
@haritamar (Collaborator, Author) commented:

Made the normalization case-insensitive, as dbt-fusion seems to lower-case the types (at least in Snowflake).

{% for agate_row in rows.data %}
{% set serializable_row = {} %}
{% for col_name, col_value in agate_row.items() %}
{% set serializable_col_value = column_types[loop.index0].jsonify(col_value) %}
@haritamar (Collaborator, Author) commented:

Fusion's agate doesn't have jsonify, so I implemented serialization differently.

@haritamar marked this pull request as ready for review October 2, 2025 14:16
@coderabbitai bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
macros/edr/dbt_artifacts/upload_dbt_tests.sql (1)

14-45: Add compiled_code to dbt_tests empty table schema

compiled_code is written in upload_dbt_tests (line 180) but missing in get_dbt_tests_empty_table_query, causing a schema/dict mismatch. Add it after original_path:

 macros/edr/dbt_artifacts/upload_dbt_tests.sql
 {% set dbt_tests_empty_table_query = elementary.empty_table([
-    ('original_path', 'long_string'),
+    ('original_path', 'long_string'),
+    ('compiled_code', 'long_string'),
     ('path', 'string'),
     ('generated_at', 'string'),
     ('metadata_hash', 'string'),
     ('quality_dimension', 'string'),
     ('group_name', 'string'),
 ]) %}
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (2)

88-100: Bug: SELECT uppercase path is unreachable

The outer condition requires invocation_args_dict.select, so the inner elif for invocation_args_dict.SELECT never runs. This breaks Fusion where only SELECT may be present.

Apply:

-{%- if invocation_args_dict and invocation_args_dict.select -%}
-    {%- if invocation_args_dict.select -%}
-        {{- return(invocation_args_dict.select) -}}
-    {%- elif invocation_args_dict.SELECT -%}
-        {{- return(invocation_args_dict.SELECT) -}}
-    {%- endif -%}
+{%- if invocation_args_dict -%}
+    {%- if invocation_args_dict.select -%}
+        {{- return(invocation_args_dict.select) -}}
+    {%- elif invocation_args_dict.SELECT -%}
+        {{- return(invocation_args_dict.SELECT) -}}
+    {%- endif -%}
 {%- elif config.args and config.args.select -%}
     {{- return(config.args.select) -}}
 {%- else -%}
     {{- return([]) -}}
 {%- endif -%}

126-140: VARS handling: parse consistently like vars

If invocation_args_dict.VARS is a string (YAML), it should be parsed; otherwise you return a raw string.

-{% elif invocation_args_dict and invocation_args_dict.VARS %}
-    {% set invocation_vars = invocation_args_dict.VARS %}
+{% elif invocation_args_dict and invocation_args_dict.VARS %}
+    {% if invocation_args_dict.VARS is mapping %}
+        {% set invocation_vars = invocation_args_dict.VARS %}
+    {% else %}
+        {% set invocation_vars = fromyaml(invocation_args_dict.VARS) %}
+    {% endif %}
♻️ Duplicate comments (1)
macros/utils/table_operations/create_table_as.sql (1)

14-16: Switch to dbt.commit() for adapter portability.

adapter.commit() still raises on engines without explicit transaction support (BigQuery, Spark family), so any call with should_commit=true will blow up there. dbt.commit() wraps the adapter capability check for us—please swap to it here.

-  {% if should_commit %}
-    {% do adapter.commit() %}    
-  {% endif %}
+  {% if should_commit %}
+    {% do dbt.commit() %}
+  {% endif %}
🧹 Nitpick comments (10)
.gitignore (1)

28-28: Drop the duplicate ignore entry.

dbt_internal_packages/ is already ignored at Line 3, so this repeat line is redundant. Please remove it to keep the list tidy.

integration_tests/tests/test_disable_samples_config.py (1)

141-164: Clarify the test's purpose in the docstring.

The docstring states "Test that disable_test_samples excludes only the disabled column", but the test actually verifies that disable_test_samples: True prevents all sampling (asserting len(samples) == 0). The disable_test_samples configuration is test-level, not column-level.

Consider updating the docstring to accurately reflect the test's behavior:

-    """Test that disable_test_samples excludes only the disabled column"""
+    """Test that disable_test_samples prevents all sampling when set to True"""
integration_tests/tests/conftest.py (1)

165-176: LGTM! Consider refactoring error message.

The fixture correctly validates the runner method and target compatibility. The static analysis tool suggests moving the long error message outside the exception class for better maintainability.

Apply this diff to address the Ruff hint:

+FUSION_UNSUPPORTED_TARGET_ERROR = "Fusion runner is not supported for target: {}"
+
 @pytest.fixture(scope="session")
 def runner_method(request, target: str) -> Optional[RunnerMethod]:
     runner_method_str = request.config.getoption("--runner-method")
     if runner_method_str:
         runner_method = RunnerMethod(runner_method_str)
         if (
             runner_method == RunnerMethod.FUSION
             and target not in DBT_FUSION_SUPPORTED_TARGETS
         ):
-            raise ValueError(f"Fusion runner is not supported for target: {target}")
+            raise ValueError(FUSION_UNSUPPORTED_TARGET_ERROR.format(target))
         return runner_method
     return None
.github/workflows/test-all-warehouses.yml (1)

59-66: Prevent duplicate matrix jobs; hardcode fusion in include.

Using "${{ inputs.dbt-version || 'fusion' }}" here will mirror any provided input version and duplicate jobs already produced by the main axis. If the goal is “always add Fusion runs,” set dbt-version to the literal fusion.

-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: fusion
             warehouse-type: snowflake
-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: fusion
             warehouse-type: bigquery
-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: fusion
             warehouse-type: redshift
-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: fusion
             warehouse-type: databricks_catalog
macros/edr/dbt_artifacts/upload_dbt_tests.sql (3)

61-63: Meta precedence: confirm intended override order (node meta > config meta)

You build unified_meta by updating with config_meta_dict, then meta_dict, so node meta overrides config meta. Confirm this is desired for tests in Fusion and classic. If config meta should take precedence, reverse the update order.


65-72: Avoid persisting description inside meta to prevent duplication

Currently, description is popped from unified_meta only when sourced from unified_meta. If dbt_version >= 1.9 and node_dict.description is used, any description lingering in unified_meta will still be written to meta, duplicating the value in two fields.

Pop description from unified_meta unconditionally after selecting the final description:

     {% set description = none %}
     {% if dbt_version >= '1.9.0' and node_dict.get('description') %}
         {% set description = node_dict.get('description') %}
-    {% elif unified_meta.get('description') %}
-        {% set description = unified_meta.pop('description') %}
+    {% elif unified_meta.get('description') %}
+        {% set description = unified_meta.pop('description') %}
     {% elif default_description %}
         {% set description = default_description %}
     {% endif %}
+    {# Ensure meta doesn't redundantly store description #}
+    {% if 'description' in unified_meta %}
+        {% do unified_meta.pop('description') %}
+    {% endif %}

Also applies to: 68-69


76-79: Normalize meta.tags to a list before union

meta.tags might be provided as a string. Normalize to a list to keep union_lists robust.

-    {% set meta_tags = elementary.safe_get_with_default(unified_meta, 'tags', []) %}
-    {% set tags = elementary.union_lists(config_tags, global_tags) %}
-    {% set tags = elementary.union_lists(tags, meta_tags) %}
+    {% set raw_meta_tags = elementary.safe_get_with_default(unified_meta, 'tags', []) %}
+    {% set meta_tags = [] %}
+    {% if raw_meta_tags %}
+      {% if raw_meta_tags is string %}
+        {% set meta_tags = [raw_meta_tags] %}
+      {% elif raw_meta_tags is sequence %}
+        {% set meta_tags = raw_meta_tags %}
+      {% endif %}
+    {% endif %}
+    {% set tags = elementary.union_lists(config_tags, global_tags) %}
+    {% set tags = elementary.union_lists(tags, meta_tags) %}
macros/utils/run_queries/run_query.sql (1)

23-33: Snowflake semicolon strip is appropriate

Trimming and removing a trailing semicolon avoids Fusion issues. Consider applying similar trailing-semicolon stripping in the default formatter for consistency across adapters.
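
A sketch of the suggested generalization, with the default formatter macro name assumed:

  {% macro default__edr_format_query(sql_query) %}
    {# Strip surrounding whitespace and a trailing semicolon in the default
       formatter too, not only in the Snowflake one #}
    {% set trimmed = sql_query.strip() %}
    {% if trimmed.endswith(";") %}
      {% set trimmed = trimmed[:-1] %}
    {% endif %}
    {% do return(trimmed) %}
  {% endmacro %}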

macros/utils/run_queries/agate_to_dicts.sql (1)

22-46: Decimal-to-float precision loss risk

Converting non-integer Decimals to float can lose precision. Prefer preserving as string (or keep Decimal for downstream) to avoid data drift in artifacts.

Apply this change to serialize non-integer decimals as strings:

{% macro edr_serialize_decimal(val) %}
   {% set dec_tuple = val.normalize().as_tuple() %}
 
   {# A hacky way to standardize Decimals which are not JSON-serializable #}
   {% if dec_tuple[2] == 0 %}
     {% do return(val | string | int) %}
   {% else %}
-    {% do return(val | string | float) %}
+    {# preserve precision #}
+    {% do return(val | string) %}
   {% endif %}
{% endmacro %}
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1)

109-118: Selector regex: support “--selector=val” and quoted values

Current pattern matches only “--selector ”. Support equals form and quotes.

-{% set match = modules.re.search(
-    "--selector\s+(\S+)",
-    invocation_args_dict.INVOCATION_COMMAND
-) %}
+{% set match = modules.re.search(
+    r"--selector(?:=|\s+)(?:['\"])?(?P<sel>[^'\s\"]+)",
+    invocation_args_dict.INVOCATION_COMMAND
+) %}
-{% if match %}
-    {% do return(match.group(1)) %}
+{% if match %}
+    {% do return(match.group('sel')) %}
 {% else %}
     {% do return(none) %}
 {% endif %}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7f84089 and 9fe3c53.

📒 Files selected for processing (75)
  • .github/workflows/test-all-warehouses.yml (1 hunks)
  • .github/workflows/test-warehouse.yml (1 hunks)
  • .gitignore (1 hunks)
  • integration_tests/dbt_project/.gitignore (1 hunks)
  • integration_tests/dbt_project/dbt_project.yml (0 hunks)
  • integration_tests/dbt_project/macros/create_all_types_table.sql (4 hunks)
  • integration_tests/dbt_project/macros/materializations.sql (1 hunks)
  • integration_tests/dbt_project/macros/replace_empty_strings_with_nulls.sql (1 hunks)
  • integration_tests/dbt_project/models/exposures.yml (2 hunks)
  • integration_tests/dbt_project/models/schema.yml (2 hunks)
  • integration_tests/dbt_project/packages.yml (1 hunks)
  • integration_tests/deprecated_tests/dbt_project.yml (0 hunks)
  • integration_tests/deprecated_tests/macros/system/materializations.sql (1 hunks)
  • integration_tests/tests/conftest.py (6 hunks)
  • integration_tests/tests/dbt_project.py (8 hunks)
  • integration_tests/tests/env.py (1 hunks)
  • integration_tests/tests/pytest.ini (1 hunks)
  • integration_tests/tests/test_all_columns_anomalies.py (3 hunks)
  • integration_tests/tests/test_anomaly_test_configuration.py (1 hunks)
  • integration_tests/tests/test_collect_metrics.py (1 hunks)
  • integration_tests/tests/test_column_anomalies.py (1 hunks)
  • integration_tests/tests/test_column_pii_sampling.py (1 hunks)
  • integration_tests/tests/test_dbt_artifacts/test_artifacts.py (2 hunks)
  • integration_tests/tests/test_dbt_artifacts/test_columns.py (1 hunks)
  • integration_tests/tests/test_dbt_artifacts/test_groups.py (9 hunks)
  • integration_tests/tests/test_dimension_anomalies.py (1 hunks)
  • integration_tests/tests/test_disable_samples_config.py (6 hunks)
  • integration_tests/tests/test_exposure_schema_validity.py (7 hunks)
  • integration_tests/tests/test_failed_row_count.py (4 hunks)
  • integration_tests/tests/test_jsonschema.py (1 hunks)
  • integration_tests/tests/test_long_strings.py (3 hunks)
  • integration_tests/tests/test_python.py (1 hunks)
  • integration_tests/tests/test_volume_anomalies.py (1 hunks)
  • macros/edr/dbt_artifacts/upload_dbt_artifacts.sql (1 hunks)
  • macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
  • macros/edr/dbt_artifacts/upload_dbt_models.sql (1 hunks)
  • macros/edr/dbt_artifacts/upload_dbt_tests.sql (5 hunks)
  • macros/edr/materializations/test/test.sql (1 hunks)
  • macros/edr/system/configuration/is_elementary_enabled.sql (1 hunks)
  • macros/edr/system/hooks/on_run_end.sql (1 hunks)
  • macros/edr/system/hooks/on_run_start.sql (1 hunks)
  • macros/edr/system/system_utils/get_config_var.sql (0 hunks)
  • macros/edr/system/system_utils/get_pii_columns_from_parent_model.sql (1 hunks)
  • macros/edr/system/system_utils/get_test_model.sql (1 hunks)
  • macros/edr/system/system_utils/is_dbt_fusion.sql (1 hunks)
  • macros/edr/tests/on_run_end/handle_tests_results.sql (4 hunks)
  • macros/edr/tests/on_run_start/ensure_materialize_override.sql (0 hunks)
  • macros/edr/tests/test_ai_data_validation.sql (1 hunks)
  • macros/edr/tests/test_all_columns_anomalies.sql (2 hunks)
  • macros/edr/tests/test_collect_metrics.sql (1 hunks)
  • macros/edr/tests/test_column_anomalies.sql (2 hunks)
  • macros/edr/tests/test_dimension_anomalies.sql (2 hunks)
  • macros/edr/tests/test_event_freshness_anomalies.sql (1 hunks)
  • macros/edr/tests/test_exposure_schema_validity.sql (1 hunks)
  • macros/edr/tests/test_schema_changes.sql (2 hunks)
  • macros/edr/tests/test_schema_changes_from_baseline.sql (2 hunks)
  • macros/edr/tests/test_table_anomalies.sql (2 hunks)
  • macros/edr/tests/test_utils/create_elementary_test_table.sql (1 hunks)
  • macros/edr/tests/test_utils/get_model_relation_for_test.sql (1 hunks)
  • macros/edr/tests/test_utils/validate_unique_metric_names.sql (1 hunks)
  • macros/utils/cross_db_utils/get_user_creation_query.sql (1 hunks)
  • macros/utils/cross_db_utils/incremental_strategy.sql (1 hunks)
  • macros/utils/data_types/normalize_data_type.sql (1 hunks)
  • macros/utils/graph/get_result_node.sql (1 hunks)
  • macros/utils/graph/get_run_result_dict.sql (2 hunks)
  • macros/utils/run_queries/agate_to_dict.sql (1 hunks)
  • macros/utils/run_queries/agate_to_dicts.sql (1 hunks)
  • macros/utils/run_queries/run_query.sql (1 hunks)
  • macros/utils/table_operations/create_intermediate_relation.sql (1 hunks)
  • macros/utils/table_operations/create_or_replace.sql (1 hunks)
  • macros/utils/table_operations/create_table_as.sql (1 hunks)
  • macros/utils/table_operations/create_table_like.sql (1 hunks)
  • macros/utils/table_operations/create_temp_table.sql (1 hunks)
  • macros/utils/table_operations/replace_table_data.sql (1 hunks)
  • package-lock.yml (1 hunks)
💤 Files with no reviewable changes (4)
  • integration_tests/dbt_project/dbt_project.yml
  • macros/edr/tests/on_run_start/ensure_materialize_override.sql
  • macros/edr/system/system_utils/get_config_var.sql
  • integration_tests/deprecated_tests/dbt_project.yml
✅ Files skipped from review due to trivial changes (2)
  • macros/edr/materializations/test/test.sql
  • package-lock.yml
🚧 Files skipped from review as they are similar to previous changes (14)
  • macros/edr/tests/test_utils/validate_unique_metric_names.sql
  • macros/utils/table_operations/replace_table_data.sql
  • macros/edr/tests/test_dimension_anomalies.sql
  • integration_tests/dbt_project/packages.yml
  • macros/edr/system/system_utils/is_dbt_fusion.sql
  • integration_tests/tests/env.py
  • macros/edr/system/system_utils/get_test_model.sql
  • macros/utils/run_queries/agate_to_dict.sql
  • macros/utils/table_operations/create_table_like.sql
  • integration_tests/dbt_project/models/exposures.yml
  • integration_tests/dbt_project/macros/create_all_types_table.sql
  • macros/utils/table_operations/create_or_replace.sql
  • macros/edr/tests/test_schema_changes_from_baseline.sql
  • integration_tests/dbt_project/.gitignore
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
📚 Learning: 2025-09-04T09:14:40.621Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/edr/tests/test_dimension_anomalies.sql:73-75
Timestamp: 2025-09-04T09:14:40.621Z
Learning: In the dbt-data-reliability codebase, `flatten_model` is a valid macro defined in `macros/edr/dbt_artifacts/upload_dbt_models.sql` and can be used as a parameter to `elementary.get_anomaly_query()` even though other test files use the `flattened_test` variable instead.

Applied to files:

  • macros/edr/dbt_artifacts/upload_dbt_models.sql
  • macros/edr/tests/test_column_anomalies.sql
  • macros/edr/tests/test_table_anomalies.sql
  • macros/edr/tests/test_all_columns_anomalies.sql
📚 Learning: 2025-07-22T15:41:57.338Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.

Applied to files:

  • macros/edr/system/hooks/on_run_end.sql
  • macros/utils/table_operations/create_table_as.sql
📚 Learning: 2025-07-15T19:28:35.548Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/run_queries/agate_to_dict.sql:1-7
Timestamp: 2025-07-15T19:28:35.548Z
Learning: In dbt-data-reliability, the `agate_to_dict` macro in `macros/utils/run_queries/agate_to_dict.sql` intentionally returns different types based on fusion/non-fusion mode to handle breaking changes in dbt Fusion. In fusion mode, it returns the original `agate_table`, while in non-fusion mode it returns `agate_table.to_dict()`. This is because the content/structure of `agate_table` is different between the two modes.

Applied to files:

  • macros/utils/run_queries/agate_to_dicts.sql
🧬 Code graph analysis (15)
integration_tests/tests/test_long_strings.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_python.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_dimension_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
  • test (110-128)
  • test (131-149)
  • test (151-257)
integration_tests/tests/test_dbt_artifacts/test_artifacts.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_all_columns_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
  • test (110-128)
  • test (131-149)
  • test (151-257)
integration_tests/tests/test_failed_row_count.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_exposure_schema_validity.py (2)
integration_tests/tests/conftest.py (3)
  • skip_for_dbt_fusion (123-126)
  • dbt_project (144-147)
  • target (151-152)
integration_tests/tests/dbt_project.py (4)
  • DbtProject (46-334)
  • test (110-128)
  • test (131-149)
  • test (151-257)
integration_tests/tests/test_jsonschema.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_anomaly_test_configuration.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_collect_metrics.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/test_volume_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
  • test (110-128)
  • test (131-149)
  • test (151-257)
integration_tests/tests/test_column_anomalies.py (1)
integration_tests/tests/dbt_project.py (3)
  • test (110-128)
  • test (131-149)
  • test (151-257)
integration_tests/tests/dbt_project.py (1)
integration_tests/tests/conftest.py (2)
  • target (151-152)
  • runner_method (166-176)
integration_tests/tests/test_dbt_artifacts/test_groups.py (1)
integration_tests/tests/conftest.py (1)
  • skip_for_dbt_fusion (123-126)
integration_tests/tests/conftest.py (2)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/env.py (1)
  • Environment (7-26)
🪛 Ruff (0.13.2)
integration_tests/tests/conftest.py

174-174: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_official, redshift) / test
🔇 Additional comments (66)
integration_tests/tests/test_column_pii_sampling.py (1)

373-373: Validate nested meta tags support or update test
I didn’t find any implementation references to config.meta.tags; ensure the code supports nested meta tags or update this test to use config.tags.

integration_tests/dbt_project/macros/materializations.sql (1)

1-20: LGTM! Simplified macro invocation pattern.

Replacing .call_macro() with direct function calls is a cleaner approach and aligns with Fusion compatibility requirements. The logic and control flow remain unchanged.

macros/edr/tests/test_utils/get_model_relation_for_test.sql (1)

7-15: LGTM! Fusion-aware early-exit path for single-dependency tests.

The new logic correctly checks for a single dependency in test_node.depends_on.nodes, retrieves the node, derives the relation, and returns it if found. This improves reliability and aligns with Fusion's model resolution needs. The fallback to the existing string-based logic is preserved.
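
As a sketch of that fast path (the adapter.get_relation arguments and fall-through are assumptions based on this comment):

  {% set depends_on_nodes = test_node.depends_on.nodes %}
  {% if depends_on_nodes | length == 1 %}
    {% set dep_node = elementary.get_node(depends_on_nodes[0]) %}
    {% if dep_node %}
      {# Derive the relation directly from the single dependency #}
      {% set relation = adapter.get_relation(dep_node.database, dep_node.schema, dep_node.alias or dep_node.name) %}
      {% if relation %}
        {% do return(relation) %}
      {% endif %}
    {% endif %}
  {% endif %}
  {# otherwise fall through to the existing string-based resolution #}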

integration_tests/tests/test_column_anomalies.py (1)

115-138: LGTM! Cleaner test configuration with explicit test_config.

The updates replace mutable parameter dictionaries with direct use of DBT_TEST_ARGS and introduce test_config for where-clause filtering. This approach is more explicit and aligns with the broader test infrastructure changes.

integration_tests/deprecated_tests/macros/system/materializations.sql (1)

1-8: LGTM! Consistent with updated invocation pattern.

Direct function calls replace .call_macro() invocations, matching the pattern applied elsewhere. The logic is unchanged and the approach is cleaner.

integration_tests/tests/test_volume_anomalies.py (1)

77-100: LGTM! Explicit test configuration aligns with updated test API.

The changes use DBT_TEST_ARGS directly and pass test_config for where-clause filtering, matching the pattern applied in related tests. This approach is cleaner and more maintainable.

integration_tests/tests/test_dimension_anomalies.py (1)

147-166: LGTM! Consistent test configuration pattern.

The updates use DBT_TEST_ARGS directly and introduce test_config for where-clause filtering, aligning with the pattern applied across the test suite. The logic is clear and correct.

.github/workflows/test-warehouse.yml (4)

133-137: LGTM! Fusion installation step is correct.

The remote installer invocation is standard for Fusion setup and aligns with the conditional logic guarding the step.


150-156: LGTM! Fusion-specific dependency handling is correct.

The conditional installation of dependencies, symlinking of the Elementary package, and requirements installation for Fusion are logically sound and correctly scoped.


160-160: LGTM! Fusion-specific dbt binary path is correct.

The conditional path selection (~/.local/bin/dbt for Fusion, dbt otherwise) is correct and aligns with Fusion's installation location.


164-164: LGTM! Fusion runner-method flag is correct.

The conditional --runner-method fusion flag is correctly applied only when Fusion is the selected dbt version.

integration_tests/tests/dbt_project.py (5)

34-43: LGTM! Correct runner_method plumbing.

The get_dbt_runner function now accepts and forwards runner_method to create_dbt_runner, enabling Fusion-specific runner selection. The logic is correct.


47-56: LGTM! DbtProject correctly stores runner_method.

The constructor now accepts and stores runner_method, and passes it to get_dbt_runner. This enables Fusion-specific behavior downstream.


188-201: LGTM! Correct test_config integration.

The test_yaml construction now includes an optional config block from test_config, and the structure is correctly applied to both table-level and column-level test paths. The logic is sound.


265-272: LGTM! Reasonable workaround for Fusion BigQuery seed quirk.

The _fix_seed_if_needed method applies a Fusion-specific fix for BigQuery's seed behavior (empty strings instead of nulls). The conditional check is correct and the workaround is scoped appropriately.
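
The replace_empty_strings_with_nulls macro this fix presumably calls might look like the following sketch; the column introspection and quoting details are assumptions:

  {% macro replace_empty_strings_with_nulls(relation) %}
    {% for column in adapter.get_columns_in_relation(relation) %}
      {% if column.is_string() %}
        {# Null out empty strings left behind by Fusion's BigQuery seeding #}
        {% do elementary.run_query("update " ~ relation ~ " set " ~ column.name ~ " = null where " ~ column.name ~ " = ''") %}
      {% endif %}
    {% endfor %}
  {% endmacro %}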


330-334: LGTM! Safer cleanup with try/finally.

Wrapping path.unlink() in a finally block ensures the temporary file is cleaned up even if an exception occurs during the yield. This is a good reliability improvement.

integration_tests/tests/conftest.py (8)

16-21: LGTM!

The constant clearly defines supported Fusion targets.


30-30: LGTM!

The CLI option is properly defined.


34-49: LGTM!

The fixture properly threads runner_method through and invokes the new helper functions to prepare the test environment.


52-67: Verify packages.yml structure assumptions.

The code assumes packages_yml["packages"][0]["local"] exists. If the packages.yml file is malformed or missing the expected structure, this will raise a KeyError or IndexError.

Consider adding defensive checks or documenting the expected structure:

     packages_yml_path = Path(project_dir_copy) / "packages.yml"
     with packages_yml_path.open("r") as packages_yml_file:
         packages_yml = yaml.safe_load(packages_yml_file)
 
+    if not packages_yml.get("packages") or not packages_yml["packages"]:
+        raise ValueError("packages.yml must contain a non-empty 'packages' list")
+    if "local" not in packages_yml["packages"][0]:
+        raise ValueError("First package must have a 'local' key")
+
     packages_yml["packages"][0]["local"] = str(
         (DBT_PROJECT_PATH / packages_yml["packages"][0]["local"]).resolve()
     )

70-80: LGTM!

The function correctly removes Python test files only when the runner method is Fusion.


84-90: LGTM!

The fixture properly passes runner_method to the Environment initialization.


122-126: LGTM!

The autouse fixture correctly implements the skip logic for Fusion tests.


144-147: LGTM!

The fixture properly passes runner_method to DbtProject initialization.

integration_tests/tests/test_all_columns_anomalies.py (2)

62-62: LGTM!

The function name now accurately reflects that it tests the where parameter.


96-128: LGTM!

The test correctly uses test_config to pass the where clause, aligning with the updated test infrastructure pattern.

integration_tests/tests/test_dbt_artifacts/test_columns.py (1)

62-62: LGTM!

Changing to set equality makes the assertion order-insensitive, which is more robust and appropriate when the order of columns doesn't matter.

integration_tests/tests/pytest.ini (1)

6-6: LGTM!

The marker is properly declared for use across the test suite.

integration_tests/tests/test_long_strings.py (2)

1-1: LGTM!

The pytest import is necessary for the skip marker.


21-21: LGTM!

The skip markers are correctly applied to exclude these tests from Fusion runs.

Also applies to: 40-40

integration_tests/tests/test_collect_metrics.py (1)

153-153: LGTM!

The skip marker is correctly applied to exclude this test from Fusion runs.

integration_tests/tests/test_anomaly_test_configuration.py (1)

117-117: LGTM!

The skip marker is correctly applied to exclude this test from Fusion runs.

integration_tests/tests/test_python.py (1)

13-13: LGTM!

The class-level skip marker appropriately excludes all Python tests from Fusion runs, as Python tests are not supported in Fusion.

macros/edr/tests/test_collect_metrics.sql (1)

29-29: LGTM! Test model resolution updated for Fusion compatibility.

The change from context["model"] to elementary.get_test_model() aligns with the PR's broader effort to support dbt Fusion. The abstraction ensures consistent test model resolution across different execution contexts.
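
The accessor is presumably a thin wrapper along these lines; the Fusion branch here is a guess, since the PR only states that context["model"] is unavailable there:

  {% macro get_test_model() %}
    {% if elementary.is_dbt_fusion() %}
      {# context["model"] is not available under Fusion; the builtin
         `model` variable is assumed to be the replacement #}
      {% do return(model) %}
    {% endif %}
    {% do return(context["model"]) %}
  {% endmacro %}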

integration_tests/tests/test_jsonschema.py (1)

17-17: LGTM! Fusion skip marker applied correctly.

The @pytest.mark.skip_for_dbt_fusion decorator correctly gates the entire TestJsonschema class from executing under Fusion contexts, aligning with the project-wide Fusion skip mechanism.

macros/edr/tests/test_column_anomalies.sql (2)

4-4: LGTM! Model relation resolution updated for Fusion.

Consistent with the PR-wide pattern, replacing context["model"] with elementary.get_test_model() ensures proper test model resolution in Fusion environments.


95-95: LGTM! Test flattening updated for Fusion compatibility.

The flatten_test call now uses elementary.get_test_model() instead of context["model"], aligning with the broader Fusion support changes.

Based on learnings

integration_tests/tests/test_failed_row_count.py (4)

9-9: LGTM! Fusion skip applied consistently.

The @pytest.mark.skip_for_dbt_fusion marker is correctly applied, ensuring this test skips under Fusion contexts alongside the existing ClickHouse skip.


27-27: LGTM! Fusion skip marker applied.

Correctly gates test_sum_failed_row_count from Fusion execution.


47-47: LGTM! Fusion skip applied consistently.

The decorator is correctly positioned to skip this test under Fusion, complementing the existing ClickHouse skip.


67-67: LGTM! Fusion skip marker applied.

Correctly gates test_warn_if_0 from Fusion execution.

macros/edr/tests/test_exposure_schema_validity.sql (1)

19-19: LGTM! Model resolution updated for Fusion compatibility.

The change to elementary.get_test_model() ensures proper test model resolution in Fusion environments, consistent with other test macros in this PR.

integration_tests/tests/test_exposure_schema_validity.py (1)

12-12: LGTM! Fusion skip markers applied consistently.

All eight exposure schema validity test functions are correctly decorated with @pytest.mark.skip_for_dbt_fusion, ensuring they skip under Fusion contexts.

macros/edr/tests/test_ai_data_validation.sql (1)

4-4: LGTM! Model resolution updated for Fusion compatibility.

The change to elementary.get_test_model() aligns with the PR-wide pattern for supporting dbt Fusion.

macros/edr/tests/test_schema_changes.sql (2)

4-4: LGTM! Model relation resolution updated for Fusion.

The change to elementary.get_test_model() ensures proper test model resolution in Fusion environments.


33-33: LGTM! Test flattening updated for Fusion compatibility.

The flatten_test call now uses elementary.get_test_model(), aligning with the broader Fusion support changes.

integration_tests/tests/test_dbt_artifacts/test_artifacts.py (2)

99-99: LGTM!

The skip_for_dbt_fusion marker is correctly applied to skip this test for Fusion runtime, where source freshness results are not yet supported.


119-127: LGTM!

The config structure has been correctly updated to nest loaded_at_field and freshness under a config block, aligning with dbt's standard configuration patterns for sources.

macros/edr/tests/test_table_anomalies.sql (2)

4-4: LGTM!

The test model resolution has been correctly updated to use the centralized elementary.get_test_model() helper, improving Fusion compatibility and making test model access explicit.


85-85: LGTM!

The flatten_test call has been correctly updated to use elementary.get_test_model(), maintaining consistency with the broader refactor pattern across test macros.

macros/edr/tests/test_event_freshness_anomalies.sql (1)

4-4: LGTM!

The test model resolution has been correctly updated to use elementary.get_test_model(), maintaining consistency with the PR-wide refactor for Fusion compatibility.

integration_tests/tests/test_dbt_artifacts/test_groups.py (2)

91-91: LGTM!

The skip_for_dbt_fusion markers are selectively applied to tests that are not yet supported in Fusion runtime (model groups, two groups, and snapshot groups), while other artifact tests (test, seed) remain enabled for Fusion.

Also applies to: 166-166, 462-462


112-114: LGTM!

The config structure has been consistently updated across all artifact types (models, seeds, snapshots) to nest group under a config block, aligning with dbt's standard configuration patterns.

Also applies to: 188-190, 195-197, 274-276, 347-349, 413-415, 487-489

macros/edr/tests/test_all_columns_anomalies.sql (1)

4-4: LGTM!

The test model resolution has been correctly updated to use elementary.get_test_model() in both get_model_relation_for_test and flatten_test calls, maintaining consistency with the PR-wide refactor for Fusion compatibility.

Also applies to: 96-96

macros/edr/dbt_artifacts/upload_dbt_artifacts.sql (1)

54-62: LGTM!

The manual aggregation correctly replaces group_by functionality to work around Fusion's agate limitations. The logic properly builds a per-model mapping of metadata hashes, and the return structure matches caller expectations with appropriate edge-case handling.
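A sketch of that pattern with illustrative key names (unique_id and metadata_hash are assumptions here, not necessarily the macro's real columns): rows are accumulated into a plain dict keyed by model, which behaves the same under dbt-core and Fusion, instead of relying on agate's group_by:

{% macro group_metadata_hashes_by_model(rows) %}
  {% set hashes_by_model = {} %}
  {% for row in rows %}
    {% set model_id = row["unique_id"] %}
    {% if model_id not in hashes_by_model %}
      {% do hashes_by_model.update({model_id: []}) %}
    {% endif %}
    {% do hashes_by_model[model_id].append(row["metadata_hash"]) %}
  {% endfor %}
  {{ return(hashes_by_model) }}
{% endmacro %}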

macros/edr/system/configuration/is_elementary_enabled.sql (1)

2-3: Confirm compatibility with adapters lacking databases.

elementary.get_package_database_and_schema()[0] can be None for supported adapters that only define schemas (e.g., Spark without Unity Catalog). In that case this macro would report Elementary as disabled even though it should remain active. Please confirm that the helper never returns (None, <schema>), or broaden the check to fall back to the schema as well.
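A sketch of the broadened check suggested above, assuming the helper returns a (database, schema) pair; everything beyond the names mentioned in this comment is illustrative:

{% macro is_elementary_enabled() %}
  {% set db_and_schema = elementary.get_package_database_and_schema() %}
  {# Fall back to the schema so schema-only adapters are not reported disabled. #}
  {{ return(db_and_schema[0] is not none or db_and_schema[1] is not none) }}
{% endmacro %}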

.github/workflows/test-all-warehouses.yml (1)

59-66: Fusion path fully supported in test-warehouse.yml. The workflow conditionally installs dbt-fusion, selects the fusion profiles YML, and applies fusion-specific binary paths and test flags.

macros/edr/dbt_artifacts/upload_dbt_tests.sql (3)

136-137: Group resolution LGTM

Resolution order (config > node > tested model) is sensible and aligns with expected precedence.

Also applies to: 145-145
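A sketch of that precedence, assuming dict-like nodes; the macro name and arguments are illustrative, not the package's actual signature:

{% macro resolve_test_group(test_node, tested_model_node) %}
  {# config takes precedence, then the test node itself, then the tested model. #}
  {% set group = test_node.get("config", {}).get("group") or test_node.get("group") %}
  {% if not group and tested_model_node %}
    {% set group = tested_model_node.get("group") %}
  {% endif %}
  {{ return(group) }}
{% endmacro %}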


169-170: Unified meta + quality_dimension + group_name inclusion LGTM

Writing unified_meta into meta, preferring meta.quality_dimension, and adding group_name look correct.

Also applies to: 183-185


194-195: generic_tests path handling LGTM

Treating generic_tests/ as generic matches Fusion layouts.

macros/edr/dbt_artifacts/upload_dbt_models.sql (1)

71-71: Checksum guard looks good

The mapping check prevents crashes when checksum is not a dict; safe fallback to null is appropriate.

macros/utils/graph/get_run_result_dict.sql (1)

6-7: Fusion-aware serialization switch

Using elementary.agate_to_dict for run_result, timing items, and node aligns with Fusion’s differing object shapes. Callers must accept dicts in classic dbt and original objects in Fusion. Looks correct.

Also applies to: 12-13, 23-24
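For reference, a sketch of the mode-dependent helper described in the learnings attached to this review; the fusion check (is_dbt_fusion) is an illustrative name, not necessarily the package's actual macro:

{% macro agate_to_dict(agate_table) %}
  {% if elementary.is_dbt_fusion() %}
    {# Under Fusion the object is already dict-like, so return it unchanged. #}
    {{ return(agate_table) }}
  {% else %}
    {{ return(agate_table.to_dict()) }}
  {% endif %}
{% endmacro %}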

macros/utils/table_operations/create_intermediate_relation.sql (1)

2-12: Explicit incorporate path avoids Fusion None-path pitfalls

Creating the intermediate relation via edr_make_intermediate_relation then incorporating with an explicit path is a good cross-adapter fix.
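A sketch of the explicit-path idea using dbt's public api.Relation.create; the package's internal edr_make_intermediate_relation is not shown here, so treat this as an approximation:

{% macro make_intermediate_relation_with_explicit_path(base_relation, suffix="__tmp") %}
  {# Build the intermediate relation with every path component spelled out, #}
  {# so Fusion never has to infer a None database or schema.                #}
  {{ return(api.Relation.create(
      database=base_relation.database,
      schema=base_relation.schema,
      identifier=base_relation.identifier ~ suffix,
      type="table"
  )) }}
{% endmacro %}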

macros/edr/tests/on_run_end/handle_tests_results.sql (2)

28-51: Status normalization is solid

Anomaly detection adjustment and Fusion status normalization (error->fail, success->pass) are sensible and minimal.
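The mapping itself can be as small as this sketch (the macro name is illustrative):

{% macro normalize_fusion_test_status(status) %}
  {# Fusion reports generic test failures as "error" and passes as "success". #}
  {% set status_map = {"error": "fail", "success": "pass"} %}
  {{ return(status_map.get(status, status)) }}
{% endmacro %}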


141-142: Switch to edr_create_table_as

Creating temp staging via edr_create_table_as is consistent with Fusion-safe table ops.

Also applies to: 188-190

macros/utils/table_operations/create_temp_table.sql (1)

11-12: Consolidation via edr_create_table_as

Using edr_create_table_as(True, temp_table_relation, sql_query, drop_first=temp_table_exists) correctly flags a temporary table and simplifies cross-adapter handling.
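A sketch of what such a consolidated helper can look like, inferred only from the call site above; the real macro dispatches per adapter and handles dialect differences (e.g. how each warehouse spells temporary tables), and run_query below is dbt's builtin:

{% macro edr_create_table_as(temporary, relation, sql_query, drop_first=false) %}
  {% if drop_first %}
    {% do run_query("drop table if exists " ~ relation) %}
  {% endif %}
  {# dbt.create_table_as breaks under Fusion outside a materialization #}
  {# context, so issue a plain statement instead.                      #}
  {% set temp_keyword = "temporary " if temporary else "" %}
  {% do run_query("create " ~ temp_keyword ~ "table " ~ relation ~ " as " ~ sql_query) %}
{% endmacro %}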

Comment on lines 56 to 62
{% for result in results %}
{% set result = elementary.get_run_result_dict(result) %}
{% set node = result.node or elementary.get_node(result.unique_id or result.node.unique_id) %}
{% if node.resource_type == 'test' %}
{% set elementary_test_results_rows = cached_elementary_test_results.get(result.node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(result.node.unique_id) %}

@coderabbitai coderabbitai bot Oct 2, 2025

⚠️ Potential issue | 🔴 Critical

Crash risk: unsafe unique_id/node access

If result.node is None (common on some failures) and result.unique_id is not present (>=1.8 path), accessing result.node.unique_id will raise. Also the computed node variable isn’t used for cache lookups later.

Use guarded resolution and rely on node variable:

-{% set result = elementary.get_run_result_dict(result) %}
-{% set node = result.node or elementary.get_node(result.unique_id or result.node.unique_id) %}
+{% set result = elementary.get_run_result_dict(result) %}
+{# Prefer node from result if present; otherwise resolve by unique_id if available #}
+{% set node = result.node if result.node else none %}
+{% if not node and result.unique_id is defined and result.unique_id %}
+  {% set node = elementary.get_node(result.unique_id) %}
+{% endif %}
+{% if not node %}
+  {# As a last resort, skip non-resolvable results safely #}
+  {% continue %}
+{% endif %}

Then below, use node.unique_id for cache access (see next comment).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% for result in results %}
{% set result = elementary.get_run_result_dict(result) %}
{# Prefer node from result if present; otherwise resolve by unique_id if available #}
{% set node = result.node if result.node else none %}
{% if not node and result.unique_id is defined and result.unique_id %}
  {% set node = elementary.get_node(result.unique_id) %}
{% endif %}
{% if not node %}
  {# As a last resort, skip non-resolvable results safely #}
  {% continue %}
{% endif %}
{% if node.resource_type == 'test' %}
{% set elementary_test_results_rows = cached_elementary_test_results.get(result.node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(result.node.unique_id) %}
🤖 Prompt for AI Agents
In macros/edr/tests/on_run_end/handle_tests_results.sql around lines 56 to 62,
the code directly accesses result.node.unique_id which will crash if result.node
is None and result.unique_id is missing; change the node resolution to a guarded
form (e.g., set node = result.node or elementary.get_node(result.unique_id or
result.node.unique_id)) but simpler and safer: resolve node once with node =
result.node or elementary.get_node(result.unique_id), ensure node is not None
before using it, and then use node.unique_id for all subsequent cache lookups
(cached_elementary_test_results and cached_elementary_test_failed_row_counts)
instead of accessing result.node.unique_id.

Collaborator Author

From what I saw, this never happens.


@haritamar, thanks for clarifying! If you've verified through testing that result.node is always populated after get_run_result_dict, then the crash scenario is indeed unlikely.

That said, there's still a consistency issue in the code: you compute the node variable but then still use result.node.unique_id for the cache lookups a few lines later:

{% set elementary_test_results_rows = cached_elementary_test_results.get(result.node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(result.node.unique_id) %}

For cleaner, more maintainable code, consider using node.unique_id consistently since you've already resolved the node:

{% set elementary_test_results_rows = cached_elementary_test_results.get(node.unique_id) %}
{% set elementary_test_failed_row_count = cached_elementary_test_failed_row_counts.get(node.unique_id) %}

This makes the intent clearer and ensures you're using the same node reference throughout.

Comment on lines +13 to +15
{%- macro redshift__get_default_incremental_strategy() %}
{% do return("merge") %}
{% endmacro %}
@coderabbitai coderabbitai bot Oct 2, 2025

⚠️ Potential issue | 🔴 Critical

Revert Redshift default to avoid forced merge without unique_key.

Switching the Redshift default to merge forces every incremental model using our helpers to supply a unique_key; dbt-redshift fails compilation at runtime if merge is chosen without one. Today those same models succeed because the adapter default (delete+insert) does not require a key. This change would therefore break existing Redshift incremental models that rely on the default path. Please keep the default at delete+insert, or only dispatch to merge when a unique_key is present.

🤖 Prompt for AI Agents
In macros/utils/cross_db_utils/incremental_strategy.sql around lines 13 to 15,
the macro currently returns "merge" unconditionally which forces Redshift
incremental models to require a unique_key and breaks existing models; change
the macro to return "delete+insert" as the default for Redshift, or implement
conditional logic to return "merge" only when a unique_key is provided
(otherwise return "delete+insert") so existing models without unique_key
continue to compile and run.

Collaborator Author

I'm fine with this since these methods are used exclusively for our dbt artifact tables, which all have unique keys (also, the default for other adapters in this file is merge).


@haritamar, thank you for the clarification! That resolves my concern. Since get_default_incremental_strategy() is used exclusively for your internal artifact tables and they all have unique keys defined, switching Redshift to merge is safe and consistent with the other adapters in this file. I appreciate the context!


✏️ Learnings added
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/cross_db_utils/incremental_strategy.sql:13-15
Timestamp: 2025-10-05T16:25:01.698Z
Learning: In macros/utils/cross_db_utils/incremental_strategy.sql, the get_default_incremental_strategy() macro and its adapter-specific variants are used exclusively for Elementary's internal dbt artifact tables, all of which have unique keys defined. Therefore, defaulting to "merge" strategy for Redshift (and other adapters) is safe in this context.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +3 to +10
{% set node = elementary.get_node(result.unique_id) %}
{% if node.identifier == identifier %}
{% if package_name %}
-{% if result.node.package_name == package_name %}
-{{ return(result.node) }}
+{% if node.package_name == package_name %}
+{{ return(node) }}
{% endif %}
{% else %}
-{{ return(result.node) }}
+{{ return(node) }}
@coderabbitai coderabbitai bot Oct 2, 2025

⚠️ Potential issue | 🔴 Critical

Guard against missing nodes before dereferencing

elementary.get_node can return none when the manifest does not have that unique_id (e.g., disabled/ephemeral nodes), so hitting Line 4 will raise 'None' has no attribute 'identifier'. Add a guard to skip those cases before touching attributes.

     {% set node = elementary.get_node(result.unique_id) %}
+    {% if node is none %}
+      {% continue %}
+    {% endif %}
     {% if node.identifier == identifier %}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% set node = elementary.get_node(result.unique_id) %}
{% if node is none %}
  {% continue %}
{% endif %}
{% if node.identifier == identifier %}
{% if package_name %}
{% if node.package_name == package_name %}
{{ return(node) }}
{% endif %}
{% else %}
{{ return(node) }}
🤖 Prompt for AI Agents
In macros/utils/graph/get_result_node.sql around lines 3 to 10, the template
dereferences node.identifier without ensuring elementary.get_node(...) returned
a node; when get_node returns none this raises an attribute error. Fix by
guarding for node existence before any attribute access (e.g., check node is not
none / truthy in the same conditional), then perform the identifier and optional
package_name comparisons and return only when node is present and matches.

Collaborator Author

As it shouldn't ever happen in this context, I prefer to fail violently here.


@haritamar, understood! I'll defer to your judgment on this. The explicit failure will make it immediately clear if something unexpected happens during development or testing.


✏️ Learnings added
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/graph/get_result_node.sql:3-10
Timestamp: 2025-10-05T16:26:50.625Z
Learning: In macros/utils/graph/get_result_node.sql, the maintainer prefers to fail violently (let exceptions propagate) rather than guard against none when elementary.get_node returns none, as it should never happen in that context.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

{% set config = elementary.get_runtime_config() %}
{%- if config.vars -%}
-{% do all_vars.update(config.vars.to_dict()) %}
+{% do all_vars.update(elementary.agate_to_dict(config.vars)) %}
Contributor

config.vars is an agate? In regular dbt as well?

Collaborator Author
@haritamar haritamar Oct 5, 2025

Actually it's not, my mistake.
It's a dbt object called VarProvider, which has a to_dict method, and there are other similar objects.
I'll rename the macro.

'unique_id': node_dict.get('unique_id'),
'alias': node_dict.get('alias'),
-'checksum': checksum_dict.get('checksum'),
+'checksum': checksum_dict.get('checksum') if checksum_dict is mapping else none,
Contributor

What is it in fusion?

Collaborator Author
@haritamar haritamar Oct 5, 2025

Hmm, I don't remember, but it looks like it's a dict now in fusion, so fixing this.

-{% set test_query = elementary.get_compiled_code(flattened_test) %}
-{% set test_query_lower = test_query.lower() %}
+{% set test_query_lower = sql.lower() %}
Contributor

I don't understand this change.

Collaborator Author

Compiled code doesn't currently exist in the graph in dbt-fusion (ticket), but sql in this context is effectively the compiled code, so it allows us to keep supporting the PII feature.

package-lock.yml Outdated
@@ -0,0 +1,5 @@
packages:
Contributor

Do not commit the package-lock file.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
macros/edr/dbt_artifacts/upload_dbt_invocation.sql (1)

90-100: Uppercase SELECT path never runs

The new invocation_args_dict.SELECT branch is wrapped inside an outer guard that already requires invocation_args_dict.select to be truthy, so we’ll never reach it when only the uppercase key exists (Fusion case). This leaves Fusion without selector coverage. Please restructure the condition (e.g., check if invocation_args_dict first, then branch on both keys) so the uppercase path actually executes.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9fe3c53 and 8bbf168.

📒 Files selected for processing (5)
  • .gitignore (1 hunks)
  • integration_tests/tests/test_exposure_schema_validity.py (7 hunks)
  • macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
  • macros/utils/graph/get_run_result_dict.sql (2 hunks)
  • macros/utils/run_queries/dbt_object_to_dict.sql (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .gitignore
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
📚 Learning: 2025-07-15T19:28:35.548Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/run_queries/agate_to_dict.sql:1-7
Timestamp: 2025-07-15T19:28:35.548Z
Learning: In dbt-data-reliability, the `agate_to_dict` macro in `macros/utils/run_queries/agate_to_dict.sql` intentionally returns different types based on fusion/non-fusion mode to handle breaking changes in dbt Fusion. In fusion mode, it returns the original `agate_table`, while in non-fusion mode it returns `agate_table.to_dict()`. This is because the content/structure of `agate_table` is different between the two modes.

Applied to files:

  • macros/utils/run_queries/dbt_object_to_dict.sql
🧬 Code graph analysis (1)
integration_tests/tests/test_exposure_schema_validity.py (2)
integration_tests/tests/conftest.py (3)
  • skip_for_dbt_fusion (123-126)
  • dbt_project (144-147)
  • target (151-152)
integration_tests/tests/dbt_project.py (4)
  • DbtProject (46-334)
  • test (110-128)
  • test (131-149)
  • test (151-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (1.8.0, postgres) / test
🔇 Additional comments (2)
integration_tests/tests/test_exposure_schema_validity.py (2)

30-30: LGTM! Skip markers correctly applied.

The @pytest.mark.skip_for_dbt_fusion decorators are appropriate given that exposure artifacts are not yet available in dbt-fusion (per past review comments and dbt-labs/dbt-fusion#13).

Also applies to: 49-49, 57-57, 90-90, 123-123, 166-166, 190-190


61-63: Good refactor: Variable rename addresses past feedback.

The rename from explicit_target_for_bigquery to target_data_type improves clarity and addresses the past review comment that the old variable name didn't make sense. The logic now handles dremio's special case (reporting string columns as "other") while other targets report them as "string".

Based on learnings

Also applies to: 73-73

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
integration_tests/tests/conftest.py (2)

52-67: Consider adding structure validation.

The function directly accesses packages_yml["packages"][0]["local"] without validating the structure exists. While this is acceptable for test code that will fail loudly if the structure is wrong, adding validation would improve robustness.

Consider adding a check:

 packages_yml_path = Path(project_dir_copy) / "packages.yml"
 with packages_yml_path.open("r") as packages_yml_file:
     packages_yml = yaml.safe_load(packages_yml_file)

+if not packages_yml.get("packages") or not isinstance(packages_yml["packages"], list) or len(packages_yml["packages"]) == 0 or "local" not in packages_yml["packages"][0]:
+    raise ValueError("packages.yml does not have the expected structure")
+
 packages_yml["packages"][0]["local"] = str(
     (DBT_PROJECT_PATH / packages_yml["packages"][0]["local"]).resolve()
 )

165-176: Consider extracting the error message to a constant.

The fixture correctly validates that dbt-fusion is only used with supported targets. However, the static analysis tool suggests avoiding long error messages inline.

As per static analysis hint, consider extracting the message:

+FUSION_UNSUPPORTED_TARGET_MSG = "Fusion runner is not supported for target: {}"
+
 @pytest.fixture(scope="session")
 def runner_method(request, target: str) -> Optional[RunnerMethod]:
     runner_method_str = request.config.getoption("--runner-method")
     if runner_method_str:
         runner_method = RunnerMethod(runner_method_str)
         if (
             runner_method == RunnerMethod.FUSION
             and target not in DBT_FUSION_SUPPORTED_TARGETS
         ):
-            raise ValueError(f"Fusion runner is not supported for target: {target}")
+            raise ValueError(FUSION_UNSUPPORTED_TARGET_MSG.format(target))
         return runner_method
     return None
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8bbf168 and e2dc737.

📒 Files selected for processing (1)
  • integration_tests/tests/conftest.py (6 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
🧬 Code graph analysis (1)
integration_tests/tests/conftest.py (3)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/env.py (1)
  • Environment (7-26)
integration_tests/tests/logger.py (1)
  • get_logger (5-11)
🪛 Ruff (0.13.3)
integration_tests/tests/conftest.py

174-174: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: test (1.8.0, postgres) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, bigquery) / test
🔇 Additional comments (8)
integration_tests/tests/conftest.py (8)

4-4: LGTM!

The new imports (Optional, yaml, RunnerMethod) are appropriate and necessary for the dbt-fusion support functionality.

Also applies to: 7-7, 10-10


16-21: LGTM!

The supported targets list aligns with the PR objectives for dbt-fusion support on Snowflake, BigQuery, Redshift, and Databricks adapters.


30-30: LGTM!

The --runner-method command-line option follows pytest conventions and enables CLI control of the runner method for tests.


34-49: LGTM!

The fixture correctly accepts runner_method and orchestrates the necessary post-copy operations for dbt-fusion compatibility. The cleanup is properly handled in the finally block.


70-80: LGTM!

The function correctly removes Python models only when using dbt-fusion, and the scope is appropriately limited to the models directory.


84-103: LGTM!

The fixture correctly accepts and passes runner_method to the Environment constructor, consistent with the changes in env.py.


122-126: LGTM!

The autouse fixture correctly implements test skipping for dbt-fusion using standard pytest marker patterns.


144-147: LGTM!

The fixture correctly accepts and passes runner_method to the DbtProject constructor, consistent with the changes in dbt_project.py.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e2dc737 and e588009.

📒 Files selected for processing (1)
  • macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (1.8.0, postgres) / test
  • GitHub Check: test (latest_official, athena) / test

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
macros/edr/tests/on_run_end/handle_tests_results.sql (1)

58-58: Guard node resolution to avoid crashes
This reintroduces the unsafe access the previous review flagged. When result.node is None and result.unique_id is falsy/absent (seen on certain dbt>=1.8 error paths), the expression still evaluates result.node.unique_id, causing an exception instead of skipping the unresolved result. Resolve the node defensively before dereferencing and bail out if you still can’t.

-    {% set node = result.node or elementary.get_node(result.unique_id or result.node.unique_id) %}
+    {% set node = result.node if result.node else none %}
+    {% if not node and result.unique_id is defined and result.unique_id %}
+      {% set node = elementary.get_node(result.unique_id) %}
+    {% endif %}
+    {% if not node %}
+      {% continue %}
+    {% endif %}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e588009 and df232d4.

📒 Files selected for processing (3)
  • .github/workflows/test-warehouse.yml (1 hunks)
  • macros/edr/tests/on_run_end/handle_tests_results.sql (4 hunks)
  • macros/utils/cross_db_utils/get_user_creation_query.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (1.8.0, postgres) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test

Comment on lines +153 to +160
${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
ln -sfn ${{ github.workspace }}/dbt-data-reliability dbt_project/dbt_packages/elementary
pip install -r requirements.txt
- name: Check DWH connection
  working-directory: ${{ env.TESTS_DIR }}
  run: |
-    dbt debug -t "${{ inputs.warehouse-type }}"
+    ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
@coderabbitai coderabbitai bot Oct 5, 2025

⚠️ Potential issue | 🔴 Critical

Fix dbt‑fusion binary path

install.sh drops the fusion CLI at ~/.dbt/bin/dbt, not ~/.local/bin/dbt. With the current hard‑coded path, both deps and debug will fail at runtime (command not found). Please point the conditional to the actual install location (or dynamically resolve the path) before we merge.

-          ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
+          ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
@@
-          ${{ (inputs.dbt-version == 'fusion' && '~/.local/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
+          ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
          ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} deps --project-dir dbt_project
          ln -sfn ${{ github.workspace }}/dbt-data-reliability dbt_project/dbt_packages/elementary
          pip install -r requirements.txt
      - name: Check DWH connection
        working-directory: ${{ env.TESTS_DIR }}
        run: |
          ${{ (inputs.dbt-version == 'fusion' && '~/.dbt/bin/dbt') || 'dbt' }} debug -t "${{ inputs.warehouse-type }}"
🤖 Prompt for AI Agents
In .github/workflows/test-warehouse.yml around lines 153 to 160, the workflow
hard-codes the dbt‑fusion CLI path as ~/.local/bin/dbt but install.sh places the
fusion binary at ~/.dbt/bin/dbt; update both conditional uses (the deps and
debug commands) to point to ~/.dbt/bin/dbt (or replace the inline conditional
with a small resolution step that checks for ~/.dbt/bin/dbt first and falls back
to dbt) so the correct executable is invoked at runtime.

Collaborator Author

I think it is wrong in this case; it's definitely installed in "~/.local/bin/dbt", and the CI currently works for fusion.


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch the dbt-fusion install script to verify the installation path
curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh

Length of output: 19381


@haritamar, you're absolutely right—I apologize for the confusion. After reviewing the install script, I can confirm that the default installation path is indeed ~/.local/bin/dbt, not ~/.dbt/bin/dbt. The script sets:

dest="$HOME/.local/bin"

Since your workflow only passes --update to the installer, it uses this default location. Your current implementation is correct, and I was mistaken. Thank you for clarifying that the CI is working as expected! I'll resolve this comment thread.

🐰💭
(Got that one wrong—
learning from humans!)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e5d2f8d and 957a0fb.

📒 Files selected for processing (1)
  • macros/edr/dbt_artifacts/upload_dbt_invocation.sql (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (1.8.0, postgres) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test

Comment on lines +90 to +95
{%- if invocation_args_dict -%}
{%- if invocation_args_dict.select -%}
{{- return(invocation_args_dict.select) -}}
{%- elif invocation_args_dict.SELECT -%}
{{- return(invocation_args_dict.SELECT) -}}
{%- endif -%}

⚠️ Potential issue | 🟠 Major

Restore fallback when no select keys are present.

If invocation_args_dict exists but neither .select nor .SELECT is populated, the macro now hits the outer branch and exits without a return, so selected becomes an empty string instead of honoring config.args.select or falling back to []. This regresses the previous behavior and drops selection metadata in fusion/core runs that only set invocation_args_dict. Please restore the fallback inside the outer branch.

-{%- if invocation_args_dict -%}
-    {%- if invocation_args_dict.select -%}
-        {{- return(invocation_args_dict.select) -}}
-    {%- elif invocation_args_dict.SELECT -%}
-        {{- return(invocation_args_dict.SELECT) -}}
-    {%- endif -%}
-{%- elif config.args and config.args.select -%}
+{%- if invocation_args_dict -%}
+    {%- if invocation_args_dict.select -%}
+        {{- return(invocation_args_dict.select) -}}
+    {%- elif invocation_args_dict.SELECT -%}
+        {{- return(invocation_args_dict.SELECT) -}}
+    {%- elif config.args and config.args.select -%}
+        {{- return(config.args.select) -}}
+    {%- else -%}
+        {{- return([]) -}}
+    {%- endif -%}
+{%- elif config.args and config.args.select -%}
     {{- return(config.args.select) -}}
 {%- else -%}
     {{- return([]) -}}
 {%- endif -%}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{%- if invocation_args_dict -%}
    {%- if invocation_args_dict.select -%}
        {{- return(invocation_args_dict.select) -}}
    {%- elif invocation_args_dict.SELECT -%}
        {{- return(invocation_args_dict.SELECT) -}}
    {%- elif config.args and config.args.select -%}
        {{- return(config.args.select) -}}
    {%- else -%}
        {{- return([]) -}}
    {%- endif -%}
{%- elif config.args and config.args.select -%}
    {{- return(config.args.select) -}}
{%- else -%}
    {{- return([]) -}}
{%- endif -%}
🤖 Prompt for AI Agents
In macros/edr/dbt_artifacts/upload_dbt_invocation.sql around lines 90-95,
restore the fallback inside the outer invocation_args_dict branch so that when
invocation_args_dict exists but neither .select nor .SELECT is present the macro
still returns config.args.select if available or an empty list; update the
conditional to explicitly return invocation_args_dict.select or
invocation_args_dict.SELECT when present, otherwise return config.args.select if
defined, else return ([]).

@haritamar haritamar merged commit 783928f into master Oct 6, 2025
14 of 25 checks passed
@haritamar haritamar deleted the ele-4724-dbt_fusion-support branch October 6, 2025 22:49
