Skip to content

Conversation

@NoyaArie
Copy link
Contributor

@NoyaArie NoyaArie commented Aug 21, 2025

null

Summary by CodeRabbit

  • Bug Fixes

    • Restored dimension-based anomaly results in test reports, showing per-dimension metrics, timestamps, and anomalous rows; behavior consistent across backends and unchanged for non-dimension tests.
  • Style

    • Whitespace and formatting cleanups for improved readability.
  • Chores

    • Updated dependency revision entries to newer package revisions for reproducible builds.

@linear
Copy link

linear bot commented Aug 21, 2025

@coderabbitai
Copy link

coderabbitai bot commented Aug 21, 2025

Walkthrough

Reformats get_test_results.sql, reinstates dimension-specific anomaly processing inside default__get_test_results and clickhouse__get_test_results, and updates dependency revisions in packages. For dimension-related tests the macro now builds anomalous_rows, extends headers, and overrides test_rows_sample. Macro signatures remain unchanged.

Changes

Cohort / File(s) Summary of Changes
Dimension anomaly flow reintroduced
elementary/monitor/dbt_project/macros/get_test_results.sql
Restores conditional dimension-based path in default__get_test_results and clickhouse__get_test_results: compute test_rows_sample; if test_sub_type == 'dimension' or test_params.dimensions then derive metric_name, iterate test_rows_sample to build anomalous_rows (capturing per-row metrics and end_time), extend headers with per-dimension entries, and set test_rows_sample to { headers, anomalous_rows }.
Formatting and whitespace cleanups
elementary/monitor/dbt_project/macros/get_test_results.sql
Removes stray blank line(s) and trims trailing spaces in several SQL lines; no changes to macro signatures or public dispatch API.
Dependency revision updates
elementary/monitor/dbt_project/package-lock.yml, elementary/monitor/dbt_project/packages.yml
package-lock.yml: added name for package entry dbt_utils and updated git-based package elementary revision and sha1_hash. packages.yml: updated git dependency https://github.com/elementary-data/dbt-data-reliability.git revision from 2579747... to b46d220....

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant Macro as get_test_results (default/clickhouse)
  Caller->>Macro: invoke(days_back, invocations_per_test, disable_passed_test_metrics)
  Macro->>Macro: compute test_rows_sample
  alt Dimension-related (test_sub_type == 'dimension' OR test_params.dimensions)
    Macro->>Macro: derive metric_name (row_count or test_params.dimensions)
    Macro->>Macro: iterate test_rows_sample to build anomalous_rows
    Note right of Macro #f9f0d9: capture per-row metrics and end_time
    Macro->>Macro: extend headers with per-dimension entries
    Macro->>Macro: set test_rows_sample = { headers, anomalous_rows }
  else Non-dimension
    Macro->>Macro: retain existing test_rows_sample flow
  end
  Macro-->>Caller: return results
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • ofek1weiss
  • michael-myaskovsky

Poem

I twitch my whiskers, nose aglow,
New dims hop in where metrics flow.
Headers stretch, anomalies gleam,
Clickhouse and default—same bright stream.
Trimmed the fluff, logic snug and tight—🐇✨

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ele-4942-dimension-anomalies-visualization

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions
Copy link
Contributor

👋 @NoyaArie
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in this pull request.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
elementary/monitor/dbt_project/macros/get_test_results.sql (2)

259-261: Undefined variable: elementary_tests_allowlist_status in ClickHouse macro

In the default macro (Line 6) you set elementary_tests_allowlist_status, but the ClickHouse variant references it without defining it locally. Jinja macro scope won’t inherit that value, so this will raise an UndefinedError at render time.

Add the same initialization near the top of clickhouse__get_test_results:

 {%- macro clickhouse__get_test_results(days_back = 7, invocations_per_test = 720, disable_passed_test_metrics = false) -%}
-    {% do elementary.run_query('drop table if exists ordered_test_results') %}
+    {% set elementary_tests_allowlist_status = ['fail', 'warn'] if disable_passed_test_metrics else ['fail', 'warn', 'pass']  %}
+    {% do elementary.run_query('drop table if exists ordered_test_results') %}

110-153: Qualify the ClickHouse table operations with the package database and schema

We need to prevent accidental drops or orphaned tables when the current database isn’t the package’s database. In its current form, the clickhouse__get_test_results macro issues unqualified DROP TABLE and CREATE TABLE statements, then later reads from a fully qualified relation—this mismatch can lead to failures or leftover tables.

Changes required:

  • At the start of the ClickHouse macro, call elementary.get_package_database_and_schema() and bind elementary_database and elementary_schema.
  • Use those variables to qualify both the DROP TABLE IF EXISTS and the CREATE TABLE statements.
  • Continue to use the same elementary_database, elementary_schema, and identifier when calling api.Relation.create(…) and fully_drop_relation(…).

Locations in elementary/monitor/dbt_project/macros/get_test_results.sql:

  • Line ~221: unqualified {% do elementary.run_query('drop table if exists ordered_test_results') %}
  • Line ~224: CREATE TABLE ordered_test_results (
  • Line ~240: api.Relation.create(database=elementary_database, schema=elementary_schema, identifier='ordered_test_results', …)

Proposed diff:

 {%- macro clickhouse__get_test_results(days_back = 7, invocations_per_test = 720, disable_passed_test_metrics = false) -%}
-    {% do elementary.run_query('drop table if exists ordered_test_results') %}
+    {# Get the package DB and schema to fully qualify temp table names #}
+    {% set elementary_database, elementary_schema = elementary.get_package_database_and_schema() %}
+    {% do elementary.run_query(
+         'DROP TABLE IF EXISTS ' ~ elementary_database ~ '.' ~ elementary_schema ~ '.ordered_test_results'
+    ) %}

-    {% set create_table_query %}
-    CREATE TABLE ordered_test_results (
+    {% set create_table_query %}
+    CREATE TABLE {{ elementary_database }}.{{ elementary_schema }}.ordered_test_results (
         id String,
         invocation_id String,
         …
     )
     ENGINE = MergeTree()
     ORDER BY (elementary_unique_id, invocations_rank_index);
     {% endset %}
     {% do elementary.run_query(create_table_query) %}

     {% set insert_query %}
@@
-    {% set elementary_database, elementary_schema = elementary.get_package_database_and_schema() %}
     {% set ordered_test_results_relation = api.Relation.create(
         database=elementary_database,
         schema=elementary_schema,
         identifier='ordered_test_results',
         type='table'
     ) %}
…
     {% if not elementary.has_temp_table_support() %}
-        {% do elementary.fully_drop_relation(ordered_test_results_relation) %}
+        {% do elementary.fully_drop_relation(ordered_test_results_relation) %}
     {% endif %}

Implementing this ensures that every operation—DROP, CREATE, reads via Relation, and cleanup—targets the same, fully qualified table.

🧹 Nitpick comments (1)
elementary/monitor/dbt_project/macros/get_test_results.sql (1)

202-207: Remove unused first_occurred join to avoid unnecessary work

The subquery computes first_time_occurred but the column isn’t selected or used in predicates. This adds a redundant scan and join on a large table.

Apply this diff to remove the join:

-    LEFT JOIN (
-        SELECT
-            min(detected_at) AS first_time_occurred,
-            test_unique_id
-        FROM {{ ref('elementary', 'elementary_test_results') }}
-        GROUP BY test_unique_id
-    ) first_occurred ON etr.test_unique_id = first_occurred.test_unique_id
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d7a4fdc and 0555e31.

📒 Files selected for processing (1)
  • elementary/monitor/dbt_project/macros/get_test_results.sql (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: code-quality
🔇 Additional comments (5)
elementary/monitor/dbt_project/macros/get_test_results.sql (5)

92-107: AI summary mentions dimension anomalies handling, but it’s absent here

The PR summary says the macros rebuild per-dimension anomalous_rows and extend headers for dimension tests. In this loop we only attach sample_data = test_rows_sample, with no dimension-specific processing. If the intention was to reintroduce that path, it seems missing from the current diff. Please confirm whether this was intentionally deferred or omitted by mistake.


79-81: ClickHouse cleanup behavior: double-check drop path

For adapters without temp-table support you drop the relation, which is correct. With the ClickHouse change above (qualified table), confirm that elementary.has_temp_table_support() returns false in ClickHouse so the object is dropped reliably at the end of the macro. If it returns true for some reason, an unqualified CREATE may still persist.


21-60: Minor: selection shape LGTM

The selection and ranking look consistent; limiting invocations per test and including result_rows for sampling aligns with downstream usage.


71-75: Minor: valid_ids_query change LGTM

distinct id filtered to invocations_rank_index = 1 is correct for sampling current run rows.


231-235: Minor: valid_ids_query parity LGTM (ClickHouse)

Mirrors the default macro, keeping behavior consistent across adapters.

@NoyaArie NoyaArie temporarily deployed to elementary_test_env August 27, 2025 07:47 — with GitHub Actions Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
elementary/monitor/dbt_project/package-lock.yml (1)

6-8: Prefer tags over raw commits when a release is available.

For reproducibility and easier upgrades, consider pinning to a released tag once available instead of a commit SHA.

elementary/monitor/dbt_project/packages.yml (1)

4-5: Consider pinning to a release tag when cutting a release.

Once a new version is published, switch from commit SHA to a semantic version for stability.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 0555e31 and 29daa5c.

📒 Files selected for processing (2)
  • elementary/monitor/dbt_project/package-lock.yml (1 hunks)
  • elementary/monitor/dbt_project/packages.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: code-quality
  • GitHub Check: test / test
🔇 Additional comments (3)
elementary/monitor/dbt_project/package-lock.yml (2)

2-4: Lock entry clarity for dbt_utils looks good.

Adding the explicit name alongside package and pinned version (0.8.6) is fine and within the declared range in packages.yml.


6-8: Revision bump verified

All checks passed: the dbt-data-reliability revision in packages.yml matches package-lock.yml (b46d220a21487a926f1cb50f5dc4c77e321f74c2), the commit exists upstream, and dbt_utils is locked at version 0.8.6 (within the [0.8.0, 0.9.0) range). No further action required.

elementary/monitor/dbt_project/packages.yml (1)

5-5: Revision updated; keep it in sync with package-lock and verify upstream.

Looks fine. Make sure package-lock.yml reflects the same revision and that the commit is reachable. See the script in the package-lock.yml comment.

@NoyaArie NoyaArie merged commit d0bc714 into master Aug 27, 2025
5 checks passed
@NoyaArie NoyaArie deleted the ele-4942-dimension-anomalies-visualization branch August 27, 2025 08:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants