
Uploading run results fails if model is skipped (BigQuery) #1745

@lari


Describe the bug
When a dbt run skips a BigQuery materialized view model, the Elementary package fails with the error "Value has type STRING which cannot be inserted into column rows_affected, which has type INT64".

Looking at the SQL job in the BigQuery console, I identified the cause: Elementary tries to insert the value '-1' (with quotes) into the rows_affected column, which is INT64.

I further identified that the value '-1' is not recognized as a number by the check {%- if value is number -%} in the insert_rows macro, here: https://github.com/elementary-data/dbt-data-reliability/blob/0.16.1/macros/utils/table_operations/insert_rows.sql#L191

Changing the line to {%- if value is number or value == '-1' -%} would fix the issue.
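The behavior can be illustrated with a small Python sketch (the real logic is Jinja inside insert_rows.sql; the function names here are illustrative only): an integer -1 is emitted unquoted, but the string "-1" returned by the BigQuery adapter is quoted as a STRING literal, which BigQuery rejects for an INT64 column. A variant that also accepts integer-like strings avoids this:

```python
# Hypothetical sketch of the type check in Elementary's insert_rows macro.
# The actual macro is Jinja; these names are illustrative, not the real API.

def render_value(value):
    """Mimic the current macro: quote anything that is not a number."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)           # emitted unquoted -> valid for INT64
    return "'{}'".format(value)     # emitted quoted -> STRING literal

def render_value_lenient(value):
    """Sketch of a fix: also treat integer-like strings as numbers."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    if isinstance(value, str) and value.lstrip('-').isdigit():
        return value                # "-1" from the adapter stays unquoted
    return "'{}'".format(value)

print(render_value(-1))             # -1
print(render_value("-1"))           # '-1'  <- rejected by the INT64 column
print(render_value_lenient("-1"))   # -1
```

Checking for integer-like strings in general (rather than the literal '-1') would also cover any other numeric strings an adapter might return.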

However, there's a question of how to store "skipped" run results at all?

To Reproduce
Steps to reproduce the behavior:

  1. Create a materialized view with on_configuration_change = 'continue' config
  2. dbt run to create the view
  3. Make a schema change in the materialized view
  4. dbt run
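For reference, step 1 might look like the following dbt model (model, source, and column names are placeholders, not from the affected project):

```sql
-- models/my_materialized_view.sql (hypothetical example model)
{{
  config(
    materialized = 'materialized_view',
    on_configuration_change = 'continue'
  )
}}

select
    id,
    created_at
from {{ source('my_source', 'my_table') }}
```

With on_configuration_change = 'continue', dbt skips the model on a schema change instead of failing, which produces the "skip" adapter response shown below.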

Expected behavior
The dbt project and Elementary package should run without errors.

Screenshots

Error message:

on-run-end failed, error:
Value has type STRING which cannot be inserted into column rows_affected, which has type INT64 at 

Environment (please complete the following information):

  • Elementary CLI (edr) version: n/a
  • Elementary dbt package version: 0.16.1
  • dbt version you're using: 1.8.7
  • Data warehouse: bigquery (1.8.3)
  • Infrastructure details: Dev Container based on python:3.11-slim-bullseye

Additional context

Here's the relevant excerpt from the run_results.json of the run. As you can see, rows_affected is set to the string "-1".

{
    "metadata": {
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/run-results/v6.json",
        "dbt_version": "1.8.7",
        "generated_at": "2024-11-13T06:58:45.324884Z",
        "invocation_id": "...",
        "env": {}
    },
    "results": [
        {
            "status": "success",
            "timing": [
                {
                    "name": "compile",
                    "started_at": "2024-11-13T06:58:39.217536Z",
                    "completed_at": "2024-11-13T06:58:39.245089Z"
                },
                {
                    "name": "execute",
                    "started_at": "2024-11-13T06:58:39.245704Z",
                    "completed_at": "2024-11-13T06:58:39.630172Z"
                }
            ],
            "thread_id": "Thread-1 (worker)",
            "execution_time": 0.41369032859802246,
            "adapter_response": {
                "_message": "skip `project`.`dataset`.`table`",
                "code": "skip",
                "rows_affected": "-1"
            },
            "message": "skip `project`.`dataset`.`table`",
            "failures": null,
            "unique_id": "model.model_name",
            "compiled": true,
            "compiled_code": "...",
            "relation_name": "`project`.`dataset`.`table`"
        }
    ],
    "elapsed_time": 8.028469562530518,
    "args": {
        ...
    }
}

Would you be willing to contribute a fix for this issue?

Possibly yes, but design decisions are needed on how to handle run results for skipped models.
