[Feature] introduce state:modified.compiled #12034

@nstylo

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

I would love to be able to select only the models whose compiled SQL has changed. I think this would be a huge win, since a change to a model file often does not imply a change in its SQL representation. Think of config-declared tags, indexes that may run as a post_hook, or a macro name change. I propose adding a new selector that compares the compiled and normalized SQL of the models. The selector would look like --select "state:modified.compiled".

    # file: contracts/graph/nodes.py
    def same_compiled(self, other) -> bool:
        """Compare compiled SQL content between two nodes

        Falls back to raw_code comparison if compiled_code is not available.
        This handles cases where nodes haven't been compiled yet or when
        comparing against previous state that may not have compiled_code.
        """
        # Get compiled SQL for both nodes
        self_compiled = getattr(self, "compiled_code", None)
        other_compiled = getattr(other, "compiled_code", None)

        # TODO: perhaps add some normalization, like trim whitespace, lowercase etc.

        # If both have compiled_code, compare that
        if self_compiled is not None and other_compiled is not None:
            return self_compiled == other_compiled

        # If only one or none have compiled_code, fall back to raw_code comparison
        # This handles cases where current node hasn't been compiled yet (e.g., during dbt ls)
        # but the state comparison node has compiled_code from a previous run
        return self.same_body(other)

    # file: graph/selector_methods.py
    ...
    "modified.body": self.check_modified_factory("same_body"),
    "modified.compiled": self.check_modified_factory("same_compiled"),
    "modified.configs": self.check_modified_factory("same_config"),
    ...
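
The TODO in same_compiled mentions normalization. A minimal sketch of what that could look like is below, assuming a hypothetical helper named normalize_sql (not part of dbt). It only collapses whitespace and drops trailing semicolons. It deliberately does not lowercase, since string literals and quoted identifiers can be case sensitive; note that it also collapses whitespace inside string literals, which a real implementation would probably want to avoid.

```python
import re


def normalize_sql(sql: str) -> str:
    """Hypothetical helper (not a dbt API): normalize SQL text for comparison."""
    # Collapse every run of whitespace (spaces, tabs, newlines) into one space.
    collapsed = re.sub(r"\s+", " ", sql).strip()
    # Drop trailing semicolons and any spaces left around them.
    return collapsed.rstrip("; ")


def same_normalized_sql(left: str, right: str) -> bool:
    """Return True when both SQL strings are equal after normalization."""
    return normalize_sql(left) == normalize_sql(right)
```

With something like this, same_compiled could compare normalize_sql(self_compiled) against normalize_sql(other_compiled) instead of the raw strings.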

There are two issues with this:

  1. The current (self) node is not compiled at selection time, so its compiled_code is None and the comparison falls back to same_body.
  2. One has to make sure the state being compared against comes from dbt compile.
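
The first issue can be reproduced with stand-in objects. The sketch below uses a hypothetical FakeNode dataclass (not the real dbt node class) and made-up raw and compiled strings. It mirrors the fallback logic of same_compiled above: a tag change in the config alters raw_code but not the compiled SQL, yet the node is only reported as unchanged when the current node actually carries compiled_code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a dbt node, for illustration only."""

    raw_code: str
    compiled_code: Optional[str] = None

    def same_body(self, other: "FakeNode") -> bool:
        return self.raw_code == other.raw_code

    def same_compiled(self, other: "FakeNode") -> bool:
        # Compare compiled SQL only when both sides have it.
        if self.compiled_code is not None and other.compiled_code is not None:
            return self.compiled_code == other.compiled_code
        # Otherwise fall back to comparing the raw model body.
        return self.same_body(other)


# Previous state, assumed to come from a run that compiled the project.
previous = FakeNode(
    raw_code="{{ config(tags=['a']) }}\nselect 1 as id",
    compiled_code="\nselect 1 as id",
)

# Current node after a tag-only change, with and without compiled SQL.
current_compiled = FakeNode(
    raw_code="{{ config(tags=['a', 'b']) }}\nselect 1 as id",
    compiled_code="\nselect 1 as id",
)
current_uncompiled = FakeNode(raw_code=current_compiled.raw_code)

unchanged_when_compiled = current_compiled.same_compiled(previous)
unchanged_when_uncompiled = current_uncompiled.same_compiled(previous)
```

Here unchanged_when_compiled is True while unchanged_when_uncompiled is False, so without compiled code on the current node the selector would behave like state:modified.body.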

I would find this super helpful for avoiding unnecessary and, in my case, super expensive builds. Thoughts?

Describe alternatives you've considered

No response

Who will this benefit?

Everyone who doesn't want to rebuild when a model's SQL has not changed.

Are you interested in contributing this feature?

Yes. I still need to figure out how to use the compiled node for dbt run/build/ls etc.

Anything else?

No response
