Speeding Up Repeated DBT Builds #2123
alexrichey started this conversation in Ideas
When we're building a DBT project and running repeated builds in a schema, we're usually loading the same data, repeating the same calculations, etc. I think we could enable caching, similar to the local development experience, with two components:
1. Use state to detect changed models
When developing locally, you can use dbt's state-based selection: it detects the models that changed since the last run, then runs all downstream models.
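A minimal sketch of that local workflow (the artifact path is a placeholder; `state:modified+` selects modified models plus everything downstream of them):

```sh
# Compare against the manifest from the previous run and rebuild only
# the models whose definitions changed, plus all of their descendants.
dbt build --select state:modified+ --state ./previous-run-artifacts/
```

Note that `--state` points at a directory containing the previous run's manifest.json, not at the file itself.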
The build will output a manifest.json, and we could store it in the postgres schema for the build as a JSON document, then download it before we kick off the next DBT build; a rough sketch of that round trip is below.
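This assumes psql is available in the build environment; the schema, table, and column names are made up for illustration:

```sh
# After a successful build: stash target/manifest.json in the build's schema.
psql "$DATABASE_URL" <<'SQL'
CREATE TABLE IF NOT EXISTS my_build_schema.dbt_build_state (
    saved_at timestamptz DEFAULT now(),
    manifest jsonb
);
-- Read the artifact into a psql variable and insert it as jsonb.
\set manifest `cat target/manifest.json`
INSERT INTO my_build_schema.dbt_build_state (manifest) VALUES (:'manifest'::jsonb);
SQL

# Before the next build: pull the latest manifest back down for --state.
mkdir -p previous-run-artifacts
psql "$DATABASE_URL" -At -c \
  "SELECT manifest FROM my_build_schema.dbt_build_state ORDER BY saved_at DESC LIMIT 1" \
  > previous-run-artifacts/manifest.json
```

(manifest.json can get large, so in practice we might want to compress it or keep it in object storage instead, but the idea is the same.)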
2. Detecting changed data
However, this misses a very important case: when your input data changes. DBT doesn't really concern itself with changed data - it cares about changed models. So firstly... maybe we just don't attempt to solve this? In the build, we could just detect when a new recipe file is pushed, and in that case just don't allow the `--state manifest.json` flag (i.e. fall back to a full build).

But if we wanted to do this, one approach is to make the model code depend on the input data itself (see the sketch below). Then whenever `my_recipe_dataset` changed, the actual compiled model code would change as well... Combined with manifest.json, this would allow you to target models whose input dataset was changed.
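One way to get that effect, sketched under the assumption that the recipe dataset lives as a file in the repo and feeds a single staging model (every path and name below is hypothetical):

```sh
# Fingerprint the recipe dataset and stamp the hash into the model that reads
# it, so a data change shows up to state:modified as a model change.
RECIPE_HASH=$(sha256sum seeds/my_recipe_dataset.csv | cut -d ' ' -f 1)

# Maintain a one-line fingerprint comment at the top of the model file,
# e.g. "-- recipe_fingerprint: <hash>", and rewrite it before each build.
sed -i "s/^-- recipe_fingerprint: .*/-- recipe_fingerprint: ${RECIPE_HASH}/" \
  models/staging/stg_my_recipe_dataset.sql

# A state-based build (as in section 1) now picks up this model, and
# everything downstream of it, whenever the underlying data changed.
dbt build --select state:modified+ --state ./previous-run-artifacts/
```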
Conclusion
It's not a burning issue, and caching is complicated. I think enabling 1) would be straightforward and helpful.