dbt-labs · luna-bianca · Jan 13, 2026 · Nov 28, 2025 · Nov 28, 2025 · Nov 28, 2025
@@ -140,14 +140,35 @@ where
 
 Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs it will evaluate to true and we’ll apply our filter logic, capturing only the newer data.
 
-### Late arriving facts
+### Late-arriving facts
 
 Our last concern specific to incremental models is what to do when data is inevitably loaded in a less-than-perfect way. Sometimes data loaders will, for a variety of reasons, load data late. Either an entire load comes in late, or some rows come in on a load after those with which they should have. The following is best practice for every incremental model to slow down the drift this can cause.
 
 - 🕐 For example if most of our records for `2022-01-30` come in the raw schema of our warehouse on the morning of `2022-01-31`, but a handful don’t get loaded til `2022-02-02`, how might we tackle that? There will already be `max(updated_at)` timestamps of `2022-01-31` in the warehouse, filtering out those late records. **They’ll never make it to our model.**
 - 🪟 To mitigate this, we can add a **lookback window** to our **cutoff** point. By **subtracting a few days** from the `max(updated_at)`, we would capture any late data within the window of what we subtracted.
 - 👯 As long as we have a **`unique_key` defined in our config**, we’ll simply update existing rows and avoid duplication. We process more data this way, but in a fixed way, and it keeps our model hewing closer to the source data.
 
+
+#### Using state-aware orchestration with incremental models
+
+By default, [state-aware orchestration](/docs/deploy/state-aware-about) detects source freshness by checking warehouse metadata for any new rows. This may cause models to run more often than needed.
+
+To avoid this issue, configure a `loaded_at_field` for a specific timestamp column or use a `loaded_at_query` with custom SQL to tell dbt which field to check for freshness. This helps state-aware orchestration to detect only genuinely new data. For information on how to configure `loaded_at_field` and `loaded_at_query`, refer to [Source freshness](/reference/resource-properties/freshness) and [Advanced configurations](/docs/deploy/state-aware-setup#advanced-configurations).
+
+Even with a `loaded_at_field` or `loaded_at_query`, late-arriving records may have an earlier event timestamp (for example, `event_date`). In this case, state-aware orchestration may skip rebuilding the incremental model, even though your lookback window would normally pick up those records. To ensure late-arriving data is detected, configure your `loaded_at_field` or `loaded_at_query` to align with the same lookback window used in your incremental filter. For example, if your incremental model uses a 3-day lookback window:
+
+```yaml
+sources:
+  - name: raw_orders
+    tables:
+      - name: orders
+        config:
+          loaded_at_query: |
+            select max(ingested_at)
+            from {{ this }}
+            where ingested_at >= current_timestamp - interval '3 days'
+```
+
 ### Long-term considerations
 
 Late arriving facts point to the biggest tradeoff with incremental models:

@@ -36,10 +36,23 @@ State-aware orchestration does not depend on [static analysis](/docs/fusion/new-
 
 State-aware orchestration uses shared state tracking to determine which models need to be built by detecting changes in code or data every time a job runs. It also supports custom refresh intervals and custom source freshness configurations, so <Constant name="cloud" /> only rebuilds models when they're actually needed.
 
-For example, you can configure your project so that <Constant name="cloud" /> skips rebuilding the dim_wizards model (and its parents) if they’ve already been refreshed within the last 4 hours, even if the job itself runs more frequently.
+For example, you can configure your project so that <Constant name="cloud" /> skips rebuilding the `dim_wizards` model (and its parents) if they’ve already been refreshed within the last 4 hours, even if the job itself runs more frequently.
 
 Without configuring anything, <Constant name="cloud" />'s state-aware orchestration automatically knows to build your models either when the code has changed or if there’s any new data in a source (or upstream model in the case of [dbt Mesh](/docs/mesh/about-mesh)).
 
+### Handling concurrent jobs
+
+If two separate jobs both depend on the same downstream model (for example, `model_ab`), and both jobs detect upstream changes (`updates_on = any`), then `model_ab` may run twice (once per job) because each job detects a change that triggers a rebuild. However, if nothing has changed since the most recent build, neither job needs to rebuild `model_ab`. They will reuse the already built `model_ab` instead of rebuilding it again.
+
+Under state-aware orchestration, all jobs read from and write to the same shared state and build a model only when either the code or data state has changed. This means that each job individually evaluates whether a model needs rebuilding based on the model’s compiled code and upstream data state.
+
+What happens when jobs overlap:
+
+- If both jobs reach the same model at exactly the same time, one job waits until the other finishes. This is to prevent collisions in the data warehouse when two jobs try to build the same model at the same time.
+- After the first job finishes, the second job still checks whether a rebuild for the model is needed. The job may choose to reuse the existing result or perform another build, depending on changes detected.
+
+To keep a model from being rebuilt too frequently, even when the code or data state has changed, use the `build_after` config. For information on how to use `build_after`, refer to [Model freshness](/reference/resource-configs/freshness) and [Advanced configurations](/docs/deploy/state-aware-setup#advanced-configurations).
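+
+As a minimal sketch of that approach, the following properties file limits `model_ab` to one rebuild every 4 hours. The 4-hour interval and the file location are illustrative assumptions, not recommended values:
+
+```yaml
+# models/properties.yml (hypothetical location)
+models:
+  - name: model_ab
+    config:
+      freshness:
+        build_after:
+          count: 4          # assumed interval; adjust to your needs
+          period: hour
+          updates_on: any   # default; rebuild when any upstream has fresh data
+```
+
+With a setting like this, a second job that reaches `model_ab` within the interval should reuse the existing build instead of rebuilding it.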
+
 ## Efficient testing in state-aware orchestration <Lifecycle status="private_beta" />
 
 :::info Private beta feature
@@ -134,8 +147,12 @@ The following section lists some considerations when using Efficient testing in
     store_failures: true | false
     where: <string>
   ```
+
+ - **Efficient testing is available only in deploy jobs**. CI and merge jobs currently do not have the option to enable this feature. 
+
+## Related FAQs
 
-- **Efficient testing is available only in deploy jobs**. CI and merge jobs currently do not have the option to enable this feature. 
+<FAQ path="Runs/sao-difference-core" />
 
 ## Related docs
 

@@ -92,24 +92,23 @@ import DeleteJob from '/snippets/_delete-job.md';
 
 By default, we use the warehouse metadata to check if sources (or upstream models in the case of Mesh) are fresh. For more advanced use cases, dbt provides other options that enable you to specify what gets run by state-aware orchestration. 
 
-You can customize with:
-- `loaded_at_field`: Specify a specific column to use from the data.
+You can use the following optional parameters to customize your state-aware orchestration:
 
-- `loaded_at_query`: Define a custom freshness condition in SQL to account for partial loading or streaming data.
+|Parameter | Description | Allowed values | Supports Jinja |
+|----------|-------------| -------------- | -------------- |
+| `loaded_at_field` | Specifies the column to use from the data. | Name of timestamp column. For example, `created_at`, `"CAST(created_at AS TIMESTAMP)"`. | ✅ |
+| `loaded_at_query` | Defines a custom freshness condition in SQL to account for partial loading or streaming data. | SQL string. For example, `"select {{ current_timestamp() }}"`. | ✅ |
+| `build_after.count` | Determines how many units of time must pass before a model can be rebuilt to help reduce build frequency. | A positive integer or a Jinja expression. For example, `4` or `"{{ var('build_after_count', 4) }}"`. | ✅ |
+| `build_after.period` | The time unit for the count to define the build interval. | `minute`, `hour`, `day`, or a Jinja expression (for example, `"{{ var('build_after_period', 'day') }}"`). | ✅ |
+| `build_after.updates_on` | Determines whether a model rebuild is triggered when any upstream dependency has fresh data or only when all upstream dependencies are fresh. | <li>`any` (default) &mdash; Use this value when you want a downstream model to rebuild if _any_ of its upstream dependencies receives fresh data, even if others haven’t.</li> <li>`all` &mdash; Use this value when you want to trigger a rebuild only when _all_ upstream dependencies are fresh &mdash; minimizing unnecessary builds and reducing compute cost. Recommended to use in state-aware orchestration.</li> | ❌ |
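+
+For instance, a project-level default might combine the `build_after` parameters as in the following sketch. The project name `my_project` is an assumption for illustration, and the variable names reuse the sample values from the table:
+
+```yaml
+# dbt_project.yml
+models:
+  my_project:   # hypothetical project name
+    +freshness:
+      build_after:
+        count: "{{ var('build_after_count', 4) }}"        # falls back to 4 if the var is unset
+        period: "{{ var('build_after_period', 'day') }}"  # falls back to day if the var is unset
+        updates_on: all   # literal value; this parameter doesn't support Jinja
+```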
 
-If a source is a view in the data warehouse, dbt can’t track updates from the warehouse metadata when the view changes. Without a `loaded_at_field` or `loaded_at_query`, dbt treats the source as "always fresh” and emits a warning during freshness checks. To check freshness for sources that are views, add a `loaded_at_field` or `loaded_at_query` to your configuration.
+Keep the following in mind when using `loaded_at_field` or `loaded_at_query`:
+- You can define either `loaded_at_field` or `loaded_at_query`, but not both.
+- If a source is a view in the data warehouse, dbt can’t track updates from the warehouse metadata when the view changes. Without a `loaded_at_field` or `loaded_at_query`, dbt treats the source as "always fresh" and emits a warning during freshness checks. To check freshness for sources that are views, add a `loaded_at_field` or `loaded_at_query` to your configuration.
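+
+As a sketch of the view case, the following source definition sets `loaded_at_field` so dbt can check freshness from a column instead of warehouse metadata. The source, table, and column names are hypothetical:
+
+```yaml
+sources:
+  - name: raw_events          # hypothetical source name
+    tables:
+      - name: events_view     # hypothetical view in the warehouse
+        config:
+          loaded_at_field: ingested_at   # assumed timestamp column populated at load time
+```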
 
-:::note 
-You can either define `loaded_at_field` or `loaded_at_query` but not both.
-:::
-You can also customize with:
-- `updates_on`: Change the default from `any` to `all` so it doesn’t build unless all upstreams have fresh data reducing compute even more.
-- `build_after`: Don’t build a model more often than every x period to reduce build frequency when you need data less often than sources are fresh.
-
-
-To learn more about model freshness and build after, refer to [model `freshness` config](/reference/resource-configs/freshness). To learn more about source and upstream model freshness configs, refer to [resource `freshness` config](/reference/resource-properties/freshness).
+To learn more about model freshness and `build_after`, refer to [model `freshness` config](/reference/resource-configs/freshness). To learn more about source and upstream model freshness configs, refer to [resource `freshness` config](/reference/resource-properties/freshness).
 
-## Customizing behavior
+### Customizing behavior
 
 You can optionally configure state-aware orchestration when you want to fine-tune orchestration behavior for these reasons:
 
@@ -142,6 +141,37 @@ You can optionally configure state-aware orchestration when you want to fine-tun
   - `model/properties.yml` at the model level in YAML
   - `model/model.sql` at the model level in SQL
 These configurations are powerful because you can define a sensible default at the project level or for specific model folders, and override it for individual models or model groups that require more frequent updates.
+
+### Handling late-arriving data 
+
+If your incremental models use a lookback window to capture late-arriving data, make sure your freshness logic aligns with that window.
+
+When you use a `loaded_at_field` or `loaded_at_query`, state-aware orchestration uses that value to determine whether new data has arrived. When the `loaded_at` value reflects an event timestamp (for example, `event_date`), late-arriving records may not update this value if the event occurred in the past. In these cases, state-aware orchestration may not trigger a rebuild, even though your incremental model’s lookback window would normally include those rows.
+
+To ensure late-arriving data is detected by state-aware orchestration, your `loaded_at_field` or `loaded_at_query` should align with the same lookback window used in your incremental filter. See the following sample values for `loaded_at_field` and `loaded_at_query`:
+
+
+<Tabs>
+<TabItem value="loaded_at_field" label="loaded_at_field">
+
+```yaml
+loaded_at_field: ingested_at
+```
+</TabItem>
+
+<TabItem value="loaded_at_query" label="loaded_at_query">
+
+```yaml
+loaded_at_query: |
+  select max(ingested_at)
+  from source_table
+  where ingested_at >= current_timestamp - interval '3 days'
+```
+
+</TabItem>
+</Tabs>
+
+
 ## Example
 
 Let's use an example to illustrate how to customize our project so a model and its parent model are rebuilt only if they haven't been refreshed in the past 4 hours &mdash; even if a job runs more frequently than that.

@@ -0,0 +1,20 @@
+---
+title: How is state-aware orchestration different from using selectors in dbt Core?
+description: "Compare how state-aware orchestration differs from using selectors in dbt Core"
+sidebar_label: 'State-aware orchestration vs selectors in dbt Core'
+id: sao-difference-core
+
+---
+
+In <Constant name="core" />, running with the selectors `state:modified+` and `source_status:fresher+` builds models that either:
+
+- Have changed since the prior run (`state:modified+`)
+- Have upstream sources that are fresher than in the prior run (`source_status:fresher+`)
+
+Instead of relying only on these selectors and prior-run artifacts, state-aware orchestration decides whether to rebuild a model based on:
+
+- Compiled SQL diffs that ignore non-meaningful changes like whitespace and comments
+- Upstream data changes at runtime and model-level freshness settings
+- Shared state across jobs
+
+While <Constant name="core" /> uses selectors like `state:modified+` and `source_status:fresher+` to decide what to build _only for a single run in a single job_, state-aware orchestration with <Constant name="fusion" /> maintains a _shared, real-time model state across every job in the environment_ and uses that state to determine whether a model’s code or upstream data have actually changed before rebuilding. This ensures dbt only rebuilds models when something has changed, no matter which job runs them.