Add semantic model parsing architecture doc (docs/arch/3.3_Semantic_Models.md) by theyostalservice · Pull Request #12765 · dbt-labs/dbt-core

theyostalservice · 2026-04-01T22:15:32Z

Why

docs/arch/ has detailed architecture docs for parsing (3_Parsing.md, 3.1_Partial_Parsing.md) but nothing covering semantic model parsing. Investigating DI-3697 required ~20 minutes of exploratory code reading to reconstruct knowledge that would have taken 2 minutes with a reference doc. Adding this now so future contributors (and AI agents) can orient quickly.

What

New: docs/arch/3.3_Semantic_Models.md — covers:
- v1 standalone vs v2 inline authoring formats and their parsing paths
- Key files with a table (unparsed.py, schema_yaml_readers.py, schemas.py, files.py, partial.py)
- SchemaSourceFile tracking fields (semantic_models, node_patches, metrics_from_measures, etc.)
- Full parsing flow traces for both v1 and v2
- Partial parsing considerations including the v2 gap fixed in DI-3697 and a known remaining limitation
- Testing patterns, test locations, and fixture conventions
Updated: AGENTS.md — adds an "Architecture Documentation" section at the top pointing to docs/arch/ with a quick-reference table of key docs
New: added another doc for troubleshooting SL parsing issues. This relates to a number of user requests, but came out of the work done for Improve error message for unknown fields in semantic_model config #12766.

Refs

Related fix: Fix partial parsing duplicate for v2 inline semantic models #12763 (DI-3697)
Related to the troubleshooting document: Improve error message for unknown fields in semantic_model config #12766

Drafted by claude-sonnet-4-6 under the direction of @theyostalservice

codecov · 2026-04-01T22:18:29Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.49%. Comparing base (eee9587) to head (36a9727).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #12765      +/-   ##
==========================================
+ Coverage   91.41%   91.49%   +0.07%     
==========================================
  Files         203      203              
  Lines       25844    25945     +101     
==========================================
+ Hits        23626    23739     +113     
+ Misses       2218     2206      -12

| Flag | Coverage Δ | |
|---|---|---|
| integration | `88.38% <ø> (+0.09%)` | ⬆️ |
| unit | `65.75% <ø> (+0.19%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ | |
|---|---|---|
| Unit Tests | `65.75% <ø> (+0.19%)` | ⬆️ |
| Integration Tests | `88.38% <ø> (+0.09%)` | ⬆️ |

QMalcolm

This is on the right track! Thank you for putting this together. One thing we should consider before moving forward is the status of the v1 specification. Is it supported but not encouraged? That is, we shouldn't break it, but it is not where improvements or changes should happen.

QMalcolm · 2026-04-01T23:28:41Z

docs/arch/3.3_Semantic_Models.md

+
+## Overview
+
+Semantic models are first-class resources in dbt-core that expose model data to MetricFlow for metric computation. They define the *entities*, *dimensions*, and *measures* of a model in terms the Semantic Layer can query. Parsing produces `SemanticModel` nodes in the manifest, which are later validated by `dbt_semantic_interfaces`.


> which are later validated by dbt_semantic_interfaces

Soon to be out of date 😂 No change needed here yet, just found it entertaining

QMalcolm · 2026-04-01T23:44:37Z

docs/arch/3.3_Semantic_Models.md

+Defined as an independent entry under a top-level `semantic_models:` key in any schema YAML file:
+
+```yaml
+semantic_models:
+  - name: revenue
+    model: ref('fct_revenue')
+    entities:
+      - name: transaction
+        type: primary
+    dimensions:
+      - name: ds
+        type: time
+        type_params:
+          time_granularity: day
+    measures:
+      - name: revenue
+        agg: sum
+        expr: amount
+```
+
+Parsed by `SemanticModelParser.parse()` in `schema_yaml_readers.py`. The semantic model is a fully independent entry in the YAML; its `model: ref('...')` field links it to the referenced model node via `depends_on`.


Is v1 deprecated? That is, do we no longer want to encourage the authoring of v1 metrics? If so, we should probably note that in this file.

I'll update it with a note. The answer is that v2 YAML should be the default in all things going forward, but there are several specific situations where v1 supports things v2 does not, and we are not able to deprecate v1 at this time.

Add semantic model parsing architecture doc and surface docs/arch/ in…

cc7dd28

… AGENTS.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cla-bot bot added the cla:yes label Apr 1, 2026

theyostalservice mentioned this pull request Apr 1, 2026

Improve error message for unknown fields in semantic_model config #12766

Merged

theyostalservice and others added 2 commits April 1, 2026 16:01

Add semantic layer parse failure troubleshooting doc

01a2690

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Small manual fixes and reordering.

36a9727

theyostalservice marked this pull request as ready for review April 1, 2026 23:15

theyostalservice requested a review from a team as a code owner April 1, 2026 23:15

theyostalservice requested a review from QMalcolm April 1, 2026 23:15

QMalcolm closed this in #12766 Apr 1, 2026

QMalcolm reopened this Apr 1, 2026

QMalcolm reviewed Apr 1, 2026

theyostalservice wants to merge 3 commits into main from patricky/di-3697-semantic-model-arch-docs
