Commit 21f06dd

Docs(dbt): update for yaml config (#5169)
1 parent 19270e1 commit 21f06dd

1 file changed: +88 -26 lines changed

docs/integrations/dbt.md

Lines changed: 88 additions & 26 deletions
@@ -44,16 +44,36 @@ Prepare an existing dbt project to be run by SQLMesh by executing the `sqlmesh i
 $ sqlmesh init -t dbt
 ```
 
-SQLMesh will use the data warehouse connection target in your dbt project `profiles.yml` file. The target can be changed at any time.
+This will create a file called `sqlmesh.yaml` containing the [default model start date](../reference/model_configuration.md#model-defaults). This configuration file is a minimum starting point for enabling SQLMesh to work with your dbt project.
+
+As you become more comfortable with running your project under SQLMesh, you may specify additional SQLMesh [configuration](../reference/configuration.md) as required to unlock more features.
+
+!!! note "profiles.yml"
+
+    SQLMesh will use the existing data warehouse connection target from your dbt project's `profiles.yml` file, so the connection configuration does not need to be duplicated in `sqlmesh.yaml`. You may change the target at any time in the dbt config and SQLMesh will pick up the new target.
 
 ### Setting model backfill start dates
 
-Models **require** a start date for backfilling data through use of the `start` configuration parameter. `start` can be defined individually for each model in its `config` block or globally in the `dbt_project.yml` file as follows:
+Models **require** a start date for backfilling data through use of the `start` configuration parameter. `start` can be defined individually for each model in its `config` block or globally in the `sqlmesh.yaml` file as follows:
 
-```
-> models:
->   +start: Jan 1 2000
-```
+=== "sqlmesh.yaml"
+
+    ```yaml
+    model_defaults:
+      start: '2000-01-01'
+    ```
+
+=== "dbt Model"
+
+    ```jinja
+    {{
+      config(
+        materialized='incremental',
+        start='2000-01-01',
+        ...
+      )
+    }}
+    ```
 
 ### Configuration
 
@@ -63,47 +83,89 @@ SQLMesh derives a project's configuration from its dbt configuration files. This
 
 [Certain engines](https://sqlmesh.readthedocs.io/en/stable/guides/configuration/?h=unsupported#state-connection), like Trino, cannot be used to store SQLMesh's state.
 
-As a workaround, we recommend specifying a supported state engine using the `state_connection` argument instead.
+In addition, even if your warehouse is supported for state, you may find that you get better performance by using a [traditional database](../concepts/state.md) to store state, as these are a better fit for the state workload than a warehouse optimized for analytics workloads.
 
-Learn more about how to configure state connections in Python [here](https://sqlmesh.readthedocs.io/en/stable/guides/configuration/#state-connection).
+In these cases, we recommend specifying a [supported production state engine](../concepts/state.md#state) using the `state_connection` configuration.
 
-#### Runtime vars
+This involves updating `sqlmesh.yaml` to add a gateway configuration for the state connection:
 
-dbt supports passing variable values at runtime with its [CLI `vars` option](https://docs.getdbt.com/docs/build/project-variables#defining-variables-on-the-command-line).
+```yaml
+gateways:
+  "": # "" (empty string) is the default gateway
+    state_connection:
+      type: postgres
+      ...
 
-In SQLMesh, these variables are passed via configurations. When you initialize a dbt project with `sqlmesh init`, a file `config.py` is created in your project directory.
+model_defaults:
+  start: '2000-01-01'
+```
 
-The file creates a SQLMesh `config` object pointing to the project directory:
+Or, for a specific dbt profile defined in `profiles.yml`, e.g. `dev`:
 
-```python
-config = sqlmesh_config(Path(__file__).parent)
+```yaml
+gateways:
+  dev: # must match the target dbt profile name
+    state_connection:
+      type: postgres
+      ...
+
+model_defaults:
+  start: '2000-01-01'
 ```
 
-Specify runtime variables by adding a Python dictionary to the `sqlmesh_config()` `variables` argument.
+Learn more about how to configure state connections [here](https://sqlmesh.readthedocs.io/en/stable/guides/configuration/#state-connection).
+
+#### Runtime vars
+
+dbt supports passing variable values at runtime with its [CLI `vars` option](https://docs.getdbt.com/docs/build/project-variables#defining-variables-on-the-command-line).
+
+In SQLMesh, these variables are passed via configurations. When you initialize a dbt project with `sqlmesh init`, a file `sqlmesh.yaml` is created in your project directory.
+
+You may define global variables in the same way as a native project by adding a `variables` section to the config.
 
 For example, we could specify the runtime variable `is_marketing` and its value `no` as:
 
-```python
-config = sqlmesh_config(
-    Path(__file__).parent,
-    variables={"is_marketing": "no"}
-)
+```yaml
+variables:
+  is_marketing: no
+
+model_defaults:
+  start: '2000-01-01'
 ```
 
+Variables can also be set at the gateway/profile level, where they override variables set at the project level. See the [variables documentation](../concepts/macros/sqlmesh_macros.md#gateway-variables) to learn more about how to specify them at different levels.
+
+#### Combinations
+
 Some projects use combinations of runtime variables to control project behavior. Different combinations can be specified in different `sqlmesh_config` objects, with the relevant configuration passed to the SQLMesh CLI command.
 
+!!! info "Python config"
+
+    Switching between different config objects requires the use of [Python config](../guides/configuration.md#python) instead of the default YAML config.
+
+    You will need to create a file called `config.py` in the root of your project with the following contents:
+
+    ```py
+    from pathlib import Path
+    from sqlmesh.dbt.loader import sqlmesh_config
+
+    config = sqlmesh_config(Path(__file__).parent)
+    ```
+
+    Note that any config from `sqlmesh.yaml` will be overlaid on top of the active Python config, so you don't need to remove the `sqlmesh.yaml` file.
+
 For example, consider a project with a special configuration for the `marketing` department. We could create separate configurations to pass at runtime like this:
 
 ```python
 config = sqlmesh_config(
-    Path(__file__).parent,
-    variables={"is_marketing": "no", "include_pii": "no"}
-)
+    Path(__file__).parent,
+    variables={"is_marketing": "no", "include_pii": "no"}
+)
 
 marketing_config = sqlmesh_config(
-    Path(__file__).parent,
-    variables={"is_marketing": "yes", "include_pii": "yes"}
-)
+    Path(__file__).parent,
+    variables={"is_marketing": "yes", "include_pii": "yes"}
+)
 ```
 
 By default, SQLMesh will use the configuration object named `config`. Use a different configuration by passing the object name to SQLMesh CLI commands with the `--config` option. For example, we could run a `plan` with the marketing configuration like this:
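
As a rough picture of what choosing a configuration object by name means, the following self-contained Python sketch may help. It is not SQLMesh's loader and does not call any SQLMesh API: `select_config` is a hypothetical helper, `config_module` stands in for a project's `config.py`, and plain dictionaries replace the objects that `sqlmesh_config()` would return. It only illustrates that `config` is the default name and that another module-level name, such as `marketing_config`, can be picked instead.

```python
from types import SimpleNamespace

# Stand-in for a project's config.py module. Plain dicts replace the real
# objects returned by sqlmesh_config() (an assumption for illustration only).
config_module = SimpleNamespace(
    config={"is_marketing": "no", "include_pii": "no"},
    marketing_config={"is_marketing": "yes", "include_pii": "yes"},
)


def select_config(module, name="config"):
    """Hypothetical helper: return the module-level object called `name`.

    Falls back to the name "config" when no name is given, mirroring the
    documented default.
    """
    try:
        return getattr(module, name)
    except AttributeError:
        raise ValueError(f"no configuration object named {name!r}") from None


# No name given: the object called `config` is used.
default_vars = select_config(config_module)

# Explicit name: the marketing variant is used instead.
marketing_vars = select_config(config_module, "marketing_config")

print(default_vars["is_marketing"], marketing_vars["is_marketing"])
```

Keeping each combination as its own named object means every variable set stays in one place, and switching between them is a matter of passing a different name rather than editing values.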
