|
| 1 | +--- |
| 2 | +title: Open Source |
| 3 | +icon: material/package-variant |
| 4 | +--- |
| 5 | + |
| 6 | + |
| 7 | +## Install |
| 8 | + |
| 9 | +From within a dbt project directory: |
| 10 | +```shell |
| 11 | +cd your-dbt-project/ # if you're not already there |
| 12 | +pip install -U recce |
| 13 | +``` |
| 14 | + |
| 15 | + |
| 16 | +## Launch |
| 17 | +To start Recce in the current environment: |
| 18 | +```shell |
| 19 | +recce server |
| 20 | +``` |
| 21 | +Launching Recce enables: |
| 22 | + |
| 23 | +- **Lineage clarity**: Trace changes down to the column level |
| 24 | + |
| 25 | +- **Query insights**: Explore logic and run custom queries |
| 26 | + |
| 27 | +- **Live diffing**: Reload and inspect changes as you iterate |
| 28 | + |
| 29 | +This mode is best suited for quick exploration before moving to structured validation using Diff. |
| 30 | + |
| 31 | +<!-- <insert the gif of sign in flow step 2> --> |
| 32 | + |
| 33 | + |
| 34 | +## Configure Diff |
| 35 | + |
| 36 | +To compare changes, Recce needs a baseline. This guide explains the concept of Diff in Recce and how it fits into data validation workflows. Setup steps vary by environment, so this guide focuses on the core ideas rather than copy-paste instructions. |
| 37 | + |
| 38 | +For a concrete example, refer to the [5-minute Jaffle Shop tutorial](./get-started-jaffle-shop/). |
| 39 | + |
| 40 | +To configure a comparison in Recce, two components are required: |
| 41 | + |
| 42 | +### 1. Artifacts |
| 43 | + |
| 44 | +Recce uses dbt [artifacts](https://docs.getdbt.com/reference/artifacts/dbt-artifacts) to perform diffs. These files are generated with each dbt run and typically saved in the `target/` folder. |
| 45 | + |
| 46 | +In addition to the current artifacts, a second set is needed to serve as the baseline for comparison. Recce looks for these in the `target-base/` folder. |
| 47 | + |
| 48 | +- `target/` – Artifacts from the current development environment |
| 49 | +- `target-base/` – Artifacts from a baseline environment (e.g., production) |
| 50 | + |
| 51 | +For most setups, retrieve the existing artifacts that were generated from the main branch (usually from a CI run or build cache) and save them into a `target-base/` folder. |
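
The step of saving baseline artifacts can be sketched as a small shell helper. This is only an illustration, not part of Recce: the function name `save_base_artifacts` and the idea of a local download directory are assumptions, and it assumes the downloaded files include `manifest.json`.

```shell
# Hypothetical helper (not a Recce command): copy dbt artifacts that were
# downloaded from a main-branch build into the project's target-base/ folder.
# Usage: save_base_artifacts <download_dir> [project_dir]
save_base_artifacts() {
  src_dir="$1"
  project_dir="${2:-.}"

  # Refuse to continue if the download does not look like dbt artifacts.
  if [ ! -f "$src_dir/manifest.json" ]; then
    echo "manifest.json not found in $src_dir" >&2
    return 1
  fi

  mkdir -p "$project_dir/target-base"
  # Copy every JSON artifact that is present (manifest.json, catalog.json, ...).
  cp "$src_dir"/*.json "$project_dir/target-base/"
}
```

For example, `save_base_artifacts ~/Downloads/main-artifacts` would populate `target-base/` in the current directory, where the download path is a placeholder for wherever your CI artifacts land.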
| 52 | + |
| 53 | +### 2. Schemas |
| 54 | + |
| 55 | +Recce also compares the actual query results between two dbt [environments](https://docs.getdbt.com/docs/core/dbt-core-environments), each pointing to a different [schema](https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles#understanding-target-schemas). This allows validation beyond metadata by comparing the data itself. |
| 56 | + |
| 57 | +For example: |
| 58 | + |
| 59 | +- `prod` schema for production |
| 60 | +- `dev` schema for development |
| 61 | + |
| 62 | +These schemas represent where dbt builds its models. |
| 63 | + |
| 64 | +!!! tip |
| 65 | + |
| 66 | +    In dbt, an environment typically maps to a schema. To compare data results, separate schemas are required. Learn more in [dbt environments](https://docs.getdbt.com/docs/core/dbt-core-environments). |
| 67 | + |
| 68 | +Schemas are typically configured in the `profiles.yml` file, which defines how dbt connects to the data platform. Both schemas must be accessible for Recce to perform environment-based comparisons. |
| 69 | + |
| 70 | +Once both artifacts and schemas are configured, Recce can surface meaningful diffs across logic, metadata, and data. |
| 71 | + |
| 72 | +## Verify your setup |
| 73 | + |
| 74 | +There are two ways to check that your configuration is complete: |
| 75 | + |
| 76 | +### 1. Debug Command (CLI) |
| 77 | + |
| 78 | +Run `recce debug` from the command line to verify your setup before launching the server: |
| 79 | + |
| 80 | +```bash |
| 81 | +recce debug |
| 82 | +``` |
| 83 | + |
| 84 | +This command checks the artifacts, directories, and warehouse connection, and gives detailed feedback on any missing components. |
| 85 | + |
| 86 | +### 2. Environment Info (Web UI) |
| 87 | + |
| 88 | +Use **Environment Info** in the top-right corner of the Recce web interface to verify your configuration. |
| 89 | + |
| 90 | +A correctly configured setup will display two environments: |
| 91 | + |
| 92 | +- **Base** – the reference schema used for comparison (e.g., production) |
| 93 | +- **Current** – the schema for the environment under development (e.g., staging or dev) |
| 94 | + |
| 95 | +This confirms that both the artifacts and schemas are properly connected for diffing. |
| 96 | + |
| 97 | + |
| 98 | + |
| 99 | +# Start with dbt Cloud |
| 100 | + |
| 101 | +dbt Cloud is a hosted service from [dbt Labs](https://docs.getdbt.com/docs/cloud/about-cloud/dbt-cloud-features) that provides a managed environment for running dbt projects. This document is a step-by-step guide to getting started with `recce` on dbt Cloud. |
| 102 | + |
| 103 | +## Prerequisites |
| 104 | + |
| 105 | +`Recce` compares the data models between two environments, so your dbt Cloud project needs two environments: for example, one for production and another for development. |
| 106 | +You also need to provide a credentials profile for both environments in your `profiles.yml` file so that `Recce` can access your data warehouse. |
| 107 | + |
| 108 | +### Suggestions for setting up dbt Cloud |
| 109 | + |
| 110 | +To integrate dbt Cloud with Recce, we suggest setting up two run jobs in your dbt Cloud project. |
| 111 | + |
| 112 | +#### Production Run Job |
| 113 | + |
| 114 | +The production run should build the main branch of your dbt project. You can trigger the dbt Cloud job on every merge to the main branch or schedule it to run at a specific time each day. |
| 115 | + |
| 116 | +#### Development Run Job |
| 117 | + |
| 118 | +The development run should build a separate branch of your dbt project. You can trigger the dbt Cloud job on every merge to the pull-request branch. |
| 119 | + |
| 120 | +### Set up dbt profiles with credentials |
| 121 | + |
| 122 | +You need to provide a credentials profile for both environments in your `profiles.yml` file. Here is an example of what your `profiles.yml` file might look like: |
| 123 | + |
| 124 | +```yaml |
| 125 | +dbt-example-project: |
| 126 | +  target: dev |
| 127 | +  outputs: |
| 128 | +    dev: |
| 129 | +      type: snowflake |
| 130 | +      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}" |
| 131 | + |
| 132 | +      # User/password auth |
| 133 | +      user: "{{ env_var('SNOWFLAKE_USER') | as_text }}" |
| 134 | +      password: "{{ env_var('SNOWFLAKE_PASSWORD') | as_text }}" |
| 135 | + |
| 136 | +      role: DEVELOPER |
| 137 | +      database: cloud_database |
| 138 | +      warehouse: LOAD_WH |
| 139 | +      schema: "{{ env_var('SNOWFLAKE_SCHEMA') | as_text }}" |
| 140 | +      threads: 4 |
| 141 | +    prod: |
| 142 | +      type: snowflake |
| 143 | +      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}" |
| 144 | + |
| 145 | +      # User/password auth |
| 146 | +      user: "{{ env_var('SNOWFLAKE_USER') | as_text }}" |
| 147 | +      password: "{{ env_var('SNOWFLAKE_PASSWORD') | as_text }}" |
| 148 | + |
| 149 | +      role: DEVELOPER |
| 150 | +      database: cloud_database |
| 151 | +      warehouse: LOAD_WH |
| 152 | +      schema: PUBLIC |
| 153 | +      threads: 4 |
| 154 | +``` |
| 155 | +
|
| 156 | +## Install `Recce` |
| 157 | + |
| 158 | +Install Recce using `pip`: |
| 159 | + |
| 160 | +```shell |
| 161 | +pip install -U recce |
| 162 | +``` |
| 163 | + |
| 164 | +## Execute Recce with dbt Cloud |
| 165 | + |
| 166 | +To compare the data models between two environments, you need to download the dbt Cloud artifacts for both environments. The artifacts include the `manifest.json` and `catalog.json` files. You can download the artifacts from the dbt Cloud UI. |
| 167 | + |
| 168 | +### Log in to your dbt Cloud account |
| 169 | + |
| 170 | + |
| 171 | + |
| 172 | +### Go to the project you want to compare |
| 173 | + |
| 174 | + |
| 175 | + |
| 176 | +### Download the dbt artifacts |
| 177 | + |
| 178 | +Download the artifacts from the latest run of both run jobs. You can download the artifacts from the `Artifacts` tab. |
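
If you prefer scripting over the UI, the same download can be sketched with `curl`. This is a hedged illustration rather than a documented Recce workflow: it assumes the dbt Cloud Administrative API v2 run-artifact endpoint, a host of `cloud.getdbt.com` (yours may differ by region or plan), and an API token in a `DBT_CLOUD_API_TOKEN` environment variable. The function names and the account and run IDs are hypothetical placeholders.

```shell
# Assumption: host of your dbt Cloud instance; override DBT_CLOUD_HOST if needed.
DBT_CLOUD_HOST="${DBT_CLOUD_HOST:-cloud.getdbt.com}"

# Build the artifact URL for one file of one run.
# $1 = account ID, $2 = run ID, $3 = artifact file name
artifact_url() {
  echo "https://${DBT_CLOUD_HOST}/api/v2/accounts/$1/runs/$2/artifacts/$3"
}

# Download manifest.json and catalog.json for a run into a folder.
# $1 = account ID, $2 = run ID, $3 = destination folder
download_run_artifacts() {
  mkdir -p "$3"
  for artifact_name in manifest.json catalog.json; do
    curl --fail --silent --show-error \
      -H "Authorization: Token ${DBT_CLOUD_API_TOKEN}" \
      -o "$3/$artifact_name" \
      "$(artifact_url "$1" "$2" "$artifact_name")" || return 1
  done
}
```

With real IDs in place of the placeholders, `download_run_artifacts <account_id> <prod_run_id> target-base` and `download_run_artifacts <account_id> <dev_run_id> target` would fetch both sets of artifacts.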
| 179 | + |
| 180 | + |
| 181 | + |
| 182 | + |
| 183 | +### Set up the dbt artifacts folders |
| 184 | + |
| 185 | +Extract the downloaded artifacts and keep them in separate folders: the production artifacts in the `target-base` folder and the development artifacts in the `target` folder. |
| 186 | + |
| 187 | +```bash |
| 188 | +$ tree target target-base |
| 189 | +target |
| 190 | +├── catalog.json |
| 191 | +└── manifest.json |
| 192 | +target-base |
| 193 | +├── catalog.json |
| 194 | +└── manifest.json |
| 195 | +``` |
| 196 | + |
| 197 | +### Set up the dbt project |
| 198 | + |
| 199 | +Move the `target` and `target-base` folders to the root of your dbt project. |
| 200 | +You should also have the `profiles.yml` file in the root of your dbt project with the credentials profile for both environments. |
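
The layout described above can be checked with a quick shell sketch before starting the server. The function `check_recce_layout` is a hypothetical helper, not a Recce command; it only tests that the expected files exist, whereas `recce debug` also verifies the warehouse connection.

```shell
# Hypothetical pre-flight check (not part of Recce): confirm that the files
# named in this guide exist under the dbt project root.
# Usage: check_recce_layout [project_dir]
check_recce_layout() {
  project_dir="${1:-.}"
  missing=0
  for required_file in \
    target/manifest.json target/catalog.json \
    target-base/manifest.json target-base/catalog.json \
    profiles.yml
  do
    if [ ! -f "$project_dir/$required_file" ]; then
      echo "missing: $required_file" >&2
      missing=1
    fi
  done
  # Non-zero exit status when at least one file is absent.
  return "$missing"
}
```

Running `check_recce_layout` from the project root prints one `missing:` line per absent file and returns a non-zero status if anything is missing.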
| 201 | + |
| 202 | +### Start the `Recce` server |
| 203 | + |
| 204 | +Run `recce server` to compare the data models between the two environments. |
| 205 | + |
| 206 | +```shell |
| 207 | +recce server |
| 208 | +``` |
| 209 | + |