docs/features/lineage.md
Lines changed: 19 additions & 47 deletions
Original file line number
Diff line number
Diff line change
@@ -1,22 +1,21 @@
1
1
---
2
2
title: Lineage
3
3
icon: material/file-tree
4
-
5
4
---
6
5
7
6
The Lineage Diff is the main interface to Recce and allows you to quickly see the potential area of impact from your dbt data modeling changes.
8
7
9
8
## Lineage Diff
10
-
It's from the Lineage Diff that you will determine which models to investigate further; and also perform the various data validation checks that will serve as proof-of-correctness of your work.
11
9
10
+
It's from the Lineage Diff that you will determine which models to investigate further; and also perform the various data validation checks that will serve as proof-of-correctness of your work.
Models are color-coded to indicate their **status**:
@@ -42,56 +41,52 @@ The two icons at the bottom right of each node indicate if a `row count` or `sch
42
41
Click a model to open the [node details](#node-detail) panel and perform other data validation checks.
43
42
44
43
### Filter Nodes
44
+
45
45
In the top control bar, you can change the rule to filter the nodes:
46
46
47
47
1. **Mode:**
48
-
- **Changed Models:** Modified nodes and their downstream + 1st degree of their parents.
49
-
- **All:** Show all nodes.
48
+
- **Changed Models:** Modified nodes and their downstream + 1st degree of their parents.
49
+
- **All:** Show all nodes.
50
50
1. **Package:** Filter by dbt package names.
51
51
1. **Select:** Select nodes by [node selection](./node-selection.md).
52
52
1. **Exclude:** Exclude nodes by [node selection](./node-selection.md).
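
The **Changed Models** rule above can be read as a small graph traversal. The following is a minimal sketch, not Recce's implementation: the function name `changed_models_view`, the `parents` input shape, and the reading of "1st degree of their parents" as the direct parents of the modified nodes are all assumptions made for illustration.

```python
from collections import deque


def changed_models_view(parents, modified):
    """Return the nodes shown under a 'Changed Models'-style filter.

    parents:  dict mapping each node to a list of its direct upstream nodes
              (hypothetical input shape, not a Recce data structure).
    modified: iterable of modified node names.
    """
    # Invert the parent map so we can walk downstream.
    children = {}
    for node, ups in parents.items():
        children.setdefault(node, set())
        for up in ups:
            children.setdefault(up, set()).add(node)

    # Breadth-first walk: modified nodes plus everything downstream of them.
    visible = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in visible:
                visible.add(child)
                queue.append(child)

    # Assumption: add only the direct (1st-degree) parents of modified nodes.
    for node in modified:
        visible.update(parents.get(node, ()))
    return visible


# Illustrative lineage; the model names are made up for this example.
example_parents = {
    "stg_orders": ["raw_orders"],
    "stg_customers": ["raw_customers"],
    "orders": ["stg_orders", "stg_customers"],
    "customers": ["orders"],
}
example_view = changed_models_view(example_parents, {"orders"})
```

With `orders` modified, the view contains `orders`, its downstream model `customers`, and its direct parents `stg_orders` and `stg_customers`; the raw sources two steps upstream are left out.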
53
53
54
54
### Select Nodes
55
55
56
-
Click a node to select it, or click the **Select nodes** button at the top-right corner to select multiple nodes for further operations. For detail, see the [Multi Nodes Selections](#multi-nodes-selection) section
56
+
Click a node to select it, or click the **Select nodes** button at the top-right corner to select multiple nodes for further operations. For detail, see the [Multi Nodes Selections](#multi-nodes-selection) section
57
57
58
58
### Row Count Diff
59
59
60
60
A row count diff can be performed on nodes selected using the `select` and `exclude` options:
1. Clicking the 3 dots (**...**) button at the top-right corner.
68
67
2. Clicking **Row Count Diff by Selector**.
69
68
70
-
71
69
## Node Details
72
70
73
-
The node details panel shows information about a node, such as node type, schema and row count changes, and allows you to perform diffs on the node using the options accessed via the `Explore Change` button.
71
+
The node details panel shows information about a node, such as node type, schema and row count changes, and allows you to perform diffs on the node using the options accessed via the `Explore Change` button.
74
72
75
73
### Schema Diff
76
74
77
75
Schema Diff shows added, removed, and renamed columns. Click a model in the Lineage Diff to open the node details and view the Schema Diff.
78
76
79
77
!!! Note
80
-
Schema Diff requires `catalog.json` in both environments.
81
-
78
+
Schema Diff requires `catalog.json` in both environments.
Row Count Diff shows the difference in row count between the base and current environments.
@@ -121,7 +116,6 @@ Value Diff shows the matched count and percentage for each column in the table.
121
116
122
117
The primary key is automatically inferred from the first column with the [unique](https://docs.getdbt.com/reference/resource-properties/data-tests#unique) test. If no primary key is detected, at least one column must be specified as the primary key.
123
118
124
-
125
119
<figure markdown>
126
120

127
121
<figcaption>Value Diff</figcaption>
@@ -132,17 +126,6 @@ The primary key is automatically inferred by the first column with the [unique](
132
126
- **Matched**: For a column, the count of matched values among common PKs.
133
127
- **Matched %**: For a column, the ratio of matched values to common PKs.
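
To make the two columns concrete, here is a minimal sketch of the arithmetic on in-memory rows. It is an illustration only and does not reproduce how Recce runs the comparison in the warehouse; the function `value_diff`, its parameters, and the sample rows are all hypothetical.

```python
def value_diff(base_rows, current_rows, primary_key, columns):
    """Compute (matched, matched ratio) per column over common primary keys.

    Rows are plain dicts; this shape is assumed for the example only.
    """
    base = {row[primary_key]: row for row in base_rows}
    current = {row[primary_key]: row for row in current_rows}
    # Only primary keys present in both environments are compared.
    common = base.keys() & current.keys()

    result = {}
    for column in columns:
        matched = sum(
            1 for pk in common if base[pk][column] == current[pk][column]
        )
        # The ratio is undefined when the environments share no keys.
        ratio = matched / len(common) if common else None
        result[column] = (matched, ratio)
    return result


# Made-up sample data: ids 2 and 3 are common to both environments.
base_rows = [
    {"id": 1, "status": "a", "amount": 10},
    {"id": 2, "status": "b", "amount": 20},
    {"id": 3, "status": "c", "amount": 30},
]
current_rows = [
    {"id": 2, "status": "b", "amount": 25},
    {"id": 3, "status": "c", "amount": 30},
    {"id": 4, "status": "d", "amount": 40},
]
example_result = value_diff(base_rows, current_rows, "id", ["status", "amount"])
```

In this sample, `status` matches on both common keys (ratio 1.0), while `amount` matches on only one of the two (ratio 0.5). Rows that exist in just one environment do not affect either figure.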
134
128
135
-
!!! note
136
-
137
-
Value Diff uses the `compare_column_values` from [audit-helper](https://hub.getdbt.com/dbt-labs/audit_helper/latest/). To use Value Diff, ensure that `audit-helper` is installed in your project.
138
-
139
-
```yaml
140
-
packages:
141
-
- package: dbt-labs/audit_helper
142
-
version: <version>
143
-
```
144
-
145
-
146
129
View mismatched values at the row level by clicking the `show mismatched values` option on a column name:
Please refer to the [dbt-profiler](https://hub.getdbt.com/data-mie/dbt_profiler/latest/#dbt-profiler) documentation for the definitions of profiling stats.
165
-
166
-
!!! Note
167
-
Profile diff uses the `get_profile` from [dbt-profiler](https://hub.getdbt.com/data-mie/dbt_profiler/latest/). To use Profile Diff, ensure that dbt-profiler is installed in your project.
146
+
The Statistics:
168
147
169
-
```yaml
170
-
packages:
171
-
- package: data-mie/dbt_profiler
172
-
version: <version>
173
-
```
148
+
- Row count
149
+
- Not null proportion
150
+
- Distinct proportion
151
+
- Distinct count
152
+
- Is unique
153
+
- Minimum
154
+
- Maximum
155
+
- Average
156
+
- Median
174
157
175
158
### Histogram Diff
176
159
177
-
Histogram Diff compares the distribution of a numeric column in an overlay histogram chart.
160
+
Histogram Diff compares the distribution of a numeric column in an overlay histogram chart.
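
An overlay only makes sense when both environments are binned with the same edges. The sketch below shows that idea in plain Python; it is not Recce's implementation, and the function `overlay_histogram`, the equal-width binning, and the default of 10 bins are assumptions chosen for illustration.

```python
def overlay_histogram(base_values, current_values, num_bins=10):
    """Bin two non-empty numeric samples using one shared set of bin edges.

    Returns (edges, base_counts, current_counts).
    """
    combined = list(base_values) + list(current_values)
    low, high = min(combined), max(combined)
    # Fall back to a width of 1 when every value is identical.
    width = (high - low) / num_bins or 1.0
    edges = [low + i * width for i in range(num_bins + 1)]

    def counts(values):
        out = [0] * num_bins
        for value in values:
            # Clamp so the maximum value lands in the last bin.
            index = min(int((value - low) / width), num_bins - 1)
            out[index] += 1
        return out

    return edges, counts(base_values), counts(current_values)


# Made-up samples for the base and current environments.
example_edges, example_base, example_current = overlay_histogram(
    [0, 1, 2, 3], [2, 3, 4, 8], num_bins=4
)
```

Because the edges come from the combined range, a shift in the current environment (here, the value 8) shows up as counts in bins where the base has none, rather than being hidden by separate scales.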
@@ -185,7 +168,6 @@ A Histogram Diff can be generated in two ways.
185
168
186
169
**Via the Explore Change button menu:**
187
170
188
-
189
171
1. Select the model from the Lineage DAG.
190
172
2. Click the `Explore Change` button.
191
173
3. Click `Histogram Diff`.
@@ -199,13 +181,11 @@ A Histogram Diff can be generated in two ways.
199
181
3. Click the vertical 3 dots `...`
200
182
4. Click `Histogram Diff`.
201
183
202
-
203
184
<figure markdown>
204
185
{: .shadow}
205
186
<figcaption>Generate a Recce Histogram Diff from the column options</figcaption>
206
187
</figure>
207
188
208
-
209
189
### Top-K Diff
210
190
211
191
Top-K Diff compares the distribution of a categorical column. The top 10 elements are shown by default, which can be expanded to the top 50 elements.
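
As a rough illustration of what such a comparison involves, the sketch below counts categories in each environment and lists the top `k` side by side. It is not Recce's implementation: the function `top_k_diff` is hypothetical, and ranking by the current environment's counts is an assumption.

```python
from collections import Counter


def top_k_diff(base_values, current_values, k=10):
    """Return [(category, base_count, current_count)] for the top-k categories.

    Assumption: categories are ranked by their count in the current environment.
    """
    base_counts = Counter(base_values)
    current_counts = Counter(current_values)
    return [
        (category, base_counts[category], count)
        for category, count in current_counts.most_common(k)
    ]


# Made-up categorical samples for the base and current environments.
example_top = top_k_diff(
    ["a", "a", "a", "b", "b", "c"],
    ["a", "a", "b", "b", "b", "b", "d"],
    k=2,
)
```

Here `b` grows from 2 to 4 rows and `a` drops from 3 to 2, so the two leading categories swap places between environments. A category missing from the base simply reports a base count of 0.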
@@ -215,12 +195,10 @@ Top-K Diff compares the distribution of a categorical column. The top 10 element
215
195
<figcaption>Recce Top-K Diff</figcaption>
216
196
</figure>
217
197
218
-
219
198
A Top-K Diff can be generated in two ways.
220
199
221
200
**Via the Explore Change button menu:**
222
201
223
-
224
202
1. Select the model from the Lineage DAG.
225
203
2. Click the `Explore Change` button.
226
204
3. Click `Top-K Diff`.
@@ -234,14 +212,11 @@ A Top-K Diff can be generated in two ways.
234
212
3. Click the vertical 3 dots `...`
235
213
4. Click `Top-K Diff`.
236
214
237
-
238
215
<figure markdown>
239
216
{: .shadow}
240
217
<figcaption>Generate a Recce Top-K Diff </figcaption>
241
218
</figure>
242
219
243
-
244
-
245
220
## Multi-Node Selection
246
221
247
222
Multiple nodes can be selected in the Lineage DAG. This enables actions to be performed on multiple nodes at the same time such as Row Count Diff, or Value Diff.
@@ -295,8 +270,6 @@ An example of selecting multiple nodes to perform a multi-node Value Diff:
295
270
<figcaption>Perform a Value Diff on multiple nodes</figcaption>
296
271
</figure>
297
272
298
-
299
-
300
273
## Screenshot
301
274
302
275
In the diff result, you can find a **Copy to Clipboard** button. It's a handy way to copy the result image to the clipboard and paste it into your PR comment.
@@ -339,7 +312,6 @@ For the majority of diffs, which are performed via the Explore Change dropdown m
339
312
<figcaption>Add a Check by clicking the Add to Checklist button in the diff results panel</figcaption>
340
313
</figure>
341
314
342
-
343
315
An example performing a Top-K diff and adding the results to the Checklist:
docs/installation.md
Lines changed: 0 additions & 15 deletions
Original file line number
Diff line number
Diff line change
@@ -11,18 +11,3 @@ Install `Recce` in your dbt project with pip:
11
11
pip install recce
12
12
```
13
13
14
-
To take full advantage of all the features of `Recce`, ensure that [dbt_profiler](https://hub.getdbt.com/data-mie/dbt_profiler/latest/) and [audit-helper](https://hub.getdbt.com/dbt-labs/audit_helper/latest/) are installed via the `packages.yml` file in your dbt project .
15
-
16
-
1. Add these two packages in the packages.yml
17
-
2. Do `dbt deps` to install these 2 packages.
18
-
19
-
```yaml
20
-
packages:
21
-
- package: dbt-labs/audit_helper
22
-
version: <version>
23
-
- package: data-mie/dbt_profiler
24
-
version: <version>
25
-
26
-
```
27
-
28
-
For full instructions on using `Recce`, check the [Getting Started](get-started.md) guide.
docs/recce-cloud/getting-started-recce-cloud.md
Lines changed: 1 addition & 13 deletions
Original file line number
Diff line number
Diff line change
@@ -77,18 +77,6 @@ Set up the Jaffle Shop project and install Recce.
77
77
+ schema: prod
78
78
+ threads: 24
79
79
```
80
-
1. Add the following packages required by Recce for some features (highly recommended). Create a `./packages.yml` file in the root of your project with the following packages:
81
-
```
82
-
packages:
83
-
- package: dbt-labs/audit_helper
84
-
version: 0.12.0
85
-
- package: data-mie/dbt_profiler
86
-
version: 0.8.2
87
-
```
88
-
Install the packages:
89
-
```
90
-
dbt deps
91
-
```
92
80
93
81
## Prepare the base environment
94
82
@@ -256,4 +244,4 @@ Back on the GitHub PR page, you'll notice that the Recce Cloud check status has
256
244
{: .shadow}
257
245
258
246
259
-
In a real-world situation you'd now be able to merge the PR with the confidence that the PR author had checked their work, and the reviewer both understands and has signed-off on any changes.
247
+
In a real-world situation you'd now be able to merge the PR with the confidence that the PR author had checked their work, and the reviewer both understands and has signed-off on any changes.