Commit 5f3e9d3

committed
adding in grammatical or rephrasing changes to sections 3-8
1 parent c911aba commit 5f3e9d3

14 files changed

+69
-69
lines changed

docs/3-visualized-change/column-level-lineage.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ title: Column-Level Lineage
44

55
Column-Level Lineage provides visibility into the upstream and downstream relationships of a column.
66

7-
Common use-cases for column-level lineage are
7+
Common use-cases for column-level lineage are:
88

99
1. **Source Exploration**: During development, column-level lineage helps you understand how a column is derived.
1010
2. **Impact Analysis**: When modifying the logic of a column, column-level lineage enables you to assess the potential impact across the entire DAG.
@@ -33,7 +33,7 @@ The transformation type is also displayed for each column, which will help you u
3333
| Pass-through |The column is directly selected from the upstream table. |
3434
| Renamed | The column is selected from the upstream table but with a different name. |
3535
| Derived | The column is created through transformations applied to upstream columns, such as calculations, conditions, functions, or aggregations. |
36-
| Source | The column is not derived from any upstream data. It may originate from a seed/source node, literal value, or data generation function. |
37-
| Unknown | We have no information about the transformation type. This could be due to a parse error, or other unknown reason. |
36+
| Source | The column is not derived from any upstream data. It may originate from a seed/source node, literal value or data generation function. |
37+
| Unknown | We have no information about the transformation type. This could be due to a parse error or other unknown reason. |
3838

3939

docs/3-visualized-change/multi-models.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -20,9 +20,9 @@ To select multiple models individually, click the checkbox on the models you wis
2020

2121
To select a node and all of its parents or children:
2222

23-
1. Click the checkbox on the node.
24-
2. Right-click the node.
25-
3. Click to select either parent or child models.
23+
1. Click the checkbox on the node
24+
2. Right-click the node
25+
3. Click to select either parent or child models
2626

2727
<figure markdown>
2828
![Select a node and its parents or children](../assets/images/3-visualized-change/select-node-children.gif){: .shadow}
@@ -88,7 +88,7 @@ Since Recce uses dbt's built-in node selector, it supports most of the selecting
8888

8989
### Use `state` method
9090

91-
In dbt, you need to specify the `--state` option in the CLI. In Recce, we use the base environment as the state, allowing you to use the selector on the fly.
91+
In dbt, you need to specify the `--state` option in the CLI. In Recce we use the base environment as the state, allowing you to use the selector on the fly.
9292

9393

9494
### Removed models
@@ -97,7 +97,7 @@ Another difference is that in dbt, you cannot select removed models. However, in
9797

9898
## Supported Diff
9999

100-
In addition to lineage diff, other types of diff also support node selection. You can find these features in the **...** button in the top right corner. Currently supported diffs include:
100+
In addition to lineage diff, other types of diff also support node selection. You can find these features in the **...** button in the top right corner. Currently supported node-based diffs include:
101101

102102
- Lineage diff
103103
- Row count diff

docs/4-downstream-impacts/impact-radius.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -134,8 +134,8 @@ Two core features power the impact radius analysis:
134134
With the insights from the two features above, Recce determines the impact radius:
135135

136136
1. If a model has a **breaking change**, include all downstream models in the impact radius.
137-
1. If a model has a **non-breaking change**, include only the downstream columns and models of newly added columns.
138-
1. If a model has a **partial breaking change**, include the downstream columns and models of added, removed, or modified columns.
137+
2. If a model has a **non-breaking change**, include only the downstream columns and models of newly added columns.
138+
3. If a model has a **partial breaking change**, include the downstream columns and models of added, removed, or modified columns.
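
The three rules in this hunk can be sketched in Python as a graph traversal. This is only an illustration under stated assumptions, not Recce's implementation: the function names, the `change_type` labels, and the dictionary-based lineage graphs are all hypothetical.

```python
from collections import deque


def downstream(graph, start):
    """Return every node reachable from `start` (excluding `start` itself)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def impact_radius(model, change_type, model_graph, column_graph, changed_columns=()):
    """Return (impacted_models, impacted_columns) for one modified model.

    `change_type` is a hypothetical label: "breaking", "non_breaking",
    or "partial_breaking". For "non_breaking" the caller passes only the
    newly added columns; for "partial_breaking" the added, removed, and
    modified columns.
    """
    if change_type == "breaking":
        # Rule 1: every downstream model is in the radius.
        return downstream(model_graph, model), set()

    # Rules 2 and 3: follow column-level lineage from the given columns only.
    impacted_columns = set()
    for column in changed_columns:
        impacted_columns |= downstream(column_graph, (model, column))
    impacted_models = {m for m, _ in impacted_columns}
    return impacted_models, impacted_columns


# Hypothetical lineage: model -> child models, (model, column) -> child columns.
model_graph = {"orders": ["customers", "revenue"], "customers": ["report"]}
column_graph = {
    ("orders", "amount"): [("revenue", "total")],
    ("revenue", "total"): [("report", "total")],
}

print(impact_radius("orders", "breaking", model_graph, column_graph))
print(impact_radius("orders", "partial_breaking", model_graph, column_graph, ["amount"]))
```

The point of the sketch is the narrowing effect: a breaking change pulls in every downstream model, while the other two cases only follow the lineage of the affected columns.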
139139

140140

141141

docs/5-data-diffing/query.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -53,11 +53,11 @@ In the current version, Recce provides two ways to compare the query result betw
5353

5454
**Query diff occurs in the client side:**
5555

56-
Without providing primary key(s) upfront, AdHoc query compare in the client side. That is, Recce fetches the first 2,000 rows and compare in the client side. The advantage is it has more flexibility to query sql for no PK, especially when column structures differ or no clear primary key exists.
56+
Without primary keys provided upfront, adhoc queries will compare results on the client side. That is, Recce fetches the first 2,000 rows and compare in the client side. The advantage is it has more flexibility to query sql for no PK, especially when column structures differ or no clear primary key exists.
5757
However, the limitation is that we cannot find the mismatched rows in a big query result.
5858

5959
**Query diff occurs in the warehouse:**
6060

61-
With primary key(s) given, it can perform a query diff in the warehouse. It only displays changed, added, or removed rows. Therefore, if only one record is different among a million, that specific record will be visible. Hence, it also reduces the amount of data transferred.
61+
When primary keys are given, it can perform a query diff in the warehouse. It will only display changed, added, or removed rows. Meaning, if only one record is different among a million, that specific record will be visible. Thus reducing the amount of data transferred.
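
To make the keyed comparison described here concrete, the following Python sketch classifies rows as added, removed, or changed once a primary key is known. It is a hedged illustration only: Recce performs this comparison in the warehouse, and the function name, row format (a list of dicts), and return shape are assumptions made for the example.

```python
def keyed_diff(base_rows, current_rows, primary_key):
    """Compare two query results by primary key.

    `primary_key` is a list of column names. Returns only the rows that
    differ, so identical rows never need to be transferred or displayed.
    """
    def index(rows):
        # Map each row's key tuple to the full row.
        return {tuple(row[k] for k in primary_key): row for row in rows}

    base = index(base_rows)
    current = index(current_rows)

    added = [current[k] for k in current if k not in base]
    removed = [base[k] for k in base if k not in current]
    changed = [
        (base[k], current[k])
        for k in base
        if k in current and base[k] != current[k]
    ]
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical sample data for the base and current environments.
base_rows = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]
current_rows = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 25},
    {"id": 4, "amount": 40},
]

print(keyed_diff(base_rows, current_rows, ["id"]))
```

Without a key, rows can only be compared positionally, which is why the client-side mode is limited to a bounded sample of rows.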
6262

6363
Another similar feature is [Value Diff](lineage.md#value-diff). Value diff is based on a chosen model, so you don’t need to write SQL to operate it, though it naturally offers less flexibility. Additionally, value diff can show a summary or actual diff records, whereas query diff only shows the actual diff records.

docs/5-data-diffing/value-diff.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ title: Value Diff
55

66
Value Diff shows the matched count and percentage for each column in the table. It uses the primary key(s) to uniquely identify the records between the model in both environments.
77

8-
The primary key is automatically inferred by the first column with the [unique](https://docs.getdbt.com/reference/resource-properties/data-tests#unique) test. If no primary key is detected at least one column is required to be specified as the primary key.
8+
The primary key (PK) is automatically inferred by the first column with the [unique](https://docs.getdbt.com/reference/resource-properties/data-tests#unique) test. If no primary key is detected at least one column is required to be specified as the primary key.
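
As a rough illustration of a per-column matched count and percentage, the Python sketch below joins two row sets on a primary key and compares each column. It rests on assumptions that the source does not confirm: the function name and row format are hypothetical, and the percentage is computed over rows present in both environments, which may differ from how Recce defines it.

```python
def value_diff(base_rows, current_rows, primary_key, columns):
    """Return {column: (matched_count, matched_percentage)}.

    Assumption: only rows whose primary key exists in both environments
    are compared, and the percentage uses that count as its denominator.
    """
    base = {row[primary_key]: row for row in base_rows}
    current = {row[primary_key]: row for row in current_rows}
    common_keys = base.keys() & current.keys()

    summary = {}
    for column in columns:
        matched = sum(
            1 for key in common_keys if base[key][column] == current[key][column]
        )
        percentage = 100.0 * matched / len(common_keys) if common_keys else 0.0
        summary[column] = (matched, percentage)
    return summary


# Hypothetical sample data for the base and current environments.
base_rows = [
    {"id": 1, "name": "a", "amount": 10},
    {"id": 2, "name": "b", "amount": 20},
]
current_rows = [
    {"id": 1, "name": "a", "amount": 10},
    {"id": 2, "name": "b", "amount": 25},
]

print(value_diff(base_rows, current_rows, "id", ["name", "amount"]))
```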
99

1010
<figure markdown>
1111
![Recce Value Diff](../assets/images/5-data-diffing/value-diff.png)

docs/6-collaboration/checklist.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -34,9 +34,9 @@ An example performing a Top-K diff and adding the results to the Checklist:
3434

3535
The Recce Checklist provides a way to record the results of a data check during change exploration. The purpose of adding Checks to the Checklist is to enable you to:
3636

37-
- Save Checks with notes of your interpretation of the data.
38-
- Re-run checks following further data modeling changes.
39-
- Share Checks as part of PR or stakeholder review.
37+
- Save Checks with notes of your interpretation of the data
38+
- Re-run checks following further data modeling changes
39+
- Share Checks as part of PR or stakeholder review
4040

4141
## Preset Check
4242

docs/7-cicd/best-practices-prep-env.md

Lines changed: 16 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -6,12 +6,12 @@ Recce is designed to compare two environments in your data project. To use it ef
66

77
However, there are many challenges in preparing environments.
88

9-
1. Your **source data** might be continuously updating.
10-
2. Your transformations might be **time-consuming**.
11-
3. The base branch may have **other PRs merged** at any time.
12-
4. The generated environment will leave data in the warehouse, which also needs to be properly managed.
9+
1. Your **source data** might be continuously updating
10+
2. Your transformations might be **time-consuming**
11+
3. The base branch may have **other PRs merged** at any time
12+
4. The generated environment will leave data in the warehouse, which also needs to be properly managed
1313

14-
This article will not focus on how to use Recce but rather on how to effectively prepare environments for Recce use.
14+
This article will not focus on how to use Recce, but rather on how to effectively prepare environments for Recce use.
1515

1616

1717
## Best Practices
@@ -88,9 +88,9 @@ Using the production environment as the base environment is a straightforward ch
8888

8989
This staging environment can have the following characteristics:
9090

91-
1. Ensure that the transformed results reflect the **latest commit** of the base branch.
92-
2. Use the **same source data** as the PR environment.
93-
3. Use the **same transformation logic** as the PR environment.
91+
1. Ensure that the transformed results reflect the **latest commit** of the base branch
92+
2. Use the **same source data** as the PR environment
93+
3. Use the **same transformation logic** as the PR environment
9494

9595
The basic principle is that the staging environment's configuration should be **as close as possible to the PR environments**, except for using a different git commit.
9696

@@ -168,9 +168,9 @@ Recce relies on the base and current environment artifacts to find the correspon
168168
169169
Here are a few methods you can choose:
170170
171-
1. In CI, upload the generated artifact to the cloud storage (e.g., AWS S3).
172-
2. For dbt Cloud users, you can [download artifacts](https://docs.getdbt.com/dbt-cloud/api-v2#/operations/Retrieve%20Run%20Artifact) for the latest run of a given job.
173-
3. For GitHub Actions users, you can use the GitHub CLI (gh) to [download artifacts](https://cli.github.com/manual/gh_run_download) for the latest run of a given workflow.
171+
1. In CI, upload the generated artifact to the cloud storage (e.g., AWS S3)
172+
2. For dbt Cloud users, you can [download artifacts](https://docs.getdbt.com/dbt-cloud/api-v2#/operations/Retrieve%20Run%20Artifact) for the latest run of a given job
173+
3. For GitHub Actions users, you can use the GitHub CLI (gh) to [download artifacts](https://cli.github.com/manual/gh_run_download) for the latest run of a given workflow
174174
175175
If the methods mentioned above are too complex, a stateless approach is to directly check out the base branch and run **`dbt docs generate`** to generate the artifacts.
176176

@@ -205,8 +205,8 @@ dbt run-operation clear_schema --args "{'schema_name': 'pr_123'}"
205205
| PR | `pr_<number>` | On Push | # of opened PR | 1 month, excluding this week |
206206

207207

208-
- Automate environment generation using GitHub Actions.
209-
- PR Environment will only be generated automatically when the PR is up-to-date.
210-
- Artifacts will be stored under the workflow’s artifacts.
211-
- PR environments are removed on PR closed.
212-
- Use staging environment as the base environment for Recce.
208+
- Automate environment generation using GitHub Actions
209+
- PR Environment will only be generated automatically when the PR is up-to-date
210+
- Artifacts will be stored under the workflow’s artifacts
211+
- PR environments are removed on PR closed
212+
- Use staging environment as the base environment for Recce

docs/7-cicd/index.md

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ icon: material/hand-wave-outline
44
---
55

66
## What is Recce Cloud?
7-
Recce Cloud is a data collaboration platform for teams doing data validation, impact analysis, and pull requests reviews. It helps data teams catch issues early, understand downstream impacts, and communicate changes clearly—all in one shared workspace. Instead of working alone in a local dev environment, teams can explore lineage, run custom queries, and validate metrics together, speeding up reviews and building trust across stakeholders.
7+
Recce Cloud is a data collaboration platform for teams doing data validation, impact analysis, and pull requests reviews. It helps data teams catch issues early, understand downstream impacts, and communicate changes clearly in one shared workspace. Instead of working in an isolated local environment, teams can explore lineage, run custom queries, and validate metrics together in a cloud-hosted environment.
88

99
- [Learn more about different plans](https://reccehq.com/pricing)
1010
- Follow the [Getting Started](/get-started/) guide
@@ -14,9 +14,9 @@ Recce Cloud integrates with GitHub to support validation in your PR workflow. Th
1414

1515
### Prerequisite
1616
1. Sign in [Recce cloud](https://cloud.reccehq.com/)
17-
2. Click **Install** button to install Recce Cloud GitHub app to your personal or organization account.
18-
3. Authorize the repositories to the GitHub app.
19-
4. Prepare the GitHub personal access token with the `repo` permission. Please see the [GitHub document](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens). And set it to your environment variable.
17+
2. Click **Install** button to install Recce Cloud GitHub app to your personal or organization account
18+
3. Authorize the repositories to the GitHub app
19+
4. Prepare the GitHub personal access token with the `repo` permission. Please see the [GitHub document](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens). And set it to your environment variable
2020
```
2121
export GITHUB_TOKEN=<token>
2222
```
@@ -36,8 +36,8 @@ Recce Cloud integrates with GitHub to support validation in your PR workflow. Th
3636
git checkout -b <my-awesome-feature>
3737
```
3838
1. Develop your features and prepare the dbt artifacts for the base (`target-base/`) and current (`target/`) environments.
39-
1. Create a pull request for this branch. Recce Cloud requires an open pull request in your GitHub repository. It also stores the latest Recce state for each pull request.
40-
1. Launch the Recce instance in the cloud mode. It will use the dbt artifacts in the local `target` and `target-base` and initiate a new review state if necessary.
39+
2. Create a pull request for this branch. Recce Cloud requires an open pull request in your GitHub repository. It also stores the latest Recce state for each pull request.
40+
3. Launch the Recce instance in the cloud mode. It will use the dbt artifacts in the local `target` and `target-base` and initiate a new review state if necessary.
4141
```
4242
recce server --cloud
4343
```
@@ -51,7 +51,7 @@ Recce Cloud integrates with GitHub to support validation in your PR workflow. Th
5151
If the review state is already available for this PR, you can open the Recce instance to review.
5252

5353
1. Checkout the branch for the reviewed PR.
54-
1. Launch the Recce instance to review this PR
54+
2. Launch the Recce instance to review this PR
5555
```
5656
recce server --review --cloud
5757
```
@@ -105,14 +105,14 @@ recce summary --cloud > summary.md
105105

106106
### Recce cloud
107107

108-
The cloud subcommand in recce provides functionality for managing state files in cloud storage.
108+
The cloud subcommand in Recce provides functionality for managing state files in cloud storage.
109109

110110
#### purge
111111

112112
You can purge the state from your current PR. It is useful when
113113

114114
1. You forgot the password
115-
1. You would like to reset the state of this PR.
115+
1. You would like to reset the state of this PR
116116

117117
```shell
118118
git checkout <pr-branch>
@@ -139,6 +139,6 @@ recce cloud download
139139

140140
## GitHub Pull Request Status Check
141141

142-
Recce Cloud integrate with the [GitHub Pull Request Status Check](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/collaborating-on-repositories-with-code-quality-features/about-status-checks). If there is recce review state synced to a PR, the PR would have a recce cloud check status. Once all checks in recce are approved, the check status would change to passed and ready to be merged.
142+
Recce Cloud integrate with the [GitHub Pull Request Status Check](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/collaborating-on-repositories-with-code-quality-features/about-status-checks). If there is Recce review state synced to a PR, the PR would have a Recce cloud check status. Once all checks in Recce are approved, the check status would change to passed and ready to be merged.
143143

144144
![alt text](../assets/images/recce-cloud/pr-checks-all-approved.png){: .shadow}

docs/7-cicd/preset-checks.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ To configure the preset checks, add the settings to the [recce config file](../8
1313

1414
1. Add a check to your checklist
1515
![alt text](../assets/images/7-cicd/preset-checks-prep.png){: .shadow}
16-
2. Open the menu for the check and select **Get Preset Check Template**.
16+
2. Open the menu for the check and select **Get Preset Check Template**
1717
3. Copy the yaml config from the dialog
1818
![alt text](../assets/images/7-cicd/preset-checks-template.png){: .shadow}
1919

@@ -40,9 +40,9 @@ To configure the preset checks, add the settings to the [recce config file](../8
4040
4141
### Recce Server
4242
43-
1. When a new Recce instance is launched, all preset checks are automatically set up, but these checks are not executed at this time.
43+
1. When a new Recce instance is launched, all preset checks are automatically set up, but these checks are not executed at this time
4444
![alt text](../assets/images/7-cicd/preset-checks.png){: .shadow}
45-
2. When the **Run Query** button is pressed, the check will be executed.
45+
2. When the **Run Query** button is pressed, the check will be executed
4646
4747
### Recce Run
4848
@@ -69,7 +69,7 @@ To configure the preset checks, add the settings to the [recce config file](../8
6969
```shell
7070
recce server recce_state.json
7171
```
72-
3. You can show the summary of the state by the [recce summary](./recce-summary.md) command
72+
3. You can show the summary of the state by the [recce summary](./recce-summary.md) command.
7373
```shell
7474
recce summary recce_state.json
7575
```

docs/7-cicd/recce-summary.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
title: Summary
33
---
44

5-
Recce `Summary` command is used to generate a summary based on the input state file. In the previous section, the `Run` command was used to generate a state file based on the two environments. It provides a way to integrate Recce into your CI/CD pipeline. The `Summary` command is used to generate a summary based on the output of `Run` command. You can also integrate the `Summary` command into your CI/CD pipeline to generate a summary based on the state file generated by the `Run` command. Therefor, the generated summary can be posted to your repository hosting platform, such as GitHub, GitLab, or Bitbucket.
5+
Recce `Summary` command is used to generate a summary based on the input state file. In the previous section, the `Run` command was used to generate a state file based on the two environments. It provides a way to integrate Recce into your CI/CD pipeline. The `Summary` command is used to generate a summary based on the output of `Run` command. You can also integrate the `Summary` command into your CI/CD pipeline to generate a summary based on the state file generated by the `Run` command. Thus allowing the generated summary can be posted to your repository hosting platform, such as GitHub, GitLab, or Bitbucket.
66

77
## Usage
88

@@ -18,7 +18,7 @@ recce summary recce-state.json
1818

1919
## Output
2020

21-
The output of the `summary` command will be Markdown format. The markdown output will contain the following sections:
21+
The output of the `summary` command will be in markdown format. The markdown output will contain the following sections:
2222

2323
- Lineage Graph - A graph that shows the lineage of the models that are impacted by the modified models.
2424
- Checks Summary - A summary of the checks that are detected mismatch between `base` and `current` environments.
