Commit 8be29be

Merge branch 'main' into chore/release-action
2 parents 90d6df2 + b39df5b commit 8be29be

141 files changed

+110817
-288
lines changed

.bumpversion.cfg

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
[bumpversion]
2-
current_version = 0.0.20
2+
current_version = 0.1.0
33
commit = True
44
tag = True
55

.github/meta/.keep

Whitespace-only changes.

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,7 @@ pyvenv*/
3333
sdist
3434
var
3535
venv*/
36+
.venv/
3637
wheelhouse
3738

3839
# Installer logs
@@ -62,6 +63,7 @@ nosetests.xml
6263
.project
6364
.pydevproject
6465
.vscode
66+
.github/meta/
6567

6668
# Complexity
6769
output/*.html

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
22
# pre-commit install --install-hooks
33
# To update the versions:
44
# pre-commit autoupdate
5-
exclude: '^(\.tox|ci/templates|\.bumpversion\.cfg)(/|$)'
5+
exclude: '^(\.tox|ci/templates|\.bumpversion\.cfg|src/vendor)(/|$)'
66
# Note the order is intentional to avoid multiple passes of the hooks
77
repos:
88
- repo: https://github.com/astral-sh/ruff-pre-commit

.pre-commit-hooks.yaml

Lines changed: 10 additions & 1 deletion
@@ -1,7 +1,16 @@
11
- id: datapilot_run_dbt_checks
22
name: datapilot run dbt checks
3-
description: datapilot run dbt checks
3+
description: Run DataPilot dbt project health checks on changed files
44
entry: datapilot_run_dbt_checks
55
language: python
66
types_or: [yaml, sql]
77
require_serial: true
8+
# Optional arguments that can be passed to the hook:
9+
# --config-path: Path to configuration file
10+
# --token: API token for authentication
11+
# --instance-name: Tenant/instance name
12+
# --backend-url: Backend URL (defaults to https://api.myaltimate.com)
13+
# --config-name: Name of config to use from API
14+
# --manifest-path: Path to DBT manifest file (defaults to ./target/manifest.json)
15+
# --catalog-path: Path to DBT catalog file (defaults to ./target/catalog.json)
16+
# --base-path: Base path of the dbt project (defaults to current directory)
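
The option comments above give the artifact defaults relative to the current directory, while docs/hooks.rst in this same commit describes them as relative to the base path. A minimal sketch of how such defaults could be resolved, assuming the base-path-relative behaviour; `resolve_artifact_paths` is a hypothetical helper for illustration, not a function shown in this diff:

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_artifact_paths(
    base_path: str = ".",
    manifest_path: Optional[str] = None,
    catalog_path: Optional[str] = None,
) -> Tuple[Path, Path]:
    """Hypothetical helper: use <base_path>/target/*.json when a path is not given."""
    target = Path(base_path) / "target"
    # An explicitly passed path always wins over the derived default.
    manifest = Path(manifest_path) if manifest_path else target / "manifest.json"
    catalog = Path(catalog_path) if catalog_path else target / "catalog.json"
    return manifest, catalog
```

With no arguments this yields `target/manifest.json` and `target/catalog.json`, which matches both descriptions when the hook runs from the project root.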

README.md

Lines changed: 34 additions & 0 deletions
@@ -42,6 +42,40 @@ The [--config-path] is an optional argument. You can provide a yaml file with ov
4242

4343
Note: `dbt docs generate` requires an active database connection and may take a long time for projects with a large number of models.
4444

45+
### Pre-commit Hook Integration
46+
47+
DataPilot provides a pre-commit hook that automatically runs health checks on changed files before each commit, so issues are caught early in the development process.
48+
49+
#### Quick Setup
50+
51+
1. Install pre-commit:
52+
```bash
53+
pip install pre-commit
54+
```
55+
56+
2. Add to your `.pre-commit-config.yaml`:
57+
```yaml
58+
repos:
59+
- repo: https://github.com/AltimateAI/datapilot-cli
60+
rev: v0.0.27 # Always use a specific version tag
61+
hooks:
62+
- id: datapilot_run_dbt_checks
63+
args: [
64+
"--config-path", "./datapilot-config.yaml",
65+
"--token", "${DATAPILOT_TOKEN}",
66+
"--instance-name", "${DATAPILOT_INSTANCE}",
67+
"--manifest-path", "./target/manifest.json",
68+
"--catalog-path", "./target/catalog.json"
69+
]
70+
```
71+
72+
3. Install the hook:
73+
```bash
74+
pre-commit install
75+
```
76+
77+
For detailed setup instructions, see the [Pre-commit Hook Setup Guide](docs/pre-commit-setup.md).
78+
4579
### Checks
4680

4781
The following checks are available:

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
1515
year = "2024"
1616
author = "Altimate Inc."
1717
copyright = f"{year}, {author}"
18-
version = release = "0.0.20"
18+
version = release = "0.1.0"
1919

2020
pygments_style = "trac"
2121
templates_path = ["."]

docs/hooks.rst

Lines changed: 150 additions & 16 deletions
@@ -11,37 +11,171 @@ To use the DataPilot pre-commit hook, follow these steps:
1111

1212
1. Install the `pre-commit` package if you haven't already:
1313

14-
```
15-
pip install pre-commit
16-
```
14+
.. code-block:: shell
15+
16+
pip install pre-commit
1717
1818
2. Add the following configuration to your .pre-commit-config.yaml file in the root of your repository:
1919

20-
```
20+
.. code-block:: yaml
21+
2122
repos:
22-
- repo: https://github.com/AltimateAI/datapilot-cli
23-
rev: <revision>
24-
hooks:
25-
- id: datapilot_run_dbt_checks
26-
args: ["--config-path", "path/to/your/config/file"]
27-
```
23+
- repo: https://github.com/AltimateAI/datapilot-cli
24+
rev: v0.0.27 # Use a specific version tag, not 'main'
25+
hooks:
26+
- id: datapilot_run_dbt_checks
27+
args: [
28+
"--config-path", "./datapilot-config.yaml",
29+
"--token", "${DATAPILOT_TOKEN}",
30+
"--instance-name", "${DATAPILOT_INSTANCE}"
31+
]
32+
33+
Configuration Options
34+
---------------------
35+
36+
The DataPilot pre-commit hook supports several configuration options:
37+
38+
**Required Configuration:**
39+
40+
- ``rev``: Always use a specific version tag (e.g., ``v0.0.27``) instead of ``main`` for production stability
41+
42+
**Optional Arguments:**
43+
44+
- ``--config-path``: Path to your DataPilot configuration file
45+
- ``--token``: Your API token for authentication (can use environment variables)
46+
- ``--instance-name``: Your tenant/instance name (can use environment variables)
47+
- ``--backend-url``: Backend URL (defaults to https://api.myaltimate.com)
48+
- ``--config-name``: Name of config to use from API
49+
- ``--base-path``: Base path of the dbt project (defaults to current directory)
50+
- ``--manifest-path``: Path to the DBT manifest file (defaults to {base_path}/target/manifest.json)
51+
- ``--catalog-path``: Path to the DBT catalog file (defaults to {base_path}/target/catalog.json)
2852

29-
Replace <revision> with the desired revision of the DataPilot repository and "path/to/your/config/file" with the path to your configuration file.
53+
**Environment Variables:**
54+
55+
You can use environment variables for sensitive information:
56+
57+
.. code-block:: yaml
58+
59+
repos:
60+
- repo: https://github.com/AltimateAI/datapilot-cli
61+
rev: v0.0.27
62+
hooks:
63+
- id: datapilot_run_dbt_checks
64+
args: [
65+
"--config-path", "./datapilot-config.yaml",
66+
"--token", "${DATAPILOT_TOKEN}",
67+
"--instance-name", "${DATAPILOT_INSTANCE}",
68+
"--manifest-path", "./target/manifest.json",
69+
"--catalog-path", "./target/catalog.json"
70+
]
71+
72+
**Configuration File Example:**
73+
74+
Create a ``datapilot-config.yaml`` file in your project root:
75+
76+
.. code-block:: yaml
77+
78+
# DataPilot Configuration
79+
disabled_insights:
80+
- "hard_coded_references"
81+
- "duplicate_sources"
82+
83+
# Custom settings for your project
84+
project_settings:
85+
max_fanout: 10
86+
require_tests: true
3087
3188
3. Install the pre-commit hook:
3289

33-
```
34-
pre-commit install
35-
```
90+
.. code-block:: shell
91+
92+
pre-commit install
3693
3794
Usage
3895
-----
3996

40-
Once the hook is installed, it will run automatically before each commit. If any issues are detected, the commit will be aborted, and you will be prompted to fix the issues before retrying the commit.
97+
Once the hook is installed, it will run automatically before each commit. The hook will:
98+
99+
1. **Validate Configuration**: Check that your config file exists and is valid
100+
2. **Authenticate**: Use your provided token and instance name to authenticate
101+
3. **Load DBT Artifacts**: Load manifest and catalog files for analysis
102+
4. **Analyze Changes**: Only analyze files that have changed in the commit
103+
5. **Report Issues**: Display any issues found and prevent the commit if problems are detected
104+
105+
**Required DBT Artifacts:**
106+
107+
The pre-commit hook requires DBT manifest and catalog files to function properly:
41108

109+
- **Manifest File**: Generated by running ``dbt compile`` or ``dbt run``. Default location: ``./target/manifest.json``
110+
- **Catalog File**: Generated by running ``dbt docs generate``. Default location: ``./target/catalog.json``
42111

43-
If you want to manually run all pre-commit hooks on a repository, run `pre-commit run --all-files`. To run individual hooks use `pre-commit run <hook_id>`.
112+
**Note**: The catalog file is optional but recommended for comprehensive analysis. If not available, the hook will continue without catalog information.
44113

114+
**Manual Execution:**
115+
116+
To manually run all pre-commit hooks on a repository:
117+
118+
.. code-block:: shell
119+
120+
pre-commit run --all-files
121+
122+
To run individual hooks:
123+
124+
.. code-block:: shell
125+
126+
pre-commit run datapilot_run_dbt_checks
127+
128+
**Troubleshooting:**
129+
130+
- **Authentication Issues**: Ensure your token and instance name are correctly set
131+
- **Empty Config Files**: The hook will fail if your config file is empty or invalid
132+
- **Missing Manifest File**: Ensure you have run ``dbt compile`` or ``dbt run`` to generate the ``manifest.json`` file
133+
- **Missing Catalog File**: Run ``dbt docs generate`` to create the ``catalog.json`` file (optional but recommended)
134+
- **No Changes**: If no relevant files have changed, the hook will skip execution
135+
- **Network Issues**: Ensure you have access to the DataPilot API
136+
137+
Best Practices
138+
--------------
139+
140+
1. **Use Version Tags**: Always specify a version tag in the ``rev`` field, never use ``main``
141+
2. **Environment Variables**: Use environment variables for sensitive information like tokens
142+
3. **Configuration Files**: Create a dedicated config file for your project settings
143+
4. **Regular Updates**: Update to new versions when they become available
144+
5. **Team Coordination**: Ensure all team members use the same configuration
145+
146+
Example Complete Setup
147+
----------------------
148+
149+
Here's a complete example of a ``.pre-commit-config.yaml`` file:
150+
151+
.. code-block:: yaml
152+
153+
# .pre-commit-config.yaml
154+
exclude: '^(\.tox|ci/templates|\.bumpversion\.cfg)(/|$)'
155+
156+
repos:
157+
- repo: https://github.com/astral-sh/ruff-pre-commit
158+
rev: v0.1.14
159+
hooks:
160+
- id: ruff
161+
args: [--fix, --exit-non-zero-on-fix, --show-fixes]
162+
163+
- repo: https://github.com/psf/black
164+
rev: 23.12.1
165+
hooks:
166+
- id: black
167+
168+
- repo: https://github.com/AltimateAI/datapilot-cli
169+
rev: v0.0.27
170+
hooks:
171+
- id: datapilot_run_dbt_checks
172+
args: [
173+
"--config-path", "./datapilot-config.yaml",
174+
"--token", "${DATAPILOT_TOKEN}",
175+
"--instance-name", "${DATAPILOT_INSTANCE}",
176+
"--manifest-path", "./target/manifest.json",
177+
"--catalog-path", "./target/catalog.json"
178+
]
45179
46180
Feedback and Contributions
47181
--------------------------
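
The hook examples in this file pass `"${DATAPILOT_TOKEN}"` and `"${DATAPILOT_INSTANCE}"` in `args`. As far as pre-commit's documented behaviour goes, `args` are handed to the entry point as literal strings without shell expansion, so these placeholders only work if the hook expands them itself; this diff does not show whether it does. A minimal sketch of such an expansion step, assuming `${NAME}` syntax and a hypothetical `expand_env_args` helper:

```python
import os
from string import Template
from typing import List, Mapping, Optional


def expand_env_args(args: List[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Hypothetical helper: replace ${NAME} placeholders, failing on unset names."""
    env = os.environ if env is None else env
    expanded = []
    for arg in args:
        try:
            # Template.substitute raises KeyError when a placeholder has no value.
            expanded.append(Template(arg).substitute(env))
        except KeyError as exc:
            raise ValueError(f"environment variable {exc.args[0]} is not set") from exc
    return expanded
```

Failing loudly on an unset variable avoids sending the literal text `${DATAPILOT_TOKEN}` to the API as if it were a token.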

docs/performance.rst

Lines changed: 21 additions & 1 deletion
@@ -1,9 +1,29 @@
11
===============================
2-
Performance of Pre-commit Hook
2+
Pre-commit Hook
33
===============================
44

55
Overview
66
--------
7+
You can use the pre-commit hook to run DataPilot checks on your dbt project. Do this by following the instructions below:
8+
9+
1. Install pre-commit (https://pre-commit.com/#install)
10+
2. Create a ``.pre-commit-config.yaml`` file in the root of the project
11+
3. Add the configuration shown below to that file
12+
4. Run ``pre-commit run --all-files`` to test it out
13+
14+
.. code-block:: yaml
15+
16+
repos:
17+
- repo: https://github.com/AltimateAI/datapilot-cli
18+
rev: v0.0.27
19+
hooks:
20+
- id: datapilot_run_dbt_checks
21+
args: [--config-name, "[The Config Name in the SaaS UI]", --token, "[YOUR API KEY]", --instance-name, "[Your tenant name]"]
22+
23+
24+
Performance
25+
-----------
26+
727
The primary objective is to ensure the pre-commit hook operates swiftly and efficiently, preventing any delay in the development workflow. To achieve this, various optimizations have been applied, focusing on minimizing the time and resources required during execution.
828

929
Optimizations

setup.py

Lines changed: 2 additions & 2 deletions
@@ -13,7 +13,7 @@ def read(*names, **kwargs):
1313

1414
setup(
1515
name="altimate-datapilot-cli",
16-
version="0.0.20",
16+
version="0.1.0",
1717
license="MIT",
1818
description="Assistant for Data Teams",
1919
long_description="{}\n{}".format(
@@ -63,7 +63,7 @@ def read(*names, **kwargs):
6363
python_requires=">=3.8",
6464
install_requires=[
6565
"click~=8.1.7",
66-
"dbt-artifacts-parser~=0.8.1",
66+
"pydantic >=2.0,<3.0",
6767
"ruamel.yaml~=0.18.6",
6868
"tabulate~=0.9.0",
6969
"requests>=2.31",
