Skip to content

Conversation

@gaurpulkit
Copy link
Contributor

@gaurpulkit gaurpulkit commented Aug 21, 2025

Important

Enhances DataPilot pre-commit hook with new configuration options, improved execution logic, and updated documentation.

  • Pre-commit Hook Enhancements:
    • Updated .pre-commit-hooks.yaml to include optional arguments for configuration, authentication, and file paths.
    • Enhanced executor_hook.py to validate and load configurations from file or API, handle authentication, and process changed files.
    • Improved logging and error handling in executor_hook.py.
  • Documentation Updates:
    • Added detailed setup instructions for pre-commit hook in README.md and docs/hooks.rst.
    • Included examples of configuration files and environment variable usage.
  • Code Refactoring:
    • Refactored executor_hook.py to modularize functions for argument parsing, configuration loading, and report processing.

This description was created by Ellipsis for 38fadc3. You can customize this summary. It will automatically update as commits are pushed.

…ructions

- Updated the pre-commit hook description for clarity.
- Added validation for the configuration file in the executor hook to ensure it exists and is not empty.
- Expanded README and documentation to include detailed setup instructions for the pre-commit hook, including configuration options and best practices.
- Added functionality to load configuration from both file and API, with detailed error handling.
- Introduced logging statements to provide feedback during the execution of the pre-commit hook.
- Improved handling of changed files and insights generation process, ensuring better visibility into operations.
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 2b985af in 1 minute and 1 seconds. Click for details.
  • Reviewed 552 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .pre-commit-hooks.yaml:1
  • Draft comment:
    Clear description update and added optional args improve hook usability.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. README.md:45
  • Draft comment:
    The new Pre-commit Hook Integration section is comprehensive and well-structured.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
3. docs/hooks.rst:20
  • Draft comment:
    The enhanced hooks documentation provides clear configuration options and examples.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
4. src/datapilot/core/platforms/dbt/hooks/executor_hook.py:42
  • Draft comment:
    If only a single config file is expected, consider using nargs='?' instead of '*' to avoid unintended multiple values.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. src/datapilot/core/platforms/dbt/hooks/executor_hook.py:138
  • Draft comment:
    Authentication parameters (token, instance_name, backend_url) are reassigned here; consider reusing earlier parsed values to reduce redundancy.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
6. src/datapilot/core/platforms/dbt/hooks/executor_hook.py:204
  • Draft comment:
    The assignment of 'changed_files' from args[1] assumes the pre-commit hook passes file names as extra arguments; ensure this behavior is consistent with all use cases.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
7. src/datapilot/core/platforms/dbt/hooks/executor_hook.py:218
  • Draft comment:
    Ensure that DBTInsightGenerator's parameter 'selected_models' (instead of previous 'selected_model_ids') is supported and updated across tests.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
8. src/datapilot/core/platforms/dbt/hooks/executor_hook.py:265
  • Draft comment:
    Consider using Python's logging module instead of multiple print statements for improved maintainability and log level control.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_1zE09eufoWxvTPsO

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

…ocumentation

- Updated the pre-commit hook version to v0.0.27 in configuration files.
- Added new optional arguments for manifest and catalog file paths in the README and documentation.
- Clarified requirements for DBT artifacts in the documentation to improve user guidance.
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed c31fb1d in 1 minute and 17 seconds. Click for details.
  • Reviewed 130 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .pre-commit-hooks.yaml:14
  • Draft comment:
    Document new optional arguments for DBT manifest, catalog, and base path. Ensure the hook implementation uses these defaults consistently.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to document new optional arguments and ensure consistent usage in the hook implementation. This is a request for documentation and consistency, which is not allowed by the rules.
2. README.md:57
  • Draft comment:
    Updated version tag to v0.0.27 and extended the hook configuration with --manifest-path and --catalog-path. Verify consistency with the hook's functionality.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the author to verify consistency with the hook's functionality, which is against the rules. It doesn't provide a specific suggestion or point out a specific issue with the code.
3. docs/hooks.rst:24
  • Draft comment:
    Updated the revision tag in configuration examples to v0.0.27 for production stability.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, as it only states that a revision tag was updated for production stability. It does not provide any actionable feedback or suggestions for improvement.
4. docs/hooks.rst:47
  • Draft comment:
    Added documentation for the new hook parameters --manifest-path and --catalog-path in the configuration options.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, as it only states that documentation was added for new parameters. It doesn't provide any actionable feedback or suggestions for improvement.
5. docs/hooks.rst:98
  • Draft comment:
    Expanded the manual execution steps to include loading DBT artifacts and added troubleshooting for missing manifest and catalog files.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment seems to be purely informative, describing what was done in the PR without providing any actionable feedback or suggestions. It doesn't ask for clarification or suggest improvements.
6. docs/hooks.rst:168
  • Draft comment:
    Ensure the final pre-commit configuration example includes the updated version tag and the new DBT artifact hook arguments.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is asking the PR author to ensure that the pre-commit configuration example is updated with the new version tag and DBT artifact hook arguments. This falls under the category of asking the author to ensure something is done, which is not allowed according to the rules.

Workflow ID: wflow_MP7NHSYPyWqyCYHZ

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Comment on lines 101 to 103
config_name = args[0].config_name
token = args[0].token
instance_name = args[0].instance_name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done multiple times, move it to the top

Comment on lines 112 to 123
matching_configs = [c for c in configs["items"] if c["name"] == config_name]
if matching_configs:
# Get the config directly from the API response
print(f"Using config from API: {config_name} Config ID: {matching_configs[0]['id']}", file=sys.stderr)
config = matching_configs[0].get("config", {})
else:
print(f"No config found with name: {config_name}", file=sys.stderr)
print("Pre-commit hook failed: Config not found.", file=sys.stderr)
sys.exit(1)
else:
print("Failed to fetch configs from API", file=sys.stderr)
print("Pre-commit hook failed: Unable to fetch configs.", file=sys.stderr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it 1 simple function config = get_config(name, matching_configs)


Will become readable

Comment on lines 190 to 195
if hasattr(catalog, "nodes"):
print(f"Catalog loaded successfully with {len(catalog.nodes)} nodes", file=sys.stderr)
elif hasattr(catalog, "get") and callable(catalog.get):
print(f"Catalog loaded successfully with {len(catalog.get('nodes', {}))} nodes", file=sys.stderr)
else:
print(f"Catalog loaded successfully, object type: {type(catalog).__name__}", file=sys.stderr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Load_catalog should just give Catalog object right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for logging to give the users visibility when the pre commit hook is running

Copy link
Collaborator

@suryaiyer95 suryaiyer95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks okay, but code is not readable. Break into smaller functions.

There seems to be a lot of redundant checks as well. Not sure if those are required

- Introduced functions to load configuration from both file and API, enhancing error handling and user feedback.
- Added new command-line arguments for better flexibility in specifying paths and configurations.
- Streamlined the process of loading manifest and catalog files, with improved logging for insights generation.
- Enhanced the handling of changed files for selective model testing, ensuring clarity in output and error messages.
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed 3c891cb in 1 minute and 31 seconds. Click for details.
  • Reviewed 466 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_aYS0BDkI9BO8NifI

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

# print(f"Changed files: {changed_files}", file=sys.__stdout__)
selected_models, manifest, catalog = generate_partial_manifest_catalog(changed_files, base_path=base_path)
# print("se1ected models", selected_models, file=sys.__stdout__)
def extract_arguments(args) -> Tuple[str, str, str, str, str, str, str]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type annotation mismatch: 'extract_arguments' returns 8 values but its annotation indicates a 7-tuple. Please update the return type to reflect all returned items.

Suggested change
def extract_arguments(args) -> Tuple[str, str, str, str, str, str, str]:
def extract_arguments(args) -> Tuple[str, str, str, str, str, str, str, str]:

…return value

- Modified the `extract_arguments` function to return an extra string, enhancing the argument handling capabilities.
- This change supports future extensions and improves flexibility in argument parsing.
@gaurpulkit gaurpulkit merged commit 780052f into main Aug 21, 2025
57 checks passed
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed 38fadc3 in 1 minute and 21 seconds. Click for details.
  • Reviewed 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_ThGWcEBEVLBvqq6t

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

selected_models, manifest, catalog = generate_partial_manifest_catalog(changed_files, base_path=base_path)
# print("se1ected models", selected_models, file=sys.__stdout__)
def extract_arguments(args) -> Tuple[str, str, str, str, str, str, str, str]:
"""Extract and return common arguments from parsed args."""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated return type now includes 8 strings (catalog_path added). Update the docstring to list all returned values.

Suggested change
"""Extract and return common arguments from parsed args."""
"""Extract and return config_name, token, instance_name, backend_url, config_path, base_path, manifest_path, and catalog_path from parsed args."""

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants