@ryanseq-gyg ryanseq-gyg commented Nov 10, 2025

Description

🎯 Overview

This PR restructures the dataframe-expectations codebase to improve maintainability and clarity and to follow better Python package organisation patterns. It is a breaking change at the import level: the module structure is reorganised, so the import path for DataFrameExpectationsSuite changes, but the suite's user-facing methods are unchanged.

‼️ Migration guidelines

Most of the restructuring affects internal files. For users, the only migration required is to update the DataFrameExpectationsSuite import.

Before:

from dataframe_expectations.expectations_suite import DataFrameExpectationsSuite

suite = DataFrameExpectationsSuite()
suite.expect_value_greater_than(column_name="age", value=18)

After:

from dataframe_expectations.suite import DataFrameExpectationsSuite

suite = DataFrameExpectationsSuite()
suite.expect_value_greater_than(column_name="age", value=18)  # API unchanged
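
For code that has to run against both the old and the new layout during a transition, a fallback import is one option. The following is a minimal sketch, not part of this PR: the helper name `load_suite_class` is hypothetical, and only the two module paths are taken from the migration notes above.

```python
import importlib

# Candidate module paths, newest first. Both paths come from the migration
# notes; the helper below is a hypothetical convenience, not library API.
_SUITE_MODULES = (
    "dataframe_expectations.suite",               # layout after this PR
    "dataframe_expectations.expectations_suite",  # layout before this PR
)


def load_suite_class(module_names=_SUITE_MODULES):
    """Return DataFrameExpectationsSuite from the first importable module."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ModuleNotFoundError:
            # This layout is not installed; try the next candidate.
            continue
        return getattr(module, "DataFrameExpectationsSuite")
    raise ImportError(
        "DataFrameExpectationsSuite not found in: " + ", ".join(module_names)
    )
```

Calling `load_suite_class()` once at import time and binding the result to a local name keeps the rest of the calling code independent of which version is installed. Once every environment is on the new layout, the plain `from dataframe_expectations.suite import DataFrameExpectationsSuite` is simpler.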

Module Reorganisation

Before:

dataframe_expectations/
├── expectations/
│   ├── expectation.py
│   ├── column_expectation.py
│   ├── aggregation_expectation.py
│   ├── expectation_registry.py
│   ├── column_expectations/
│   └── aggregation_expectations/
└── expectations_suite.py

After:

dataframe_expectations/
├── core/                           # Framework internals
│   ├── expectation.py
│   ├── column_expectation.py
│   ├── aggregation_expectation.py
│   ├── types.py
│   └── utils.py
├── expectations/                   # Expectation implementations
│   ├── column/                    # Column expectations
│   │   ├── any_value.py
│   │   ├── numerical.py
│   │   └── string.py
│   └── aggregation/               # Aggregation expectations
│       ├── any_value.py
│       ├── numerical.py
│       └── unique.py
├── registry.py                    # Central registry (top-level)
└── suite.py                       # Suite class (top-level)

Moving forward, we follow these guidelines:

  • core/ - Framework internals (base classes, types, utilities)
  • expectations/ - Expectation implementations organized by type
  • Top-level modules (registry.py, suite.py) - Public APIs

Registry refactoring

In addition to reorganising the code structure, this PR refactors the registry: its 3 separate dictionaries are consolidated into 2 optimised structures.
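
The description does not say what the two structures hold, so the following is only an illustrative sketch of a two-dictionary registry: one primary store plus one reverse index. Every name here (`ExpectationRegistry`, `register`, `create`, `create_from_method`) is hypothetical and is not taken from registry.py.

```python
from typing import Any, Callable, Dict, Tuple


class ExpectationRegistry:
    """Toy two-dictionary registry; structure and names are assumptions."""

    def __init__(self) -> None:
        # Primary store: expectation name -> (factory, suite method name).
        self._by_name: Dict[str, Tuple[Callable[..., Any], str]] = {}
        # Reverse index: suite method name -> expectation name.
        self._by_method: Dict[str, str] = {}

    def register(self, name: str, suite_method: str) -> Callable:
        """Decorator that records a factory under both keys."""
        def decorator(factory: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._by_name or suite_method in self._by_method:
                raise ValueError(f"duplicate registration: {name!r}")
            self._by_name[name] = (factory, suite_method)
            self._by_method[suite_method] = name
            return factory
        return decorator

    def create(self, name: str, **kwargs: Any) -> Any:
        """Build an expectation by its registered name."""
        factory, _ = self._by_name[name]
        return factory(**kwargs)

    def create_from_method(self, suite_method: str, **kwargs: Any) -> Any:
        """Build an expectation from the suite method that exposes it."""
        return self.create(self._by_method[suite_method], **kwargs)
```

Keeping the metadata in one primary dictionary and using the second purely as an index means each registration is written in one place, which is one plausible reason to prefer two structures over three.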

Version Management

The package version is now read from the installed package metadata using importlib.metadata:

dataframe_expectations.__version__ # now returns package version

Checklist

  • Tests have been added in the prescribed format
  • Commit messages follow Conventional Commits format
  • Pre-commit hooks pass locally

@ryanseq-gyg ryanseq-gyg marked this pull request as ready for review November 10, 2025 15:15
@ryanseq-gyg ryanseq-gyg requested a review from a team as a code owner November 10, 2025 15:15
@ryanseq-gyg ryanseq-gyg requested a review from Copilot November 10, 2025 15:15
Copilot AI left a comment

Pull Request Overview

This PR performs a major refactoring of the module structure, bumping the version from 0.2.0 to 0.3.0. The refactoring reorganizes the codebase by introducing a core/ module for base classes and moving registry functionality to the top level, improving the overall architecture and import paths.

Key changes:

  • Introduced dataframe_expectations/core/ module containing base classes (expectation.py, column_expectation.py, aggregation_expectation.py, types.py, utils.py)
  • Moved registry from expectations.expectation_registry to top-level registry.py with optimized lookup structure
  • Renamed expectations_suite.py to suite.py for simpler naming
  • Updated all import paths across tests, documentation, and scripts to reflect the new structure

Reviewed Changes

Copilot reviewed 66 out of 70 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| uv.lock | Version bump from 0.2.0 to 0.3.0 |
| dataframe_expectations/suite.py | Moved from expectations_suite.py, updated imports and simplified dynamic method creation |
| dataframe_expectations/suite.pyi | Auto-generated stub file with type hints for IDE support |
| dataframe_expectations/registry.py | Refactored registry with optimized dual-dictionary lookup structure |
| dataframe_expectations/core/*.py | New core module with base classes and types extracted from old structure |
| dataframe_expectations/expectations/*.py | Updated imports to use new core module paths |
| tests/*.py | Updated all test imports to reflect new module structure |
| docs/*.rst | Updated documentation examples and API references with new import paths |
| scripts/*.py | Updated scripts to use new import paths |
| README.md | Updated example code with new import paths |

@ryanseq-gyg ryanseq-gyg merged commit 111bca1 into main Nov 10, 2025
13 checks passed
@ryanseq-gyg ryanseq-gyg deleted the refactor/restructure-codebase branch November 10, 2025 15:37