
Conversation

@ryanseq-gyg (Contributor) commented Nov 21, 2025

Description

1. Tagging system

This PR introduces a tagging system for filtering expectations, enabling selective execution based on priority, environment, or custom categories. Tags use the "key:value" format and can be matched with OR (TagMatchMode.ANY) or AND (TagMatchMode.ALL) semantics. The implementation converts the core models to Pydantic for type safety, adds the TagMatchMode enum so match modes are validated by type checkers instead of being passed as raw strings, and replaces unsafe assert statements (which are stripped under python -O) with production-safe type guards.
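To illustrate the assertion-to-type-guard change, here is a minimal sketch; the `ensure_str_list` helper is hypothetical and not the library's actual code, it only shows the pattern:

```python
def ensure_str_list(tags: object) -> list[str]:
    """Production-safe alternative to `assert isinstance(tags, list)`.

    Unlike an assert, this check survives `python -O` (which strips
    assert statements) and raises an actionable error message.
    """
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise TypeError(f"tags must be a list of strings, got {type(tags).__name__}")
    return tags
```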

Examples

Tag expectations:

suite.expect_value_greater_than(column_name="age", value=18, tags=["priority:high", "env:prod"])
suite.expect_value_not_null(column_name="email", tags=["priority:medium"])

Filter with OR logic (ANY tag matches):

from dataframe_expectations import TagMatchMode

runner = suite.build(tags=["priority:high", "priority:medium"], tag_match_mode=TagMatchMode.ANY)

Filter with AND logic (ALL tags match):

runner = suite.build(tags=["priority:high", "env:prod"], tag_match_mode=TagMatchMode.ALL)
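The ANY/ALL semantics above reduce to plain set logic. The following is an illustrative reimplementation, not the library's internals; only the `TagMatchMode` name comes from this PR, and the `matches` helper is hypothetical:

```python
from enum import Enum


class TagMatchMode(str, Enum):
    ANY = "any"  # OR: at least one filter tag is present on the expectation
    ALL = "all"  # AND: every filter tag is present on the expectation


def matches(expectation_tags: list[str], filter_tags: list[str],
            mode: TagMatchMode) -> bool:
    """Return True if an expectation's tags satisfy the filter."""
    tags = set(expectation_tags)
    if mode is TagMatchMode.ANY:
        return any(t in tags for t in filter_tags)
    return all(t in tags for t in filter_tags)
```

For example, an expectation tagged ["priority:high", "env:prod"] passes the ANY filter ["priority:high", "priority:medium"] but an expectation tagged only ["priority:high"] fails the ALL filter ["priority:high", "env:prod"].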

2. Suite Execution Results

The runner.run() method now supports a raise_on_failure parameter (default True) to control error handling behavior. When set to False, it always returns a SuiteExecutionResult object (whether tests pass or fail) instead of raising exceptions, enabling programmatic inspection of validation outcomes. The SuiteExecutionResult is a Pydantic model that captures comprehensive execution metadata including applied tag filters, match mode, timing information, pass/fail counts, individual expectation results with violation samples, and dataframe metadata (type, row count, caching status).

Example:

# Get results without raising exceptions
result = runner.run(df, raise_on_failure=False)

# Inspect the results programmatically
print(f"Total expectations: {result.total_expectations}")
print(f"Passed: {result.total_passed}, Failed: {result.total_failed}")
print(f"Pass rate: {result.pass_rate:.2%}")
print(f"Applied filters: {result.applied_filters}")
print(f"Tag match mode: {result.tag_match_mode}")

# Access individual expectation results
for exp_result in result.results:
    if exp_result.status == "failed":
        print(f"Failed: {exp_result.description}")
        print(f"Violation count: {exp_result.violation_count}")
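The counting fields used above can be sketched as derived properties. This sketch uses plain dataclasses for self-containment (the real SuiteExecutionResult is a Pydantic model, and only the field names are taken from the example above):

```python
from dataclasses import dataclass, field


@dataclass
class ExpectationResult:
    description: str
    status: str            # "passed" or "failed"
    violation_count: int = 0


@dataclass
class SuiteResultSketch:
    results: list[ExpectationResult] = field(default_factory=list)

    @property
    def total_expectations(self) -> int:
        return len(self.results)

    @property
    def total_passed(self) -> int:
        return sum(1 for r in self.results if r.status == "passed")

    @property
    def total_failed(self) -> int:
        return sum(1 for r in self.results if r.status == "failed")

    @property
    def pass_rate(self) -> float:
        # Guard against division by zero for an empty suite.
        return self.total_passed / self.total_expectations if self.results else 0.0
```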

Checklist

  • Tests have been added in the prescribed format
  • Commit messages follow Conventional Commits format
  • Pre-commit hooks pass locally

@ryanseq-gyg ryanseq-gyg marked this pull request as ready for review November 22, 2025 12:27
@ryanseq-gyg ryanseq-gyg requested a review from a team as a code owner November 22, 2025 12:27
@ryanseq-gyg ryanseq-gyg requested a review from Copilot November 22, 2025 12:28

Copilot AI left a comment


Pull request overview

This PR adds a comprehensive tagging system and suite execution result tracking to the dataframe expectations framework. The key additions include:

  • A TagSet class for organizing and filtering expectations using "key:value" format tags
  • A SuiteExecutionResult class for capturing detailed validation outcomes including timing, status, and violation samples
  • Support for filtering expectations at runtime using tag-based criteria (with "any" or "all" logic)
  • Parallel test execution via pytest-xdist
  • Migration from custom logging_utils to standard Python logging

Key Changes

  • Added tag filtering to suite execution with configurable match modes
  • Replaced void suite returns with detailed SuiteExecutionResult objects
  • Added comprehensive test coverage for new tagging and result functionality
  • Updated all expectation constructors to accept optional tags parameter

Reviewed changes

Copilot reviewed 57 out of 58 changed files in this pull request and generated no comments.

Summary per file:

  • dataframe_expectations/core/tagging.py: New TagSet class for tag management and filtering
  • dataframe_expectations/core/suite_result.py: New result models for capturing execution outcomes
  • dataframe_expectations/suite.py: Enhanced suite runner with tag filtering and result tracking
  • dataframe_expectations/suite.pyi: Updated type stubs with new signatures
  • dataframe_expectations/core/expectation.py: Added tags support to base expectation class
  • dataframe_expectations/core/column_expectation.py: Added tags parameter to column expectation constructor
  • dataframe_expectations/core/aggregation_expectation.py: Added tags parameter to aggregation expectation constructor
  • dataframe_expectations/expectations/**/*.py: Updated all expectation factories to accept tags parameter
  • tests/**/*.py: Updated 30+ test files to validate new SuiteExecutionResult returns
  • tests/base/test_tagging.py: New comprehensive tests for TagSet functionality
  • tests/base/test_suite_with_tagging.py: New tests for suite tag filtering
  • tests/base/test_suite_result.py: New tests for result models
  • scripts/sanity_checks.py: Enhanced with tags parameter validation
  • scripts/generate_suite_stubs.py: Updated to include tags in generated stubs
  • pyproject.toml, uv.lock: Added pytest-xdist>=3.0.0 for parallel testing
  • .github/workflows/main.yaml: Enabled parallel test execution


@ryanseq-gyg ryanseq-gyg merged commit e36becd into main Nov 22, 2025
8 checks passed
@ryanseq-gyg ryanseq-gyg deleted the improve-result-handling branch November 22, 2025 13:52
