Prototype Dataset Validation Using Pandera #5391

## Description

This experiment will prototype a minimal dataset validation mechanism using Pandera to assess feasibility, ergonomics, and architectural fit.

The primary goal is to evaluate whether dataset-level validation can align with the direction established by parameter validation in a clean, Kedro-native way. Pandera will be used as a concrete backend for the experiment.

## Context

Kedro is introducing first-class parameter validation. As part of exploring a cohesive validation strategy, we want to evaluate whether a similar approach can be extended to dataset-level validation.

Currently, data validation is commonly implemented via hooks and third-party libraries, which can introduce hidden control flow and architectural misalignment.

## Scope

Implement a minimal prototype that:
- Integrates Pandera-based validation at the dataset level
- Triggers validation on `load()` (and optionally on `save()`)
- Avoids using hooks
- Is tested within a simple example project

Explore whether:

- A dataset wrapper approach is sufficient
- The validation logic can conceptually align with the parameter validation structure

## Deliverables

- A minimal proof of concept sufficient to evaluate the approach
- Short write-up covering:
    - Architectural fit with Kedro
    - Interaction with lazy datasets, based on this comment: https://github.com/kedro-org/kedro/pull/5142#issuecomment-3497677477
    - Developer experience
    - Limitations / trade-offs